In my lab, we performed an RNA-seq experiment for which I don't know the coverage. I have the library size from the FastQC report (total sequences, e.g. 9399148). Is it possible to work out the coverage from this number?
Thank you in advance, and sorry for the basic question, but I'm a newbie in RNA-seq analysis.
The number of reads alone does not let you estimate coverage, apart from a rough calculation: (number of reads * sequencing cycles) / known genome size. For a real estimate you will have to do alignments and/or an assembly (followed by alignments). @Amitm has already covered the basics.
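As a minimal sketch of that rough calculation: only the read count below comes from the question. The read length (sequencing cycles) and the target size are placeholder assumptions, so substitute the values for your own run and organism. It also treats the run as single-end.

```python
def rough_coverage(num_reads: int, read_length: int, target_size: int) -> float:
    """Back-of-the-envelope depth: total sequenced bases / target size."""
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    return num_reads * read_length / target_size


num_reads = 9_399_148          # "Total Sequences" from the FastQC report in the question
read_length = 100              # ASSUMPTION: sequencing cycles, not stated in the question
target_size = 3_100_000_000    # ASSUMPTION: roughly a human-sized genome, in bases

estimate = rough_coverage(num_reads, read_length, target_size)
print(f"Rough average coverage: {estimate:.2f}x")
```

For RNA-seq the genome size is a poor denominator, since only the transcribed fraction is sequenced, so treat the result as an order-of-magnitude figure at most.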
hi,
If you use RNA-SeQC, you can get coverage info along with other useful metrics. You need to have the mapped BAM file, of course.
You can use other programs as well, depending on what you mean by coverage: the average fraction of exons covered by reads, or the average number of reads mapping to exons.
HTSeq-count can be used to get read counts per gene, which can then be used to calculate an average.
Bedtools lacks the custom read-assignment feature of HTSeq-count, but you can use it to get the average fraction of exons covered by reads.
There are other programs as well, such as featureCounts.
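To make the two meanings of coverage concrete, here is a toy sketch in plain Python. The exon and read coordinates are made up (half-open intervals on one chromosome), and this is not how RNA-SeQC, HTSeq-count or Bedtools work internally; it only illustrates what each number measures.

```python
def covered_fraction(exon, reads):
    """Fraction of an exon's bases overlapped by at least one read."""
    start, end = exon
    covered = [False] * (end - start)
    for r_start, r_end in reads:
        # Mark only the part of the read that falls inside the exon.
        for pos in range(max(start, r_start), min(end, r_end)):
            covered[pos - start] = True
    return sum(covered) / (end - start)


def overlapping_reads(exon, reads):
    """Number of reads that overlap the exon by at least one base."""
    start, end = exon
    return sum(1 for r_start, r_end in reads if r_start < end and r_end > start)


# Hypothetical toy data: (start, end) intervals, end exclusive.
exons = [(100, 200), (300, 400)]
reads = [(90, 140), (120, 170), (350, 400)]

fractions = [covered_fraction(e, reads) for e in exons]
counts = [overlapping_reads(e, reads) for e in exons]

print("Average fraction of exon covered:", sum(fractions) / len(fractions))
print("Average reads per exon:", sum(counts) / len(counts))
```

On real data you would get these numbers from the aligned BAM file with one of the tools mentioned above rather than from hand-written loops.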
Coverage is not really the most important parameter in RNA-seq, since your reads will always be non-uniformly distributed across highly and lowly expressed genes. At best you could calculate an average coverage, which is meaningless. What you do care about is the library size, and you already know that from the FastQC report.