I would like to calculate the read depth of the raw sequences I received from a sequencing company. Somebody suggested the following to me:
"Get number of reads, multiply it by read length, divide by size of genome. If you have RNASeq, you need to know the estimate of the transcribed proportion of the genome."
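The quoted rule of thumb can be written down in a few lines. This is only a sketch: the function name and the example numbers are made up for illustration, and for RNA-seq the `target_size` would have to be your own estimate of the transcribed portion of the genome rather than the full genome size.

```python
def estimated_depth(num_reads: int, read_length: int, target_size: int) -> float:
    """Average depth = total sequenced bases / size of the target region."""
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    return num_reads * read_length / target_size


# Hypothetical numbers, for illustration only:
# 30 million reads of 100 bp over an assumed 100 Mb transcribed region.
depth = estimated_depth(num_reads=30_000_000, read_length=100, target_size=100_000_000)
print(f"{depth:.1f}x")  # prints "30.0x"
```

For paired-end data, count each mate as a read (or double the fragment count) before plugging it in.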
The concept of "read depth" isn't really applicable to RNA-seq, since a single depth figure doesn't tell you much. The term is useful in variant calling, where you expect positions to be more or less equally covered, so a single metric like read depth conveys useful information. For RNA-seq, you expect vast differences in coverage between genes, isoforms, and so on, so no single number conveys anything useful in that regard. If you must give a number for some reason, just give the number of mapped reads above some MAPQ threshold. Alternatively, do as Irsan suggests and use featureCounts or htseq-count, then attach the resulting count table as a supplemental file (after all, this contains the actually useful information).
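As a rough illustration of "number of mapped reads above some MAPQ threshold", here is a sketch that counts such reads from SAM text (for example, the output of `samtools view` on your BAM). The function name and the default threshold of 10 are assumptions for the example; which MAPQ values indicate a unique alignment depends on the aligner you used.

```python
def count_confident_reads(sam_lines, min_mapq=10):
    """Count primary, mapped alignments with MAPQ >= min_mapq in SAM text."""
    count = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # column 2: bitwise FLAG
        mapq = int(fields[4])  # column 5: mapping quality
        if flag & 0x4:
            continue  # read is unmapped
        if flag & 0x900:
            continue  # secondary (0x100) or supplementary (0x800) alignment
        if mapq >= min_mapq:
            count += 1
    return count


# Tiny hand-made SAM records, for illustration only.
example = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",    # mapped, high MAPQ
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",           # unmapped
    "r3\t256\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII",  # secondary alignment
    "r4\t16\tchr1\t300\t3\t4M\t*\t0\t0\tACGT\tIIII",    # mapped, low MAPQ
]
print(count_confident_reads(example))  # prints 1
```

With paired-end data each mate is a separate record, so this counts reads, not fragments. Dividing the result by the total number of sequenced reads gives the mapped percentage.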
The cumulative non-redundant size of the human transcriptome varies from cell to cell and depends on how much evidence you consider sufficient to call something transcribed. Why don't you assemble the transcriptome of your own samples and calculate the size yourself? Or, even better, if you are just interested in read depth, use dedicated tools that calculate read depth directly from RNA-seq BAM files.
I would like to know whether the company performed the sequencing correctly and whether the read depth matches what we requested. Can you suggest how I can calculate the read depth from raw RNA-seq data? Are there any tools for this?
Thank you
updated 2.9 years ago by Ram • written 10.2 years ago by hana
Did they actually guarantee a specific coverage number rather than some minimum number of reads with some given quality? The latter makes sense, the former doesn't.
As Devon Ryan says, it makes more sense to talk about the number of reads that is provided per sample and the % that can be mapped uniquely on the genome/transcriptome.
Do you mean the transcriptome of a given cell type/tissue, or the transcriptome in general? The latter can be obtained from a GTF file. The former would combine that with some RNA-seq (or microarray) data.
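For the "in general" case, one possible way to get a non-redundant transcriptome size from a GTF file is to merge overlapping exon intervals per chromosome and sum their lengths. The sketch below is only that: the function name is made up, it ignores strand (so exons overlapping on opposite strands are counted once), and it assumes standard tab-separated GTF with 1-based inclusive coordinates.

```python
from collections import defaultdict


def nonredundant_exon_length(gtf_lines):
    """Sum the length of the union of all 'exon' features in GTF text."""
    exons = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # keep only well-formed exon records
        # GTF coordinates are 1-based and inclusive on both ends.
        exons[fields[0]].append((int(fields[3]), int(fields[4])))

    total = 0
    for intervals in exons.values():
        intervals.sort()
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end + 1:  # overlapping or directly adjacent
                cur_end = max(cur_end, end)
            else:
                total += cur_end - cur_start + 1
                cur_start, cur_end = start, end
        total += cur_end - cur_start + 1
    return total


# Hand-made records, for illustration only.
example = [
    "chr1\tsrc\tgene\t1\t300\t.\t+\t.\tgene_id \"g1\";",
    "chr1\tsrc\texon\t1\t100\t.\t+\t.\tgene_id \"g1\";",
    "chr1\tsrc\texon\t50\t150\t.\t+\t.\tgene_id \"g1\";",
    "chr1\tsrc\texon\t201\t300\t.\t+\t.\tgene_id \"g1\";",
    "chr2\tsrc\texon\t10\t19\t.\t-\t.\tgene_id \"g2\";",
]
print(nonredundant_exon_length(example))  # prints 260
```

On a real annotation you would pass an open file handle instead of a list. The resulting number could serve as the target size in a depth estimate, with the caveat raised above that not every annotated exon is expressed in a given sample.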
What do you need to know the read depth for? Do you want to call variants?