After alignment small RNA sequence to the reference genome using bowtie. I got SAM file, then how to calculate read counts? What's the criterion? Is there software to do this work?
After alignment small RNA sequence to the reference genome using bowtie. I got SAM file, then how to calculate read counts? What's the criterion? Is there software to do this work?
To directly answer your question, HTseq count, featureCounts from the subread package, and GenomicRanges summarizeOverlaps()
. For example usage of the latter, see:
In your case, if you have a set of regions defining the miRNA locations on the genome, you can create a GRanges or GRangesList (see the rtracklayer package for how to import a BED file to a GRanges object) and then use summarizeOverlaps()
as described. Once you have a table of counts, the typical RNA-seq tools like DESeq2, edgeR, and limma voom can be applied.
You can use HTSeq for this,
And you can also do this on Fastq files with miRanalyzer2, groupReads from this link: http://bioinfo5.ugr.es/miRanalyzer/standalone.html#toc-Section-1
When converting read counts of small RNA into reads per million (RPM), how to define "Total_number_mapped_reads".
For example, in one of the mapped small RNA library
2601833 reads; of these: 2601833 (100.00%) were unpaired; of these:
352919 (13.56%) aligned 0 times 362282 (13.92%) aligned exactly 1 time 1886632 (72.51%) aligned >1 times
86.44% overall alignment rate
Total_number_mapped_reads= 362282 + 1886632 = 2248914
But the fact that some of the "1886632" reads are mapped multiple times, the actual Total_number_mapped_reads(location) will be higher.
To compute reads per million, which "Total_number_mapped_reads" should be used. The one which is got after multimapping or the actual number of mapped reads.
thanks !!
Thanks Sean !! I did it here Best/right way to quantify small RNA transcripts
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Are you interested in small RNAs in general or more specifically miRNAs? There are a lot of tools specifically meant to handle miRNAs, so if you're happy restricting yourself slightly to just those then just go with one of those tools (e.g., miRDeep2).
If you use Bowtie2, assure that you set the '-a' parameter. Otherwise you will only get one random hit reported, even if there are multiple mapping loci. small RNAs are known to occur in multiple copies (miRNAs, tRNAs, etc.).
Some more information about the problem: Attention: Bowtie2 And Multiple Hits