Hi all,
I am hoping to get some insight into what is happening here or any suggestions.
I am aligning single-end total RNA-seq data to S. pombe with STAR and TopHat, but the two report very different mapping statistics:
STAR:
Number of input reads | 20416529
Average input read length | 47
UNIQUE READS:
Uniquely mapped reads number | 2622002
Uniquely mapped reads % | 12.84%
Average mapped length | 48.83
TopHat:
Reads:
Input : 20416529
Mapped : 19397908 (95.0% of input)
95.0% overall read mapping rate.
12.84% vs. 95% is a pretty big difference.
Any ideas?
Can you please post the command lines you used?
Because this is total RNA-seq, most of your data is likely rRNA reads, which would likely be multi-mapping and hence not counted as uniquely mapped by STAR.
Please don't use TopHat for any current projects.
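To compare the STAR number with TopHat's overall rate, one option is to add the multi-mapping reads reported in STAR's Log.final.out to the uniquely mapped ones. Below is a minimal Python sketch, assuming the usual "key | value" layout of that file and its "Number of reads mapped to multiple loci" field. The function names are hypothetical, and the sample text contains only the lines posted above, so the multi-mapping part is simply skipped for it.

```python
def parse_star_log(text):
    """Collect 'key | value' lines from STAR's Log.final.out into a dict."""
    stats = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # section headers such as "UNIQUE READS:" carry no value
        key, value = line.split("|", 1)
        stats[key.strip()] = value.strip()
    return stats


def mapping_summary(stats):
    """Return unique, multi-mapping and combined alignment rates in percent."""
    total = int(stats["Number of input reads"])
    unique = int(stats["Uniquely mapped reads number"])
    summary = {"unique_pct": 100.0 * unique / total}
    # Assumed field name; only used when the line is present in the log.
    multi = stats.get("Number of reads mapped to multiple loci")
    if multi is not None:
        summary["multi_pct"] = 100.0 * int(multi) / total
        summary["aligned_pct"] = 100.0 * (unique + int(multi)) / total
    return summary


# Only the lines from the original post; the multi-mapping lines were not posted.
posted = """\
Number of input reads | 20416529
Average input read length | 47
UNIQUE READS:
Uniquely mapped reads number | 2622002
Uniquely mapped reads % | 12.84%
"""

summary = mapping_summary(parse_star_log(posted))
print(f"unique: {summary['unique_pct']:.2f}%")
```

Run on the complete Log.final.out, the combined rate (unique plus multi-mapped) is the figure to set against TopHat's 95.0%. If most reads are rRNA, the multi-mapping percentage should account for much of the gap.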
Hmm, this could be rRNA contamination. Looking back at the library prep, there doesn't seem to be an rRNA depletion step.
If the prep was for total RNA-seq then that is expected. If the prep was supposed to be for mRNA-seq with ribo-depletion then ..