7.4 years ago
tianleivv
Hi,
I am following the ENCODE pipeline to analyze my RNA-Seq data. However, STAR gives me more reads in Aligned.toTranscriptome.out.bam than in Aligned.sortedByCoordinate.bam, which suggests that it aligned more reads to the transcriptome than to the genome. I don't understand how that can happen. Has anyone run into this problem? Thanks a lot!
Best, Lei
It's hard to say without knowing specifically how you ran STAR. Maybe there is a difference in the total number of alignments (the filtering may remove more reads from the genome alignment than from the transcriptome alignment)?
Hi Chris, Thanks for your reply! I am actually following the ENCODE STAR+RSEM pipeline (https://github.com/ENCODE-DCC/long-rna-seq-pipeline/blob/master/DAC/STAR_RSEM.sh). I think that no matter how I ran STAR, it should filter reads the same way whether they map to the transcriptome or to the genome. Thus, the number of reads mapped to the genome should never be lower than the number mapped to the transcriptome. Lei
It is still hard to say, because we don't know which numbers you are talking about. Are they the total counts for genes/transcripts provided by the two methods? Or the number of mapped reads reported in the STAR logs? Or did you run something like samtools flagstat? Besides, you showed the script you ran, but not how you ran it. What flags did you pass to the script?
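One way to pin down which numbers are being compared is to count alignment records and distinct reads separately for each BAM file. The sketch below is only an illustration, not part of the ENCODE pipeline: the function name is hypothetical, and it assumes the input is SAM text (for example the output of samtools view). If a read is reported once per compatible transcript in the transcriptome BAM, that file can hold more records than the genome BAM even when the number of distinct reads is not higher.

```python
def count_alignments_and_reads(sam_lines):
    """Return (alignment_records, distinct_mapped_reads) for SAM text lines.

    Hypothetical helper for illustration; expects tab-separated SAM records.
    """
    records = 0
    reads = set()
    for line in sam_lines:
        # Skip blank lines and SAM header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        # FLAG bit 0x4 marks an unmapped read; ignore those records.
        if flag & 0x4:
            continue
        records += 1
        # Distinguish the two mates of a pair (bits 0x40 and 0x80) so that
        # each mate counts as its own read; single-end reads get 0.
        mate = 1 if flag & 0x40 else 2 if flag & 0x80 else 0
        reads.add((qname, mate))
    return records, len(reads)
```

To try it on real data, the SAM text can be piped in, for example by calling count_alignments_and_reads(sys.stdin) in a script fed by samtools view. Comparing the two returned numbers for each BAM file shows whether the difference lies in the records or in the reads themselves.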