STAR mapped more reads to transcriptome

7.4 years ago

tianleivv ▴ 50

Hi,

I am following the ENCODE pipeline to analyze my RNA-Seq data. However, STAR gives me more reads in Aligned.toTranscriptome.out.bam than in Aligned.sortedByCoordinate.bam, which would mean it aligned more reads to the transcriptome than to the genome. I don't understand how that can happen. Has anyone run into this problem? Thanks a lot!

Best, Lei

RNA-Seq • 5.0k views

ADD COMMENT • link 7.4 years ago by tianleivv ▴ 50

It's hard to say without knowing exactly how you ran STAR. Maybe there is a difference in the total number of alignments, which may filter out more reads in the genome alignment than in the transcriptome alignment?

ADD REPLY • link 7.4 years ago by Chris Fields ★ 2.2k

Hi Chris, thanks for your reply! I am actually following the ENCODE STAR+RSEM pipeline (https://github.com/ENCODE-DCC/long-rna-seq-pipeline/blob/master/DAC/STAR_RSEM.sh). I think that no matter how I ran STAR, it should apply the same filtering to reads mapped to the transcriptome and to reads mapped to the genome. So the number of reads mapped to the genome should never be lower than the number mapped to the transcriptome. Lei

ADD REPLY • link updated 7.4 years ago by h.mon 35k • written 7.4 years ago by tianleivv ▴ 50

It is still hard to say, because we don't know which numbers you are talking about. Are they the total counts for genes / transcripts provided by the two methods? Or the number of mapped reads reported in the STAR logs? Or did you run something like samtools flagstat?

Besides, you showed the script you ran, but not how you ran it. What flags did you pass to the script?
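To make the "which numbers" question concrete: a BAM file holds alignment records, not reads, and a read that overlaps several isoforms can appear once per transcript in the transcriptome BAM. Whether that explains the counts in this thread is not confirmed. Here is a minimal Python sketch, assuming the records arrive as SAM text lines; the function name `count_records_and_reads` and the toy records are made up for illustration.

```python
def count_records_and_reads(sam_lines):
    """Return (mapped alignment records, distinct read names) for SAM text lines."""
    records = 0
    names = set()
    for line in sam_lines:
        # Skip blank lines and SAM header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x4:  # 0x4 = segment unmapped
            continue
        records += 1
        names.add(fields[0])  # QNAME; mates of a pair share one name
    return records, len(names)


# Toy records (hypothetical): two reads, each placed once on the genome.
genome_sam = [
    "@HD\tVN:1.4",
    "read1\t0\tchr1\t100\t255\t50M\t*\t0\t0\t*\t*",
    "read2\t0\tchr1\t500\t255\t50M\t*\t0\t0\t*\t*",
]

# The same two reads in transcript coordinates; read1 overlaps two isoforms.
transcriptome_sam = [
    "read1\t0\ttxA\t10\t255\t50M\t*\t0\t0\t*\t*",
    "read1\t0\ttxB\t10\t255\t50M\t*\t0\t0\t*\t*",
    "read2\t0\ttxC\t40\t255\t50M\t*\t0\t0\t*\t*",
]

print("genome:", count_records_and_reads(genome_sam))
print("transcriptome:", count_records_and_reads(transcriptome_sam))
```

In this toy case the transcriptome file has more records than the genome file even though the same two reads are mapped in both. On real data the lines could come from `samtools view` output; comparing distinct read names rather than record counts shows whether the transcriptome really contains more reads.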

ADD REPLY • link 7.4 years ago by h.mon 35k
