Question

What is the best way to handle unmapped reads from RNA-Seq data

3

9.6 years ago

GouthamAtla 12k

I used TopHat2 to map RNA-Seq reads to a draft genome. The alignment rate is around 75-80% for all samples. When I BLAST the unmapped reads, they hit the same organism, which suggests they may carry useful information. How do I deal with the unmapped reads and include them in DE analysis or any other downstream analysis? Should I switch to an entirely different pipeline, such as Trinity?

RNA-Seq tophat2 trinity • 5.7k views

ADD COMMENT • link updated 2.4 years ago by Ram 44k • written 9.6 years ago by GouthamAtla 12k

0

You may want to look at this paper.

ADD REPLY • link updated 2.4 years ago by Ram 44k • written 9.6 years ago by Andrzej Zielezinski 11k

0

Maybe allow a few more mismatches with TopHat?

ADD REPLY • link 9.6 years ago by Martombo ★ 3.1k

0

Thanks, but I think it's more about an incomplete genome than an alignment problem.

ADD REPLY • link 9.6 years ago by GouthamAtla 12k

1

9.6 years ago

Brian Bushnell 20k

I suggest using a more sensitive aligner (BBMap), so you have fewer unmapped reads and thus less bias.

ADD COMMENT • link 9.6 years ago by Brian Bushnell 20k

0

Okay. I will try that.

ADD REPLY • link 9.6 years ago by GouthamAtla 12k

0

Note that BBMap has a parameter "maxindel" which defaults to "maxindel=16000". This is fine for plants, fungi, and microbes, but if you are sequencing vertebrates (or anything else with introns longer than ~16kbp) you should increase it to about the 98th percentile of intron length in that organism (in mammals this means around 100kbp to 200kbp). All other parameters can be left as default.
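
A minimal sketch of how that 98th-percentile figure could be estimated, assuming you already have exon coordinates per transcript (for example, parsed from a GTF annotation of your organism or a close relative). The function names, the input layout, and the nearest-rank percentile definition are illustrative choices, not part of BBMap:

```python
import math


def intron_lengths(transcripts):
    """Yield intron lengths from per-transcript exon lists.

    `transcripts` maps a transcript ID to a list of (start, end) exon
    coordinates (1-based, inclusive), in the style of a GTF file.
    This input layout is an assumption made for the sketch.
    """
    for exons in transcripts.values():
        ordered = sorted(exons)
        # An intron is the gap between one exon's end and the next exon's start.
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
            length = next_start - prev_end - 1
            if length > 0:
                yield length


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values to summarise")
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def suggest_maxindel(transcripts, pct=98):
    """Return the pct-th percentile of intron length as a candidate maxindel value."""
    return percentile(list(intron_lengths(transcripts)), pct)


# Toy annotation with three introns (100, 1000 and 20000 bp).
toy = {
    "t1": [(1, 100), (201, 300), (1301, 1400)],
    "t2": [(5000, 5100), (25101, 25200)],
}
print(suggest_maxindel(toy))
```

With a real annotation, the result would be passed to BBMap as `maxindel=<value>`, rounded up if you want some headroom.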

ADD REPLY • link updated 2.4 years ago by Ram 44k • written 9.6 years ago by Brian Bushnell 20k

0

Hi, is BBMap output compatible with Cufflinks/StringTie?

ADD REPLY • link 9.6 years ago by GouthamAtla 12k

0

The output is SAM, so you can convert it to BAM and sort it. By the way, did you trim your reads for quality?

ADD REPLY • link 9.6 years ago by apelin20 ▴ 480

0

For Cufflinks, you should add the flag xs=firststrand (or whichever setting matches your library's strandedness), because Cufflinks needs the XS tag, and intronlen=10 so that introns are printed in CIGAR strings as 'N' instead of 'D'. If samtools is installed, BBMap can write BAM files directly rather than SAM files if you name the output file something.bam.
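
Putting the flags from this thread together, a rough sketch of assembling the command line in Python. The file names are placeholders, the helper function is hypothetical, and the strandedness flag is spelled `xstag=fs` here on the assumption that this matches the `bbmap.sh` usage text (the reply above writes it as `xs=firststrand`); check the usage text of your BBMap version before relying on it:

```python
import shlex


def bbmap_command(ref, reads, out, maxindel=16000, xstag=None, intronlen=None):
    """Build a bbmap.sh argument list (key=value style) without running it.

    This helper is hypothetical; it only strings together options
    discussed in the thread.
    """
    args = [
        "bbmap.sh",
        f"ref={ref}",
        f"in={reads}",
        f"out={out}",  # a .bam extension asks BBMap for BAM output (needs samtools)
        f"maxindel={maxindel}",
    ]
    if xstag is not None:
        args.append(f"xstag={xstag}")  # XS tag for Cufflinks; spelling is an assumption
    if intronlen is not None:
        args.append(f"intronlen={intronlen}")  # long deletions reported as 'N' in CIGAR
    return args


# Placeholder file names; maxindel raised for a genome with long introns.
cmd = bbmap_command(
    "draft_genome.fa",
    "reads.fq.gz",
    "mapped.bam",
    maxindel=200000,
    xstag="fs",
    intronlen=10,
)
print(shlex.join(cmd))
```

The list can be handed to `subprocess.run(cmd, check=True)` once the paths and flags have been verified.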

ADD REPLY • link updated 2.4 years ago by Ram 44k • written 9.6 years ago by Brian Bushnell 20k

score 2 · Accepted Answer · 2015-04-25

2

9.6 years ago

GouthamAtla 12k

I tried STAR, and the mapping rate increased to 90-92% (with TopHat2, it was only 75-85%). I will try BBMap soon.

ADD COMMENT • link 9.6 years ago by GouthamAtla 12k