tophat why are my bam files different if bowtie doesn't care about strandedness?
8.4 years ago

If I run tophat on this some.fastq file

@Read_num:9:trans:"ENST00000618181":start:935873:exons:"ENSE00002686739":"ENSE00002703998"
TGGAGATTGGCCTGCGACCCGCCGGTGACCTGTTGGGCAAGAGGCTGGGC
+
GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG

tophat -p 12 -o fr-firststrand -a 8 --library-type fr-firststrand --coverage-search Bowtie2Index/Homo_sapiens.GRC38 some.fastq

and

tophat -p 12 -o fr-secondstrand -a 8 --library-type fr-secondstrand --coverage-search Bowtie2Index/Homo_sapiens.GRC38 some.fastq

Read_num:9 gets put in the unmapped.fq file when library-type = fr-firststrand, but it is mapped and written to accepted_hits.bam when library-type = fr-secondstrand. Since tophat uses bowtie to do the mapping and bowtie doesn't care about strand, I don't understand why the bam files generated by (presumably) bowtie differ between library types. Where is my understanding breaking down here?

RNA-Seq tophat • 1.9k views
8.4 years ago

Oh, but bowtie can care about strandedness. In particular, it has things like --norc, which are equivalent to strand-specific alignments.
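To make the --norc/--nofw point concrete, here is a minimal Python sketch of which read orientations an aligner would try under each flag. It is a toy exact-match search, not Bowtie's algorithm; the function names and the tiny reference sequence are made up for illustration.

```python
# Toy illustration only: exact substring matching stands in for real alignment.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def toy_align(read, reference, nofw=False, norc=False):
    """Return a list of (strand, 0-based position) hits.

    nofw / norc mimic the meaning of Bowtie's --nofw / --norc:
    skip the forward strand, or skip the reverse-complement strand.
    """
    hits = []
    if not nofw:
        pos = reference.find(read)
        if pos != -1:
            hits.append(("+", pos))
    if not norc:
        pos = reference.find(reverse_complement(read))
        if pos != -1:
            hits.append(("-", pos))
    return hits


reference = "AAACCCGGGTTTACGTAGCTAGG"      # hypothetical reference
read = reverse_complement("ACGTAGCTAGG")   # a read sequenced from the reverse strand

print(toy_align(read, reference))             # both orientations tried
print(toy_align(read, reference, norc=True))  # reverse-complement strand skipped
```

With norc=True the same read finds no hit, which is the sense in which these flags give strand-specific alignment for unpaired reads.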


This answered my question as posed. However, it appears (from looking at the tophat log files) that neither --norc nor --nofw is used by tophat.


It's an interesting discussion, but you do realize that the developers of TopHat2 have themselves declared their software to be obsolete?


Yes, I realize that hisat2 is the newer program. However, we are more familiar with tophat2 and built our pipeline using it. Hisat2 probably has its fair share of gotchas.


Presumably it does the equivalent in post-processing.
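One way such post-processing could work for single-end reads is to derive the transcript strand implied by the library type from the SAM flag of each alignment. The sketch below assumes the usual convention that an fr-firststrand read is antisense to the transcript and an fr-secondstrand read is sense. The function name is hypothetical, and this is not TopHat's actual code.

```python
REVERSE_FLAG = 0x10  # SAM FLAG bit: read aligned as its reverse complement


def implied_transcript_strand(flag, library_type):
    """Return '+' or '-' for the transcript strand implied by a single-end
    alignment, or None when the library type carries no strand information."""
    read_strand = "-" if flag & REVERSE_FLAG else "+"
    if library_type == "fr-secondstrand":
        # Read has the same orientation as the transcript.
        return read_strand
    if library_type == "fr-firststrand":
        # Read is the reverse complement of the transcript.
        return "-" if read_strand == "+" else "+"
    if library_type == "fr-unstranded":
        return None
    raise ValueError(f"unknown library type: {library_type}")


# The same forward-strand alignment implies opposite transcript strands.
print(implied_transcript_strand(0, "fr-firststrand"))
print(implied_transcript_strand(0, "fr-secondstrand"))
```

If a later step rejects alignments whose implied strand conflicts with other evidence (a splice-site motif, for instance), the same read could be kept under one library type and dropped under the other. That is a guess at the mechanism, not something confirmed from the TopHat source.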

8.4 years ago
ablanchetcohen ★ 1.2k

Bowtie2 has very similar options to Tophat2 regarding the strands.

--fr/--rf/--ff

The upstream/downstream mate orientations for a valid paired-end alignment against the forward reference strand. E.g., if --fr is specified and there is a candidate paired-end alignment where mate 1 appears upstream of the reverse complement of mate 2 and the fragment length constraints (-I and -X) are met, that alignment is valid. Also, if mate 2 appears upstream of the reverse complement of mate 1 and all other constraints are met, that too is valid. --rf likewise requires that an upstream mate1 be reverse-complemented and a downstream mate2 be forward-oriented. --ff requires both an upstream mate 1 and a downstream mate 2 to be forward-oriented. Default: --fr (appropriate for Illumina's Paired-end Sequencing Assay).
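The rule quoted above can be sketched as a small Python check. This is only my reading of the manual text, with the fragment length constraints (-I and -X) left out; the function and parameter names are invented for illustration.

```python
def mates_orientation_valid(mode, pos1, rev1, pos2, rev2):
    """Check a candidate paired-end alignment against --fr / --rf / --ff.

    pos1, pos2: leftmost reference positions of mate 1 and mate 2.
    rev1, rev2: True if that mate aligned as its reverse complement.
    Fragment length limits are ignored in this sketch.
    """
    # Ties are treated as "mate 1 upstream" (a simplification).
    mate1_upstream = pos1 <= pos2
    up_rev, down_rev = (rev1, rev2) if mate1_upstream else (rev2, rev1)

    if mode == "fr":
        # Upstream mate forward, downstream mate reverse-complemented,
        # whichever mate happens to be upstream.
        return (not up_rev) and down_rev
    if mode == "rf":
        # Upstream mate reverse-complemented, downstream mate forward.
        return up_rev and (not down_rev)
    if mode == "ff":
        if mate1_upstream:
            # Fragment from the forward strand: both mates forward.
            return (not rev1) and (not rev2)
        # Fragment from the reverse strand: mate 2 upstream, both reversed.
        return rev1 and rev2
    raise ValueError(f"unknown mode: {mode}")


print(mates_orientation_valid("fr", 100, False, 300, True))
print(mates_orientation_valid("fr", 100, True, 300, False))
```

Note that the check accepts fragments from either reference strand in every mode: it constrains how the mates face each other, not which strand the fragment came from.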

Bowtie1 also has the same parameters.

--fr/--rf/--ff

The upstream/downstream mate orientations for a valid paired-end alignment against the forward reference strand. E.g., if --fr is specified and there is a candidate paired-end alignment where mate1 appears upstream of the reverse complement of mate2 and the insert length constraints are met, that alignment is valid. Also, if mate2 appears upstream of the reverse complement of mate1 and all other constraints are met, that too is valid. --rf likewise requires that an upstream mate1 be reverse-complemented and a downstream mate2 be forward-oriented. --ff requires both an upstream mate1 and a downstream mate2 to be forward-oriented. Default: --fr when -C (colorspace alignment) is not specified, --ff when -C is specified.

So, both versions of Bowtie can take into account the strand. Hence, TopHat also can.


That's a rather different thing. You're talking about relative orientation of mates, which needs to be changed if one has mate pairs. OP is asking about strand-specific alignments.


EDIT. I've deleted my comment in answer to OP's edit. The relative mate orientation is important to determine strand orientation for paired end data, but OP has now clarified his question to be about single end data. So, yes, the answer did not answer the OP's question about strandedness for single end data. The relative mate orientation still appears important to me in determining the strand of paired end data, so I'm hopefully still right about that point.
