Is There An Explanation For This TopHat "YT" Tag Discrepancy In My SAM Output?
11.2 years ago
Dan D 7.4k

I was given some TCGA BAM files and asked to perform a realignment with some specific requirements. While perusing the results of an alignment in IGV, I noticed something strange. As far as I can tell, everything in the read data pop-up dialogs tells me that I'm looking at paired-end reads that mapped as pairs, except for the YT tag, which is always UU (Bowtie 2's code for a read that was not part of a pair).

The two read names in a mapped pair are identical, and the mates were pulled from separate FASTQ files. I'm seeing this with every read I check, and I've spot-checked reads from random places on five different chromosomes.
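
For anyone who wants to check more than a handful of reads, here is a minimal Python sketch that tallies the YT value against the paired (0x1) and properly-paired (0x2) FLAG bits. It assumes plain-text SAM records (for example, the output of samtools view piped in line by line); the function name tally_yt_by_pairing is my own and not part of any tool.

```python
from collections import Counter

PAIRED = 0x1        # SAM FLAG bit: read is paired in sequencing
PROPER_PAIR = 0x2   # SAM FLAG bit: read is mapped in a proper pair


def tally_yt_by_pairing(sam_lines):
    """Count records by (is_paired, is_proper_pair, YT value).

    sam_lines is any iterable of SAM text lines. Header lines are skipped.
    Records without a YT tag are counted under the value "missing".
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        yt = "missing"
        # Optional fields start at column 12 and look like TAG:TYPE:VALUE.
        for tag in fields[11:]:
            if tag.startswith("YT:Z:"):
                yt = tag[5:]
                break
        counts[(bool(flag & PAIRED), bool(flag & PROPER_PAIR), yt)] += 1
    return counts
```

If nearly every record lands in the (True, True, "UU") bucket, the tag disagrees with the FLAG field across the whole file and not only at the loci inspected in IGV.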

Here's the TopHat v2.0.9 command that I ran:

/usr/local/bin/tophat --output-dir /data/deedee/rnaseq/efb596b4 --max-multihits 2 -p 4 --b2-very-sensitive --library-type fr-unstranded /data/iGenomes/Homo_sapiens/UCSC/hg19/Sequence/Bowtie2Index/genome efb596b4_R1.fastq efb596b4_R2.fastq

Does anyone have any ideas about what's going on here? More background follows in case it's useful:

In the initial BAM files, the read names were a mess. They had /1 and /2 appended to the read names, sometimes twice. I wrote a script to strip these /1 and /2 suffixes, then used bedtools bamtofastq to convert the cleaned, query-name-sorted BAM files to a pair of FASTQ files. From there I ran the tophat command above.
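
The clean-up script itself is not shown. A minimal sketch of that kind of suffix stripping, assuming the stray suffixes are always a literal /1 or /2 repeated at the very end of the name, might look like this in Python (strip_mate_suffix is a hypothetical name used only for illustration):

```python
import re

# One or more "/1" or "/2" suffixes at the end of the name,
# e.g. "read/1" or "read/2/2". Anything else is left alone.
_MATE_SUFFIX = re.compile(r"(?:/[12])+$")


def strip_mate_suffix(read_name: str) -> str:
    """Return read_name without any trailing /1 or /2 mate suffixes."""
    return _MATE_SUFFIX.sub("", read_name)
```

Anchoring the pattern at the end of the string means a slash elsewhere in the name is not touched, and the repeated group handles the doubled suffixes in a single pass.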

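I wonder if this is an artifact of how the reads are aligned. Since TopHat aligns the mates of a pair separately, at least in part, Bowtie 2 would presumably see each read as unpaired and tag it UU, and TopHat may simply not reset this auxiliary tag when it pairs the alignments afterwards.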


Very interesting suggestion. I'm going to pursue this further and see what I can find out. Thanks!


Please report back if that turns out to be the case (or not). I'd like to know as well!


I checked some output generated by a colleague and I'm seeing the same thing in those data as well. I bet your suggestion is correct. I went ahead and posted on the Tuxedo Tools message board to see if they can confirm.
