Why is the # of reads from accepted_hits.bam + unmapped.bam > the # of reads in the FASTQ file?
11.3 years ago
newDNASeqer ▴ 790

A quick question:

After running TopHat on a FASTQ file, I found that the number of reads in accepted_hits.bam plus unmapped.bam is greater than the number of reads in the FASTQ file. Why is this? I thought accepted_hits.bam plus unmapped.bam should add up to the total number of reads that TopHat started with.

I used samtools view -c to count the total reads in both accepted_hits.bam and unmapped.bam, and grep "^@" to count the number of reads in the FASTQ file.
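As a side note on the FASTQ count: a quality line can itself begin with "@", so counting lines that match "^@" may overcount. A minimal sketch of a safer count, assuming a standard 4-line-per-record FASTQ with no wrapped sequences and no blank lines (the function names and toy data below are made up for illustration):

```python
from typing import Iterable


def count_fastq_reads(lines: Iterable[str]) -> int:
    """Count records as (number of lines) / 4.

    Assumption: every record is exactly 4 lines (header, sequence, '+',
    quality) and the input contains no blank lines.
    """
    n_lines = sum(1 for _ in lines)
    if n_lines % 4 != 0:
        raise ValueError(f"{n_lines} lines is not a multiple of 4")
    return n_lines // 4


def count_at_lines(lines: Iterable[str]) -> int:
    """Count lines starting with '@', which is what grep "^@" matches."""
    return sum(1 for line in lines if line.startswith("@"))


# Toy data: the quality string of read2 happens to start with '@'.
fastq_text = (
    "@read1\nACGT\n+\nIIII\n"
    "@read2\nACGT\n+\n@III\n"
)

n_reads = count_fastq_reads(fastq_text.splitlines())
n_at = count_at_lines(fastq_text.splitlines())
print(n_reads, n_at)
```

For a real file, the same function can be fed an open file handle, or `gzip.open(path, "rt")` for a compressed one, where `path` is whatever FASTQ was given to TopHat.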

tophat fastq reads • 3.2k views
11.3 years ago
S_Z ▴ 30

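Check the following TopHat setting: -g/--max-multihits. It controls how many alignments TopHat may report for a single read, so a read that maps to several locations can appear more than once in accepted_hits.bam.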


Basically, the number of entries in the BAM file is not the number of READS but the number of ALIGNMENTS. If multiple alignments are allowed per read, you will have more alignments than reads.
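To see the difference between alignments and reads in numbers, here is a rough sketch that counts both from SAM text, such as the output of samtools view. It assumes tab-separated SAM records with the read name in column 1 and the FLAG in column 2. The function name and the toy records are invented for illustration and are not real TopHat output.

```python
from typing import Iterable, Tuple

FIRST_MATE = 0x40   # SAM FLAG bit: first read of a pair
SECOND_MATE = 0x80  # SAM FLAG bit: second read of a pair


def count_alignments_and_reads(sam_lines: Iterable[str]) -> Tuple[int, int]:
    """Return (number of alignment records, number of distinct reads).

    A read is identified by its name plus its mate bits, so the two mates
    of a pair count as two reads. Counting distinct names does not rely on
    the aligner marking extra hits as secondary.
    """
    n_records = 0
    reads = set()
    for line in sam_lines:
        # Header lines start with '@'; read names in SAM cannot.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        n_records += 1
        reads.add((qname, flag & (FIRST_MATE | SECOND_MATE)))
    return n_records, len(reads)


# Toy single-end example: read1 aligns to two places, read2 to one.
sam_text = "\n".join([
    "@HD\tVN:1.0\tSO:coordinate",
    "read1\t0\tchr1\t100\t3\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:2",
    "read1\t256\tchr2\t500\t3\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:2",
    "read2\t16\tchr1\t900\t50\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:1",
])

n_alignments, n_distinct = count_alignments_and_reads(sam_text.splitlines())
print(n_alignments, n_distinct)
```

On real data the lines could come from `sys.stdin` with samtools view accepted_hits.bam piped in. The set keeps every read name in memory, which may be heavy for very large files.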
