How To Check That Tophat Works Well
10.6 years ago
M K ▴ 660

I used TopHat to map my single-end RNA-seq reads to the human genome (hg19) and got several files in the TopHat output folder. I need to know how many reads mapped to the reference genome, how many did not map, and the overall alignment percentage.

I ran the following samtools command:

        samtools flagstat accepted_hits.bam

and it gave me the following results:

  175638490 + 0 in total (QC-passed reads + QC-failed reads)
   0 + 0 duplicates
   175638490 + 0 mapped (100.00%:-nan%)
   0 + 0 paired in sequencing
   0 + 0 read1
   0 + 0 read2
   0 + 0 properly paired (-nan%:-nan%)
   0 + 0 with itself and mate mapped
   0 + 0 singletons (-nan%:-nan%)
   0 + 0 with mate mapped to a different chr
   0 + 0 with mate mapped to a different chr (mapQ>=5)

Is there any explanation for the zeros above?

10.6 years ago
Martombo ★ 3.1k

Have a look at the file align_summary.txt. You cannot get the mapping statistics from accepted_hits.bam because it contains only the mapped reads (the unmapped ones are in unmapped.bam). From what you wrote, you can only deduce that 175638490 of your reads are mapped. All the zeros you see refer to paired-end statistics; since your data is not paired, they are all zero.
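If align_summary.txt is not available, the overall rate can be estimated from the two BAM files directly. The following is only a rough sketch: the function names `mapping_rate` and `tophat_mapping_rate` are made up for illustration, and the `-F 256` filter assumes that TopHat flags the extra hits of multi-mapped reads as secondary alignments, so that each read is counted once. Without that filter, the count from accepted_hits.bam is a number of alignment records and may exceed the number of mapped reads.

```shell
# Percentage of mapped reads, given a mapped count and an unmapped count.
mapping_rate() {
    awk -v m="$1" -v u="$2" 'BEGIN {
        if (m + u == 0) { print "nan"; exit }
        printf "%.2f\n", 100 * m / (m + u)
    }'
}

# Hypothetical wrapper: count records in the two TopHat BAM files
# found in the output folder passed as the first argument.
tophat_mapping_rate() {
    dir="$1"
    # -c counts records; -F 256 skips secondary alignments
    # (assumption: multi-mapped reads carry this flag on their extra hits).
    mapped=$(samtools view -c -F 256 "$dir/accepted_hits.bam")
    unmapped=$(samtools view -c "$dir/unmapped.bam")
    mapping_rate "$mapped" "$unmapped"
}
```

Calling `tophat_mapping_rate tophat_out` would then print the estimated percentage for that run, where `tophat_out` stands for whatever output folder was used.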


Thanks Martombo. Could you please tell me where I can find align_summary.txt?


It should be in the TopHat output folder.
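A quick way to check is to search below the directory where TopHat was run. This is a small sketch: `find_align_summary` is a made-up helper name, and the starting directory is whatever was passed to `-o` (TopHat writes to `./tophat_out` when `-o` is not given). If nothing is printed, the file was not written for that run, which may be the case with older TopHat releases.

```shell
# Hypothetical helper: list every align_summary.txt below a directory.
find_align_summary() {
    find "$1" -type f -name align_summary.txt
}
```

For example, `find_align_summary .` searches the current directory and all of its subfolders.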


Hi Martombo, could you please tell me exactly where this file is located? I didn't find it in my output folder.
