PacBio read mapping statistics abnormally high
8.7 years ago

Hi,

I used bwa mem 0.7.12 to map PacBio reads to a reference genome, and now I want some statistics on the mapping. The starting fastq file contains 250,000 reads. I used my favorite tool, Qualimap 2.2, but the numbers of total and mapped reads it reports are abnormally high: 612,000 and 598,000 respectively. I also tried samtools 1.3 (samtools flagstat), and the result is the same: 612,000 total reads and 598,000 mapped reads.

How can this huge difference be explained?

Am I doing something wrong?

Thanks for your help and expertise

Alexis

pacbio mapping • 2.5k views

Check what happens to reads that map equally well to multiple places. Some aligners report all such alignments, in which case the number of mapped reads can be higher than the total number of reads. Also, are you aligning fastq files or raw PacBio files?
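
To make that check concrete, here is a minimal sketch in plain Python that tallies SAM records by flag. It assumes the alignments are available as SAM text lines (for example from `samtools view -h`); the function name `tally_sam_records` and the toy records are made up for illustration. The idea is that one read can produce several records: the primary alignment plus any secondary (flag 0x100) or supplementary (flag 0x800) ones, and tools that count records rather than reads will report more "reads" than the fastq contains.

```python
from collections import Counter

# Flag bits defined in the SAM format
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def tally_sam_records(sam_lines):
    """Count SAM records per category, plus the number of distinct read names."""
    counts = Counter()
    names = set()
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with "@")
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        names.add(name)
        counts["records"] += 1
        if flag & SECONDARY:
            counts["secondary"] += 1
        elif flag & SUPPLEMENTARY:
            counts["supplementary"] += 1
        else:
            counts["primary"] += 1
            if not flag & UNMAPPED:
                counts["primary_mapped"] += 1
    # For single-end data such as PacBio reads, one name corresponds to one read
    counts["distinct_names"] = len(names)
    return counts


# Toy input (hypothetical records): 3 reads, 5 alignment records
example = [
    "@SQ\tSN:chr1\tLN:1000",
    "read1\t0\tchr1\t1\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "read1\t2048\tchr1\t500\t60\t5M5H\t*\t0\t0\tACGTA\t*",
    "read2\t0\tchr1\t100\t0\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "read2\t256\tchr1\t300\t0\t10M\t*\t0\t0\t*\t*",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\t*",
]
counts = tally_sam_records(example)
for key in ("records", "primary", "secondary", "supplementary",
            "primary_mapped", "distinct_names"):
    print(key, counts[key])
```

If the "primary" count comes back close to the number of reads in the fastq while "records" is much larger, the extra records are secondary or supplementary alignments. The same primary-only count should also be obtainable directly with `samtools view -c -F 0x900 file.bam`, where `file.bam` stands in for your own file.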


I would also recommend subsampling to a small number of reads (e.g. 20), running the mapping, and looking at the alignments manually.
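
One way to draw such a subsample, as a rough sketch: reservoir sampling over the fastq, which picks `n` records uniformly at random in a single pass without loading the whole file. This assumes a plain four-lines-per-record fastq (no wrapped sequences, no blank lines between records); the function names and the in-memory demo data are hypothetical.

```python
import io
import random


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) from a 4-line-per-record fastq."""
    while True:
        header = handle.readline()
        if not header.strip():
            # End of file (or a blank line, which this sketch treats as the end)
            return
        seq = handle.readline()
        plus = handle.readline()
        qual = handle.readline()
        yield header, seq, plus, qual


def subsample_fastq(handle, n, seed=0):
    """Return up to n records chosen uniformly at random (reservoir sampling)."""
    rng = random.Random(seed)
    sample = []
    for i, record in enumerate(read_fastq(handle)):
        if i < n:
            # Fill the reservoir with the first n records
            sample.append(record)
        else:
            # Keep the new record with probability n / (i + 1)
            j = rng.randint(0, i)
            if j < n:
                sample[j] = record
    return sample


# Demo on an in-memory fastq with 100 made-up reads
demo = io.StringIO("".join(f"@read{i}\nACGT\n+\nIIII\n" for i in range(100)))
subset = subsample_fastq(demo, 20)
print(len(subset))
print("".join("".join(rec) for rec in subset[:2]), end="")
```

For a real file, pass an opened file handle instead of the `io.StringIO` object and write the joined records to a new fastq before mapping it.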


Good suggestion, I will try it. Thanks!


Hi Noolean, I'm aligning fastq files

8.7 years ago

I suspect reads mapping to repetitive elements.
