Hi,
I have used bwa mem 0.7.12 to map PacBio reads to a reference genome, and now I want some statistics on this mapping. The starting fastq file contains 250,000 reads. I used my favorite tool, Qualimap 2.2, but the numbers of total and mapped reads it reports are abnormally high: 612,000 and 598,000, respectively. I also tried samtools 1.3 (samtools flagstat), and the result is the same: 612,000 total reads and 598,000 mapped reads.
How can this huge difference be explained?
Am I doing something wrong?
Thanks for your help and expertise
Alexis
Check what happens to the reads that map to multiple places equally well. Sometimes aligners report all such alignments, and then the number of mapped reads can be higher than the total number of reads. Also, are you aligning fastq files or raw PacBio files?
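One way to check this is to separate alignment records from distinct reads. As far as I know, samtools flagstat counts records rather than reads, and bwa mem can write several records for one long read (supplementary records, flag 0x800, for split alignments, and secondary records, flag 0x100). Below is a minimal Python sketch that tallies these from SAM text. The function name count_alignments is made up for illustration, and it assumes single-end reads with one name per read.

```python
from collections import Counter

def count_alignments(sam_lines):
    """Tally SAM records by type and count distinct read names.

    Hypothetical helper: assumes tab-separated SAM text and
    single-end reads (one read name per original read).
    """
    counts = Counter()
    names = set()
    for line in sam_lines:
        if line.startswith("@"):       # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:           # skip malformed or blank lines
            continue
        flag = int(fields[1])          # column 2 is the bitwise FLAG
        names.add(fields[0])           # column 1 is the read name
        counts["records"] += 1
        if flag & 0x4:                 # read unmapped
            counts["unmapped"] += 1
        elif flag & 0x100:             # secondary alignment
            counts["secondary"] += 1
        elif flag & 0x800:             # supplementary alignment
            counts["supplementary"] += 1
        else:                          # primary alignment of a mapped read
            counts["primary_mapped"] += 1
    counts["distinct_reads"] = len(names)
    return counts
```

To use it, pass any iterable of SAM lines, for example an open file object for a SAM file produced with samtools view -h. If distinct_reads comes out near 250,000 while records is near 612,000, the extra records are additional alignments of the same reads and not extra reads.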
I would also recommend subsampling to a small number of reads (e.g. 20), running the mapping, and looking at the alignments manually.
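For the subsampling, a minimal Python sketch using reservoir sampling could look like this. The function name subsample_fastq is made up for illustration, and it assumes plain four-line FASTQ records (no wrapped sequence or quality lines).

```python
import random

def subsample_fastq(lines, n, seed=0):
    """Return up to n randomly chosen FASTQ records from an iterable of lines.

    Hypothetical helper: assumes each record is exactly four lines
    (header, sequence, '+', qualities). Uses reservoir sampling, so the
    whole file never has to be held in memory.
    """
    rng = random.Random(seed)          # fixed seed gives a reproducible sample
    reservoir = []
    record = []
    seen = 0
    for line in lines:
        record.append(line)
        if len(record) == 4:           # one complete record collected
            seen += 1
            if len(reservoir) < n:
                reservoir.append(record)
            else:
                j = rng.randrange(seen)    # uniform integer in [0, seen)
                if j < n:
                    reservoir[j] = record  # replace a previously kept record
            record = []
    return reservoir
```

Each returned record is a list of four lines, so writing them back out in order gives a small FASTQ file that can be mapped with the same bwa mem command and inspected by hand.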
Good suggestion, I will try it. Thanks!
Hi Noolean, I'm aligning fastq files.