Why do hisat2 and Qualimap bamqc report different mapping rates?
6.6 years ago
Vasu ▴ 790

Hi,

I'm using hisat2 to align reads to the genome. For a few samples, the hisat2 alignment summary and the bamqc report from Qualimap give different mapping rates.

Hisat2 output:

37317546 reads; of these:
  37317546 (100.00%) were paired; of these:
    14771091 (39.58%) aligned concordantly 0 times
    7081700 (18.98%) aligned concordantly exactly 1 time
    15464755 (41.44%) aligned concordantly >1 times
    ----
    14771091 pairs aligned concordantly 0 times; of these:
      1186424 (8.03%) aligned discordantly 1 time
    ----
    13584667 pairs aligned 0 times concordantly or discordantly; of these:
      27169334 mates make up the pairs; of these:
        22785681 (83.87%) aligned 0 times
        1973892 (7.27%) aligned exactly 1 time
        2409761 (8.87%) aligned >1 times
69.47% overall alignment rate

For the same sample, the "qualimap bamqc" results for the BAM file are as follows:

Reference

 number of bases = 3,099,750,718 bp
 number of contigs = 194


Globals

 number of windows = 593

 number of reads = 202,671,876
 number of mapped reads = 179,886,195 (88.76%)

 number of mapped paired reads (first in pair) = 90,666,939
 number of mapped paired reads (second in pair) = 89,219,256
 number of mapped paired reads (both in pair) = 171,622,685
 number of mapped paired reads (singletons) = 8,263,510
 number of mapped bases = 30,000,606,541 bp
 number of sequenced bases = 8,238,989,876 bp
 number of aligned bases = 0 bp
 number of duplicated reads (estimated) = 95,761,476
 duplication rate = 25.6%


 Insert size

 mean insert size = 29,714.41
 std insert size = 464,081.65
 median insert size = 1199


Mapping quality

 mean mapping quality = 13.82


ACTG content

 number of A's = 1,679,640,440 bp (20.39%)
 number of C's = 2,133,982,067 bp (25.9%)
 number of T's = 1,805,802,126 bp (21.92%)
 number of G's = 2,619,565,243 bp (31.79%)
 number of N's = 0 bp (0%)

 GC percentage = 57.7%


 Mismatches and indels

general error rate = 0
number of mismatches = 32,158,659
number of insertions = 876,201
mapped reads with insertion percentage = 0.49%
number of deletions = 174,885
mapped reads with deletion percentage = 0.1%
homopolymer indels = 24.62%

In the hisat2 output the overall alignment rate is 69.47%, but in the bamqc results the number of mapped reads is 88%. Which one is right?

RNA-Seq hisat2 qualimap bamqc • 2.9k views

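Both metrics are right; they use different denominators. In your BAM file, 88% of the records are mapped. Of your input reads, only 69% are mapped (once or more than once). The multiple alignments are causing the difference: a read that aligns in several places contributes several mapped records to the BAM file.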
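
To make the read-versus-record distinction concrete, here is a toy Python sketch. The four SAM records are made up for illustration (they are not from the sample above), and `classify_records` is a hypothetical helper name; the flag bits 0x4 (unmapped), 0x100 (secondary) and 0x800 (supplementary) are the ones defined in the SAM specification. It assumes, as the figures in the question suggest, that the per-record percentage counts every line in the file, including secondary alignments.

```python
# SAM flag bits (from the SAM specification)
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def classify_records(sam_lines):
    """Count mapped lines per record (all lines) and per read (primary lines only)."""
    counts = {"records": 0, "mapped_records": 0, "reads": 0, "mapped_reads": 0}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        flag = int(line.split("\t")[1])
        is_mapped = not (flag & UNMAPPED)
        is_primary = not (flag & (SECONDARY | SUPPLEMENTARY))
        counts["records"] += 1
        counts["mapped_records"] += is_mapped
        counts["reads"] += is_primary
        counts["mapped_reads"] += is_primary and is_mapped
    return counts


# Made-up example: readA aligns to three places, readB does not align at all.
toy_sam = [
    "@HD\tVN:1.6",
    "readA\t0\tchr1\t100\t1\t50M\t*\t0\t0\t*\t*",
    "readA\t256\tchr2\t500\t1\t50M\t*\t0\t0\t*\t*",
    "readA\t256\tchr3\t900\t1\t50M\t*\t0\t0\t*\t*",
    "readB\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]

toy_counts = classify_records(toy_sam)
per_record_pct = 100 * toy_counts["mapped_records"] / toy_counts["records"]
per_read_pct = 100 * toy_counts["mapped_reads"] / toy_counts["reads"]
print(f"mapped per record: {per_record_pct:.1f}%")
print(f"mapped per read:   {per_read_pct:.1f}%")
```

In this toy case 3 of 4 records are mapped (75%), but only 1 of 2 reads is (50%): the same kind of gap as between the 88% and the 69% above.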

OK. And how can I get the percentage of unmapped reads? Does the 88% of mapped reads mean reads that mapped once?

The percentage of unmapped reads (compared to the total number of reads) is 30.53%.

The percentage of unmapped reads (compared to the total number of alignments in the bam file) is about 12% (11.24% if you use the unrounded 88.76%).

Thank you. But could you please tell me how this total number of reads and total number of alignments are different? And could you also tell me how you calculated the above percentages?

But could you please tell me how this total number of reads and total number of alignments are different?

Because one read can have more than one alignment: in your hisat2 summary, 41.44% of the pairs aligned concordantly >1 times.

And could you also tell me how you calculated the above percentages?

100% - 69.47% = 30.53% (1 - (number of reads that map at least once/total number of reads) = proportion of unmapped reads)

100% - 88% = 12% (1 - (number of effective alignments/total number of entries in the bam file) = proportion of unmapped reads in the bam file)
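
The two formulas above can be checked against the figures posted in the question. This is a minimal Python sketch: the variable names are mine, and the numbers are copied from the hisat2 summary and the Qualimap report. It assumes the hisat2 rate is computed per mate (two mates per pair), which reproduces the reported 69.47%.

```python
# Figures copied from the hisat2 summary in the question
hisat2_pairs = 37_317_546
hisat2_unaligned_mates = 22_785_681  # "aligned 0 times"

# Figures copied from the Qualimap bamqc report in the question
bam_records = 202_671_876         # "number of reads"
bam_mapped_records = 179_886_195  # "number of mapped reads"

# Assumption: each pair contributes two mates to the hisat2 denominator
input_mates = 2 * hisat2_pairs
bam_unmapped_records = bam_records - bam_mapped_records

# Same unmapped mates, two different denominators
hisat2_rate = 100 * (input_mates - hisat2_unaligned_mates) / input_mates
bamqc_rate = 100 * bam_mapped_records / bam_records

# Records in the BAM file beyond one per input mate
extra_records = bam_records - input_mates

print(f"hisat2 overall alignment rate: {hisat2_rate:.2f}%")
print(f"bamqc mapped records:          {bamqc_rate:.2f}%")
print(f"unmapped (hisat2 / BAM):       {hisat2_unaligned_mates:,} / {bam_unmapped_records:,}")
print(f"extra BAM records:             {extra_records:,}")
```

Both reports contain the same 22,785,681 unmapped mates; only the denominator differs (74,635,092 input mates versus 202,671,876 BAM records). The extra 128,036,784 records are presumably the additional alignments of multi-mapping reads.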

Thank you very much. I guess there is a typo in your comment. It should be 100% - 88% = 12%.
