PF_INDEL_RATE calculation and definition
23 months ago
bitpir ▴ 250

Hello,

I was hoping someone could verify how PF_INDEL_RATE is calculated in AlignmentSummaryMetrics.

The definition says: "PF_INDEL_RATE: The number of insertion and deletion events per 100 aligned bases. Uses the number of events as the numerator, not the number of inserted or deleted bases".

The code on GitHub that calculates PF_INDEL_RATE is this:

metrics.PF_INDEL_RATE = MathUtil.divide(indels, (double) metrics.PF_ALIGNED_BASES);

I can't find anything in the code that scales the rate to "per 100 aligned bases". For a sample with PF_ALIGNED_BASES of 115694467 and a PF_INDEL_RATE of 0.0002, I want to verify which reading is correct: about 2 indel events per 1,000,000 aligned bases (0.0002 per 100 aligned bases, as the definition says), or about 23,139 indel events across all 115694467 aligned bases (0.0002 * 115694467, as the code suggests).
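To make the two readings concrete, here is a minimal Python sketch of the arithmetic using the numbers above. The helper name `implied_events` and its `per_bases` parameter are made up for illustration and are not part of Picard.

```python
# Numbers quoted in the question.
pf_aligned_bases = 115_694_467
pf_indel_rate = 0.0002


def implied_events(rate, aligned_bases, per_bases):
    """Indel events implied by `rate` if it means 'events per `per_bases` aligned bases'.

    Hypothetical helper for illustration only.
    """
    return rate * aligned_bases / per_bases


# Reading 1: the documented definition (events per 100 aligned bases).
as_documented = implied_events(pf_indel_rate, pf_aligned_bases, per_bases=100)

# Reading 2: the code as written (events divided by aligned bases, no scaling).
as_coded = implied_events(pf_indel_rate, pf_aligned_bases, per_bases=1)

print(round(as_documented))  # about 231 events
print(round(as_coded))       # about 23139 events
```

The two readings differ by a factor of 100, so an independent count of indels in the BAM file should be enough to tell them apart.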

Thanks!

picard AlignmentSummaryMetrics picardmetrics • 748 views
23 months ago
bitpir ▴ 250

I think I just answered my own question. As I suspected, PF_INDEL_RATE is calculated exactly as the code is written, as a plain fraction and not as events per 100 aligned bases:

#indel events / PF_ALIGNED_BASES

As a quick check, I counted indels from the CIGAR strings:

samtools view <bam> | cut -f6 | grep -c "[ID]"
# I = insertion; D = deletion
# counts reads whose CIGAR string contains at least one I or D

My indel count came to 39262, and AlignmentSummaryMetrics reported a PF_ALIGNED_BASES of 235464581, which gives an indel rate of ~0.00017 (39262 / 235464581). That is close to the reported PF_INDEL_RATE (pair) of 0.00018.
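One possible reason for the small gap: the grep check above counts reads that contain an indel, while the definition counts indel events, and a single read can carry more than one. Below is a rough Python sketch of counting events per CIGAR string. The function names and example CIGARs are made up for illustration, the zero-denominator guard is an assumption about how MathUtil.divide behaves, and this does not reproduce Picard's read filtering.

```python
import re

# One CIGAR operation: a length followed by an operation code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def count_indel_events(cigar):
    """Count insertion (I) and deletion (D) operations in one CIGAR string."""
    return sum(1 for _, op in CIGAR_OP.findall(cigar) if op in "ID")


def indel_rate(cigars, aligned_bases):
    """Indel events divided by aligned bases, with no per-100 scaling.

    Returns 0.0 for a zero denominator (assumed to mirror MathUtil.divide).
    """
    if aligned_bases == 0:
        return 0.0
    events = sum(count_indel_events(c) for c in cigars)
    return events / aligned_bases


# Made-up CIGAR strings; "*" marks a read with no alignment.
example_cigars = ["50M", "20M1I29M", "10M2D15M1I25M", "*"]

events = sum(count_indel_events(c) for c in example_cigars)
reads_with_indels = sum(1 for c in example_cigars if count_indel_events(c) > 0)
print(events, reads_with_indels)  # 3 events across 2 reads

# The estimate from the post, using the read-level count as the numerator.
print(f"{39262 / 235464581:.5f}")  # 0.00017
```

In practice the CIGAR strings could come from column 6 of `samtools view` output, as in the command above. Counting events instead of reads can only raise the numerator, which is at least consistent with the reported 0.00018 being slightly higher than the estimate.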

Hope this helps!
