What Is AD (Allelic Depth) In 1000Genomes VCF?
13.8 years ago
Chronos ▴ 620

The VCF header says

AD: Allelic depths for the ref and alt alleles in the order listed

I understand what "read depth" and "allele" are, but I do not quite understand what "allelic depth" means.

I've read both the VCF 4.0 specification and the VCF development history, but neither of them had the answer.

read allele genome • 33k views

Hello,

This post is so helpful.

My question: according to the quality-scoring guidance for variants, a good AD threshold is >8. How do we apply that in a VCF file where AD is written as two numbers, e.g. 9,8? @Casbon

13.8 years ago
Casbon ★ 3.3k

Allele-specific depth, i.e. if ref is 'A', alt is 'G' and AD is '6,9', you got 6 A reads and 9 G reads.
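A minimal Python sketch of this reading, assuming AD is a comma-separated list with the ref count first, followed by one count per alt allele in the order listed. The helper name `allelic_depths` is made up for illustration and is not part of any VCF library.

```python
def allelic_depths(ref, alts, ad_field):
    """Pair each allele with its depth, given an AD string such as '6,9'.

    Assumes the first number belongs to the ref allele and the rest
    follow the alt alleles in the order they are listed.
    """
    alleles = [ref] + list(alts)
    depths = [int(x) for x in ad_field.split(",")]
    if len(alleles) != len(depths):
        raise ValueError("AD must have one value per allele (ref + alts)")
    return dict(zip(alleles, depths))


depths = allelic_depths("A", ["G"], "6,9")
print(depths)  # 6 reads support A, 9 reads support G
```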

If this is produced by GATK look in the header output - it usually explains each flag.

EDIT:

I checked GATK: read depth (DP) counts only the reads used in calling.


So, to clarify:

DP is equal to the number of reads used for calling the variant

AD numbers are the total number of reads with each allele (including reads not used for calling due to low quality or whatever).

Seems un-intuitive to me, but I'll take it.


I thought so initially, but then 6+9 should equal DP, correct? However, this is not the case for 1000 genomes data: there the sum of ADs is >= DP.

Or does AD represent a count of per-allele unfiltered reads? DP is for filtered reads only, so it would explain why Sum(AD) >= DP.


Interesting, sounds like a bug to me.


Might be, I'm not sure (hence the question). Here's an example line fragment:

GT:AD:DP:GQ:PL 0/0:1,0:1:3.01:0,3,33 0/0:4,0:2:6.01:0,6,63 0/0:2,1:1:3:0,3,28 ./. 0/0:1,0:1:2.37:0,2,8 0/0:3,0:1:3.01:0,3,33 0/0:7,0:3:9.03:0,9,109 ./. ./. 0/0:11,0:3:9.01:0,9,85 ./. 0/0:37,0:1:2.99:0,3,23 ./. 0/0:9,0:5:15.03:0,15,158 ./. ./. 0/1:16,1:2:21.91:29,0,22 ./. ./. 0/0:3,0:1:3.01:0,3,33 0/0:2,1:2:6.01:0,6,59

This is X:60009 A>C from ALL.chrX.BI_Beagle.20100804.genotypes.vcf.gz.
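To check the Sum(AD) >= DP observation on this fragment, here is a small Python sketch. It assumes the sample columns follow the `GT:AD:DP:GQ:PL` layout shown, with whitespace normalised and the run-together columns split apart; the function name `ad_sum_and_dp` is hypothetical.

```python
FORMAT = "GT:AD:DP:GQ:PL"

# Sample columns from the fragment above (whitespace normalised,
# fused columns separated). "./." marks a sample with no call.
SAMPLES = (
    "0/0:1,0:1:3.01:0,3,33 0/0:4,0:2:6.01:0,6,63 0/0:2,1:1:3:0,3,28 ./. "
    "0/0:1,0:1:2.37:0,2,8 0/0:3,0:1:3.01:0,3,33 0/0:7,0:3:9.03:0,9,109 "
    "./. ./. 0/0:11,0:3:9.01:0,9,85 ./. 0/0:37,0:1:2.99:0,3,23 ./. "
    "0/0:9,0:5:15.03:0,15,158 ./. ./. 0/1:16,1:2:21.91:29,0,22 ./. ./. "
    "0/0:3,0:1:3.01:0,3,33 0/0:2,1:2:6.01:0,6,59"
).split()


def ad_sum_and_dp(format_str, sample):
    """Return (sum of AD values, DP) for one sample column, or None if absent."""
    fields = dict(zip(format_str.split(":"), sample.split(":")))
    ad, dp = fields.get("AD"), fields.get("DP")
    if ad in (None, ".") or dp in (None, "."):
        return None  # uncalled sample such as "./."
    return sum(int(x) for x in ad.split(",")), int(dp)


rows = [r for r in (ad_sum_and_dp(FORMAT, s) for s in SAMPLES) if r is not None]
for ad_total, dp in rows:
    print(f"sum(AD)={ad_total:>2}  DP={dp}  sum(AD)>=DP: {ad_total >= dp}")
```

On these columns, every called sample has a summed AD at least as large as its DP, which matches what was described above.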

8.0 years ago
anderspitman ▴ 70

From the GATK forum:

AD and DP : Allele depth and depth of coverage.

These are complementary fields that represent two important ways of thinking about the depth of the data for this sample at this site.

AD is the unfiltered allele depth, i.e. the number of reads that support each of the reported alleles. All reads at the position (including reads that did not pass the variant caller’s filters) are included in this number, except reads that were considered uninformative. Reads are considered uninformative when they do not provide enough statistical evidence to support one allele over another.

DP is the filtered depth, at the sample level. This gives you the number of filtered reads that support each of the reported alleles. You can check the variant caller’s documentation to see which filters are applied by default. Only reads that passed the variant caller’s filters are included in this number. However, unlike the AD calculation, uninformative reads are included in DP.
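The distinction in the quoted text can be illustrated with a toy Python model. This is only a sketch of the definitions as quoted, not GATK's actual code: the `Read` class and the `toy_ad` and `toy_dp` functions are invented for illustration, and each read is simply tagged by hand as informative or not and as passing filters or not.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Read:
    allele: Optional[str]  # allele the read supports; None if uninformative
    passes_filters: bool   # whether the caller's filters kept the read


def toy_ad(reads, alleles):
    """Unfiltered per-allele counts: all informative reads, filtered or not."""
    counts = Counter(r.allele for r in reads if r.allele is not None)
    return [counts[a] for a in alleles]


def toy_dp(reads):
    """Filtered depth: reads that passed filters, informative or not."""
    return sum(1 for r in reads if r.passes_filters)


reads = [
    Read("A", True), Read("A", True), Read("A", False),  # one A read filtered out
    Read("G", True), Read("G", False),                   # one G read filtered out
    Read(None, True),                                    # uninformative but kept
]

ad = toy_ad(reads, ["A", "G"])
dp = toy_dp(reads)
print("AD =", ",".join(map(str, ad)), " DP =", dp)
```

Under these definitions the sum of AD can be larger or smaller than DP, depending on how many reads are filtered out versus uninformative.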


I think you've actually got it backwards. From it looks like DP is all reads (informative and uninformative) whereas AD is only informative reads.
