NS, AC, AN and AF in VCF file
2.3 years ago
wangdp123 ▴ 340

Hi there,

I am working with some VCF files and I find it difficult to understand some of the tags in the INFO column of VCF files.

1) Does NS (number of samples with data) refer to the number of samples with genotype "0/1" or "1/1"?

2) Does AC (allele count in genotypes) refer to the number of "1" in the genotypes "0/1" and "1/1"?

3) Does AN (total number of alleles in called genotypes) refer to the number of "0" and "1" in all three genotypes namely "0/0", "0/1" and "1/1"?

4) Should AF (allele frequency) be calculated as AF = AC/AN, where AC is the allele count in genotypes and AN is the total number of alleles in called genotypes? Or should another formula be used, AF = AC/(number of samples * 2)? Usually AN is not equal to (number of samples * 2) because of missing alleles in the GT field.

5) I see some INFO tags in VCF files generated by bcftools (plugin fill-tags) where AF = AC/(number of samples * 2) appears to be used rather than AF = AC/AN. Why is that? In addition, the NS and AN values provided by bcftools (plugin fill-tags) do not seem correct when I count them manually according to the above-mentioned definitions. Has anyone come across this before?
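
To make the comparison in questions 4 and 5 concrete, the manual counting can be sketched roughly as below. This is only an illustration, not how bcftools computes its tags. It assumes a biallelic site, GT strings separated by "/" or "|", "." for a missing allele, and that a sample counts toward NS if at least one of its alleles is called. The function name `count_tags` and the keys `AF_called` and `AF_all` are made up for the example.

```python
import re


def count_tags(genotypes):
    """Count NS, AN and AC by hand for one biallelic site.

    Assumption: a sample counts toward NS when at least one allele is called.
    """
    ns = an = ac = 0
    for gt in genotypes:
        # Split phased ("|") or unphased ("/") genotypes into alleles.
        alleles = re.split(r"[/|]", gt)
        called = [a for a in alleles if a != "."]
        if called:
            ns += 1
        an += len(called)                           # only called alleles
        ac += sum(1 for a in called if a == "1")    # copies of the ALT allele
    n_samples = len(genotypes)
    return {
        "NS": ns,
        "AN": an,
        "AC": ac,
        # Candidate 1: divide by the alleles actually called.
        "AF_called": ac / an if an else None,
        # Candidate 2: divide by 2 * all samples, missing ones included.
        "AF_all": ac / (2 * n_samples) if n_samples else None,
    }


example = ["0/0", "0/1", "1/1", "./."]
tags = count_tags(example)
print(tags)
```

With one missing genotype out of four samples, the two AF candidates differ (AC/AN uses 6 alleles in the denominator, AC/(number of samples * 2) uses 8), which is the discrepancy being asked about.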

Many thanks,

Kind regards,

Tom


Dear Tom, did you resolve this? I need exactly this information. Please share what you finally found out.
