The meaning of '0' allele frequency in vcftools output
7.2 years ago

Hello!

The output of the vcftools '--freq' option can look like this, for example:

CHROM  POS     N_ALLELES  N_CHR  {ALLELE:FREQ}
1      861276  2          698    A:1  G:0
1      861292  2          698    C:1  G:0
1      861298  2          698    G:1  A:0
1      861315  2          698    G:1  A:0

The point here is that the listed alleles have frequencies of exactly 1 and 0. What does that mean? Why mention the second allele if its frequency is 0?
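For anyone who wants to inspect such a report programmatically, here is a minimal sketch that parses the rows quoted above and lists the sites where one allele has frequency 1. The function name `parse_freq_line` and the dictionary layout are made up for illustration; they are not part of vcftools.

```python
def parse_freq_line(line):
    """Parse one data row of a vcftools --freq report (hypothetical helper)."""
    fields = line.split()  # any run of tabs or spaces separates the columns
    freqs = {}
    for entry in fields[4:]:  # remaining columns are ALLELE:FREQ pairs
        allele, freq = entry.rsplit(":", 1)
        freqs[allele] = float(freq)
    return {
        "chrom": fields[0],
        "pos": int(fields[1]),
        "n_alleles": int(fields[2]),
        "n_chr": int(fields[3]),
        "freqs": freqs,
    }

# The four data rows from the post (header line omitted).
rows = [
    "1\t861276\t2\t698\tA:1\tG:0",
    "1\t861292\t2\t698\tC:1\tG:0",
    "1\t861298\t2\t698\tG:1\tA:0",
    "1\t861315\t2\t698\tG:1\tA:0",
]
records = [parse_freq_line(r) for r in rows]

# Sites where a single allele accounts for every counted chromosome.
monomorphic = [r["pos"] for r in records if max(r["freqs"].values()) == 1.0]
print(monomorphic)
```

In this sample every row is flagged, which matches the observation that each site shows one allele at 1 and the other at 0.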

Is it a rounding problem?

I have 349 individuals, i.e. 698 chromosomes (the N_CHR column), so I'm thinking a nonzero allele frequency can't be lower than 1/698, which is nowhere near the rounding limit of a double-precision float...
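A quick check of that arithmetic, as a sketch: frequencies are counted per chromosome, so the relevant denominator is N_CHR = 698 rather than 349. The diploid assumption below is inferred from 698 = 2 x 349 and is not stated anywhere in the report itself.

```python
import sys

n_individuals = 349
ploidy = 2  # assumption: diploid samples, consistent with N_CHR = 698
n_chr = n_individuals * ploidy

# Smallest frequency an allele can have if it is observed at least once.
min_nonzero = 1 / n_chr

print(n_chr)                   # 698
print(f"{min_nonzero:.6f}")    # 0.001433
print(sys.float_info.epsilon)  # double-precision machine epsilon, about 2.2e-16

# The smallest nonzero frequency is many orders of magnitude above epsilon.
print(min_nonzero > sys.float_info.epsilon)  # True
```

So a reported 0 is very unlikely to be a rounded-down small value; it is more plausibly a true count of zero.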

Grateful for any light shed on this! :-]

vcf vcftools MAF allele frequency allele • 3.7k views

Why couldn't it be 0, if they're all homozygous reference?
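To illustrate how such a row could arise, here is a sketch that counts alleles from genotype strings. It assumes all 349 samples are `0/0` at a site whose record still lists G as the alternate allele; the function `allele_frequencies` is hypothetical and is not how vcftools is implemented.

```python
from collections import Counter

def allele_frequencies(genotypes, alleles):
    """Count allele indices in GT strings such as '0/0' or '0|1' (hypothetical helper)."""
    counts = Counter()
    for gt in genotypes:
        for idx in gt.replace("|", "/").split("/"):
            if idx != ".":  # skip missing calls
                counts[int(idx)] += 1
    n_chr = sum(counts.values())
    if n_chr == 0:  # no called genotypes at all
        return 0, {}
    # Every allele listed in the record gets a frequency, even if never observed.
    return n_chr, {a: counts[i] / n_chr for i, a in enumerate(alleles)}

# Assumption for illustration: all 349 samples are homozygous reference.
genotypes = ["0/0"] * 349
n_chr, freqs = allele_frequencies(genotypes, ["A", "G"])
print(n_chr, freqs)  # 698 {'A': 1.0, 'G': 0.0}
```

The alternate allele appears in the output because it is listed in the record, not because any sample carries it.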

Sure, but look at the positions. Some positions (e.g. 861277-861291) are skipped, and I take that to mean 'no alleles found'?
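To see exactly which positions are absent from the report, a small sketch like the following lists the gaps between consecutive rows. It assumes the positions come from a single chromosome and are sorted; `position_gaps` is a made-up helper name.

```python
def position_gaps(positions):
    """Return (first_missing, last_missing) ranges between consecutive positions."""
    gaps = []
    for prev, cur in zip(positions, positions[1:]):
        if cur - prev > 1:
            gaps.append((prev + 1, cur - 1))
    return gaps

# Positions from the four rows quoted in the original post.
positions = [861276, 861292, 861298, 861315]
print(position_gaps(positions))
# [(861277, 861291), (861293, 861297), (861299, 861314)]
```

This only shows where rows are missing; it says nothing about why they are missing.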

That really depends on how you did the genotyping.

What exactly does that encompass?

That is, which tools produced the calls: GATK? Samtools + FreeBayes? Something else? Was joint calling used? Was the output dataset filtered? I'm trying to determine why there are gaps in your report. A quality filter may have been applied, so that the missing positions failed the quality thresholds and were omitted, but without more information it's hard to say.

I don't know how the analysis that produced my source .vcf file was carried out, but the output in my original post is from vcftools.

I'll ask the .vcf file supplier if they know...
