VCF file columns: biological explanation
6.7 years ago
GK1610 ▴ 120

I am working on joint genotype calling of gVCF files from 200 samples. This is my first time working with VCF data. I understand the format, but I am struggling with some basic questions:

What is a reference allele, e.g. in the hg19 files from 1000 Genomes phase 3? Is it the allele seen in most people's sequences?

What is an alternate allele? Is it the minor allele at that variant position?

What is the NON_REF allele in the ALT (alternate allele) column?

SNP • 2.2k views
6.7 years ago

The reference genome was built by sequencing several individuals and combining the results into one haploid set of chromosomes. The nucleotides in this reference are not necessarily the most frequent ones in the population. The reference is therefore not the genome of any actual human, but just something to compare our reads with. It may contain haplotypes that do not exist in reality, because multiple diploid individuals were collapsed into a single haploid genome.

For a variant, the reference allele is the nucleotide (or sequence) of the reference genome at that position. An alternate allele is any allele that does not match the reference allele. It may be the minor allele, but not necessarily: the reference allele itself can be the minor allele in the population.
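
To make the REF/ALT versus major/minor distinction concrete, here is a minimal Python sketch that counts alleles from the GT fields of a VCF data line. The two records and their genotypes are invented for illustration, and the helper names (`allele_frequencies`, `is_symbolic`, `minor_allele`) are hypothetical. The second record also carries `<NON_REF>`, which, as far as I know, is the symbolic placeholder GATK writes in gVCF output to stand for any alternate allele not observed at the site; the sketch simply treats alleles written in angle brackets as symbolic rather than as real sequence.

```python
from collections import Counter

# Invented example records (tab-separated VCF data lines, 4 samples each).
VCF_LINES = [
    "chr1\t1000\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t1/1\t1/1\t0/1",
    "chr1\t2000\t.\tC\tT,<NON_REF>\t50\tPASS\t.\tGT\t0/0\t0/1\t0/0\t0/0",
]


def allele_frequencies(line):
    """Return {allele: frequency} for one VCF data line, counted from GT."""
    fields = line.rstrip("\n").split("\t")
    ref, alt = fields[3], fields[4]
    alleles = [ref] + alt.split(",")  # index 0 is REF, 1.. are ALT alleles
    gt_index = fields[8].split(":").index("GT")
    counts = Counter()
    for sample in fields[9:]:
        gt = sample.split(":")[gt_index]
        for idx in gt.replace("|", "/").split("/"):
            if idx != ".":  # skip missing calls
                counts[alleles[int(idx)]] += 1
    total = sum(counts.values())
    return {a: counts[a] / total for a in alleles} if total else {}


def is_symbolic(allele):
    """Symbolic alleles such as <NON_REF> are written in angle brackets."""
    return allele.startswith("<") and allele.endswith(">")


def minor_allele(freqs):
    """Least frequent non-symbolic allele among those listed in the record."""
    real = {a: f for a, f in freqs.items() if not is_symbolic(a)}
    return min(real, key=real.get)


for line in VCF_LINES:
    ref = line.split("\t")[3]
    freqs = allele_frequencies(line)
    print(ref, freqs, "minor allele:", minor_allele(freqs))
```

In the first record the reference allele A is the minor allele among these four samples, while in the second the alternate allele T is. Frequencies computed from a handful of samples say little about the population; they only show that REF and "major allele" are separate concepts.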

Thanks. This is awesome!

Happy to help. If an answer was helpful, you should upvote it; if it resolved your question, you should mark it as accepted.

Cheers, Wouter

To find the reference allele frequency, look at the G5, G5A, KG and KG-PROD tags in dbSNP (use the dbSNP build that corresponds to the NCBI genome equivalent to hg19). dbSNP includes allele frequencies from the 1000 Genomes and HapMap projects.

Thanks, but this document doesn't answer my questions.
