Forum: Pros and cons of using MAF values from earlier as well as the latest release of 1000g for selection of rare variants
9.9 years ago
ShahiRB ▴ 30

Can anyone explain to me the pros and cons of using MAF values from "earlier" releases along with the "latest" release of 1000g for the selection of rare variants? [I am more curious about the rationale for using earlier releases.]

next-gen genome SNP • 3.7k views

You must have an automatic spelling corrector converting the word "from" to "form" ;-)

9.9 years ago
Vivek ★ 2.7k

There's a discrepancy of about 2 million variants between the variant calls released as part of Phase 1 and those released as part of Phase 3. Most recently, the 1000 Genomes consortium released the following categories to account for the sites missing from Phase 3. So it's entirely your preference how you treat a high-MAF variant that was called in Phase 1 but dropped in Phase 3.

sample_dropout - sites missing because the samples carrying the ALT alleles were dropped in phase3;
not_called_in_p3 - sites missing because they were not called initially by any phase3 caller;
failed_svm_filter - sites missing because they failed phase3 SVM filter;
ovl_p3_larger_event - sites missing because they overlap with phase3 released indels or SVs; or they overlap with unfiltered mvncall set;
not_in_p3_shapeit2_GL_files - sites missing because they are not included in the phase3 shapeit2 genotype likelihood files;
possible_patchup_candidate - missing SNP sites that cannot be well explained. They may be possible patch up candidates;
indel_unknown - missing indel sites that cannot be explained by sample dropout;
sv_unknown - missing structural variation sites that cannot be explained by sample dropout;
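
One way to turn that preference into an explicit rule is sketched below. It is only an illustration: the function names, the 1% threshold, and the decision to distrust Phase 1 calls tagged failed_svm_filter or ovl_p3_larger_event are all assumptions made for the example, not consortium guidance. When a site is present in both releases, the sketch conservatively takes the larger MAF.

```python
from typing import Optional

# Missing-site reasons for which the Phase 1 call is treated as unreliable.
# This particular choice is an assumption made for illustration only.
DISTRUST_PHASE1 = {"failed_svm_filter", "ovl_p3_larger_event"}


def effective_maf(phase1_maf: Optional[float],
                  phase3_maf: Optional[float],
                  missing_reason: Optional[str] = None) -> Optional[float]:
    """Return the MAF to filter on, or None if no usable frequency exists."""
    if phase3_maf is not None:
        # Site is present in Phase 3: conservatively take the larger value.
        if phase1_maf is not None:
            return max(phase1_maf, phase3_maf)
        return phase3_maf
    # Site is absent from Phase 3: fall back to Phase 1 unless distrusted.
    if phase1_maf is None or missing_reason in DISTRUST_PHASE1:
        return None
    return phase1_maf


def is_rare(phase1_maf: Optional[float],
            phase3_maf: Optional[float],
            missing_reason: Optional[str] = None,
            threshold: float = 0.01) -> bool:
    """Treat a variant as rare if its usable MAF is below the threshold.

    A variant with no usable frequency is treated as rare (hypothetical policy).
    """
    maf = effective_maf(phase1_maf, phase3_maf, missing_reason)
    return maf is None or maf < threshold
```

With this rule, a variant at MAF 0.2 in Phase 1 that is missing from Phase 3 because of sample_dropout is still excluded as common, while the same variant tagged failed_svm_filter is treated as having no reliable frequency. Swapping the contents of DISTRUST_PHASE1 is where your own preference comes in.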