Sanger Imputation Server - genotype probability distribution
6.3 years ago

Hello

I need to run an imputation using the Sanger Imputation Server.

I prepared my data (aligned to the reference panel) and submitted it, but I received the following e-mail:

Update from Sanger Imputation Service:

--- Aborted Job --- The genotype probability distribution in the input file does not match the reference panel frequencies well. The number of genotypes expected with low frequencies under HWE (with P<=0.1) is too big in the user data: 0.59 whereas the threshold is 0.26. For comparison, the number of these genotypes in 1000Genomes data is 0.17, the attached plot shows typical GT distributions

This is usually an indicator of REF,ALT alleles being on incorrect strand. Another frequent problem is the VCF using a different reference sequence, for example GRCh38 instead of GRCh37.

The attached graph was produced using the bcftools/af-dist plugin, check these links http://samtools.github.io/bcftools/howtos/plugin.af-dist.html http://samtools.github.io/bcftools/howtos/plugin.fixref.html

--- Help --- Please check these links for help

    http://imputation.sanger.ac.uk/?resources=1
    http://imputation.sanger.ac.uk/?instructions=1

How can I solve this?

Thanks

SNP alignment software error Sanger Imputation
2.8 years ago
Dan ▴ 540

Check the file with the VCF debugulator: https://github.com/EBIvariation/vcf-validator

It will tell you which positions do and do not match the reference, which is a good way to confirm that your VCF uses the expected reference build.

I'm not sure how you could have strand-swapped the calls, but go in and check a few sites by hand against the reference panel.
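If it helps to spot-check sites programmatically, here is a minimal Python sketch of the comparison. It is only an illustration for biallelic SNPs, not what the server or the bcftools fixref plugin actually runs, and the function name `classify_site` and its labels are made up for this example. Note that A/T and C/G SNPs look identical on both strands, so their strand cannot be resolved from the alleles alone.

```python
# Hypothetical helper for spot-checking biallelic SNPs against a reference panel.
# Assumes single-base A/C/G/T alleles; anything else raises KeyError.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def classify_site(user_ref, user_alt, panel_ref, panel_alt):
    """Compare the user's REF/ALT with the panel's REF/ALT at one site."""
    user = (user_ref.upper(), user_alt.upper())
    panel = (panel_ref.upper(), panel_alt.upper())
    # The same alleles as read from the opposite strand.
    flipped = (COMPLEMENT[user[0]], COMPLEMENT[user[1]])

    # A/T and C/G SNPs are their own complement, so strand is undecidable here.
    if COMPLEMENT[user[0]] == user[1]:
        return "ambiguous"
    if user == panel:
        return "match"
    if user == panel[::-1]:
        return "ref_alt_swapped"
    if flipped == panel:
        return "strand_flipped"
    if flipped == panel[::-1]:
        return "strand_flipped_and_swapped"
    return "mismatch"


if __name__ == "__main__":
    # Example sites: (user REF, user ALT, panel REF, panel ALT)
    for site in [("A", "G", "A", "G"), ("T", "C", "A", "G"), ("A", "T", "A", "T")]:
        print(site, classify_site(*site))
```

If most non-ambiguous sites come back as flipped or swapped, that points to a strand or REF/ALT orientation problem. If many come back as plain mismatches, a different reference build (for example GRCh38 coordinates against a GRCh37 panel) is the more likely cause.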

Is your population very different from those in 1000 Genomes?
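To make the server's message more concrete: the check asks how often your genotypes would be rare under Hardy-Weinberg equilibrium (HWE), given the panel's allele frequencies. The Python sketch below shows that idea in a simplified form. It is an assumption about the general principle, not the exact calculation of the af-dist plugin, and the function names and the input format (genotype label plus panel ALT frequency) are invented for illustration.

```python
# Simplified illustration of "genotypes with low probability under HWE".
# Not the af-dist plugin's exact calculation; names and input format are hypothetical.

def hwe_genotype_probs(alt_af):
    """Expected genotype probabilities under HWE for a panel ALT allele frequency."""
    p = alt_af
    return {
        "RR": (1 - p) ** 2,    # homozygous reference
        "RA": 2 * p * (1 - p), # heterozygous
        "AA": p ** 2,          # homozygous alternate
    }


def low_prob_fraction(calls, threshold=0.1):
    """Fraction of calls whose genotype has HWE probability <= threshold.

    calls: iterable of (genotype, panel_alt_af), genotype in {"RR", "RA", "AA"}.
    """
    calls = list(calls)
    if not calls:
        return 0.0
    low = sum(1 for gt, af in calls if hwe_genotype_probs(af)[gt] <= threshold)
    return low / len(calls)


if __name__ == "__main__":
    # Toy data: one call is homozygous ALT at a site where ALT is rare in the panel,
    # which is what a REF/ALT swap or strand flip tends to produce.
    toy_calls = [("AA", 0.05), ("RR", 0.05), ("RA", 0.5), ("AA", 0.9)]
    print(low_prob_fraction(toy_calls))
```

A genuinely divergent population shifts this fraction somewhat, but a value as far above the threshold as the one in your e-mail is more typical of systematically wrong alleles than of population differences.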

Could your read accuracy be low or your alignments bad?

Once you find the problem, we can suggest a solution.
