Question: IMPUTE2: the p-value of SNPs in the output gen file is different from the original SNP in plink.assoc
10.6 years ago
m338102001 ▴ 10

Hi,

I use SHAPEIT to do the phasing and then IMPUTE2 to generate the gen file.

But after running the association test on the gen file and getting an assoc.dosage file, I found that the p-value of an original SNP (not an imputed SNP) is different from the one I got before phasing and imputation.

My original PLINK file is on hg19.

Here are my command lines:

shapeit:

./shapeit --input-bed mygwas.chr1.bed mygwas.chr1.bim mygwas.chr1.fam \
    --input-map genetic_map_chr1_combined_b37.txt \
    --output-max mygwas.phased.haps mygwas.phased.sample

impute2:

./impute2 -use_prephased_g \
    -m genetic_map_chr_combined_b37.txt \
    -h ALL_1000G_phase1integrated_v3_chr1_impute.hap \
    -l ALL_1000G_phase1integrated_v3_chr1_impute.legend \
    -known_haps_g mygwas.phased.haps \
    -int xxx xxx \
    -Ne xxx \
    -o mygwas.gen \
    -phase

./plink --dosage mygwas.gen format=3 dose=1 skip0=1 skip1=1 noheader --fam mygwas.fam --assoc --out mygwas.imputed

I think the p-value of an original SNP in the mygwas.chr1.bed/bim/fam files should be the same as in the output mygwas.imputed.assoc.dosage, but the results show that they are different, and this really confuses me.

Could anyone tell me how I should fix this problem? Thanks a lot.

software-error snp SNP • 4.4k views
Thank you, karl.stamm.

I used the --assoc command in PLINK for the original association test. After imputation, I used the following command to perform the association test:

./plink --dosage mygwas.gen format=3 dose=1 skip0=1 skip1=1 noheader --fam mygwas.fam --assoc --out mygwas.imputed

For example, many SNPs had a p-value of 1e-06, but after imputation they only reach 1e-04. (Sorry, I don't know how to use a URL to upload the picture.)

Is it possible that SHAPEIT automatically imputes the missing genotypes in my data? If this really is the problem, what can I do to prevent it?

IMPUTE2 can impute missing data. Check if the p-value of SNPs with a call rate of 100% is still the same.
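
One way to run that check is sketched below in Python. It assumes you have a per-SNP missingness report for the original data (for example the .lmiss file written by plink --missing, with SNP and N_MISS columns) plus the two result tables, each with SNP and P columns. The function names and the toy tables are made up for illustration; they are not real results.

```python
def parse_table(text):
    """Parse whitespace-delimited text with a header row into a list of dicts."""
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]


def fully_called_snps(lmiss_text):
    """Return the IDs of SNPs with no missing genotypes (N_MISS == 0)."""
    return {r["SNP"] for r in parse_table(lmiss_text) if int(r["N_MISS"]) == 0}


def compare_pvalues(before_text, after_text, snps):
    """Pair up the P column of two result tables for the requested SNP IDs."""
    before = {r["SNP"]: float(r["P"]) for r in parse_table(before_text)}
    after = {r["SNP"]: float(r["P"]) for r in parse_table(after_text)}
    return {s: (before[s], after[s])
            for s in sorted(snps) if s in before and s in after}


# Toy inputs (hypothetical values, only to show the expected layouts).
LMISS = """
CHR SNP N_MISS N_GENO F_MISS
1   rs1 0      100    0
1   rs2 5      100    0.05
"""
ASSOC_BEFORE = """
CHR SNP BP   A1 F_A  F_U  A2 CHISQ P     OR
1   rs1 1000 A  0.30 0.20 G  2.7   0.10  1.7
1   rs2 2000 C  0.40 0.20 T  9.5   0.002 2.7
"""
DOSAGE_AFTER = """
SNP A1 A2 FRQ  INFO OR  SE  P
rs1 A  G  0.25 1.0  1.7 0.3 0.11
rs2 C  T  0.31 0.98 2.1 0.3 0.01
"""

complete = fully_called_snps(LMISS)
pairs = compare_pvalues(ASSOC_BEFORE, DOSAGE_AFTER, complete)
for snp, (p_before, p_after) in pairs.items():
    print(snp, p_before, p_after)
```

If the p-values still disagree for SNPs with a 100% call rate, filled-in missing genotypes cannot be the whole explanation, and the difference between the two tests themselves is the next thing to look at.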

Thank you, now I know the reason for the changing p-value.

So if I don't want IMPUTE2 to fill in the missing genotypes automatically, what can I do? I didn't see any option for this on the IMPUTE2 web page.

You could replace the post-imputation genotypes with the originals. Use PLINK to remove (--exclude) the typed SNPs from your post-imputation file and add your old genotypes back with --merge.
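
As a rough sketch of the "remove the typed SNPs" half of that suggestion, the Python below collects the SNP IDs from the original .bim file and drops the matching lines from the IMPUTE2 .gen output. It assumes the usual layouts (SNP ID in column 2 of the .bim, rsID in column 2 of the .gen) and that the IDs match between the two files; the function names and the toy data are hypothetical. The same ID list could instead be saved to a file and passed to --exclude.

```python
def typed_snp_ids(bim_text):
    """Return SNP IDs (column 2) from the text of a PLINK .bim file."""
    return [ln.split()[1] for ln in bim_text.strip().splitlines() if ln.strip()]


def drop_typed_from_gen(gen_lines, typed_ids):
    """Keep only .gen lines whose rsID (column 2) is not a typed SNP."""
    typed = set(typed_ids)
    return [ln for ln in gen_lines if ln.split()[1] not in typed]


# Toy inputs (hypothetical): two typed SNPs and one imputed-only SNP.
BIM = """
1 rs1 0 1000 A G
1 rs2 0 2000 C T
"""
GEN = [
    "1 rs1 1000 A G 1 0 0 0 1 0",
    "--- rs9 1500 C T 0.9 0.1 0 0.2 0.8 0",
    "1 rs2 2000 C T 0 0 1 0 1 0",
]

typed = typed_snp_ids(BIM)
imputed_only = drop_typed_from_gen(GEN, typed)
print("\n".join(imputed_only))
```

Merging the original genotypes back in is a separate step that is not shown here.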

thank you Maxime Lamontangne ^o^

10.6 years ago

See here for a LOD Score plot along chromosome 1.

http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1000260

I tried to link to figure 4, but please scroll to figure 4 or 5 for an example of a likelihood score plotted against genomic location.

Wouldn't you expect the statistical odds on each SNP to shift when you add more markers?

Dear karl.stamm,

Thank you for your reply. But the p-value of a SNP is calculated from the A/T proportion in the samples. Let's say that before imputation I have SNP A in the original data, and after imputation I add SNPs B, C, D and E. The A/T proportion of SNP A should not have changed, should it?

Besides, may I ask an additional question: if the p-value of SNP A has changed, what is the possible reason for that?

First, the A/T proportion may have changed, so you need to double-check that; it depends on the imputation and calling algorithms. I suppose it is very unlikely to have changed, but it is possible. What really does change is the knowledge of genetic structure. I don't know what test is computing your p-value, and it sounds like you don't either. If this was a single independent SNP test, then it is directly related to the A/T proportion, and no other markers should matter. The reason I linked the figures of a LOD score plot is that most good association tests will make use of local structure and be influenced by nearby markers. You've added information to the system, so of course a test's result should change. Make a plot of the before and the after and please post them for us to see.
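
To make the single-SNP case concrete, here is a minimal Python sketch of a 1-df allelic chi-square test on a 2x2 table of allele counts. It is a textbook Pearson test written for illustration, not a reimplementation of whatever PLINK computes, and the allele counts are invented. It shows that filling in a few missing genotypes is enough to move the p-value of a typed SNP even when no other marker is involved.

```python
import math


def allelic_chisq_p(case_a1, case_a2, ctrl_a1, ctrl_a2):
    """Pearson chi-square (1 df) on a 2x2 table of allele counts.

    Returns (chi-square statistic, p-value). Assumes every row and
    column total is non-zero.
    """
    table = [[case_a1, case_a2], [ctrl_a1, ctrl_a2]]
    row = [sum(r) for r in table]
    col = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row)
    chisq = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            chisq += (table[i][j] - expected) ** 2 / expected
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chisq / 2.0))
    return chisq, p


# Hypothetical counts: 100 cases and 100 controls, with 5 cases
# (10 alleles) missing before imputation and filled in afterwards.
chisq_before, p_before = allelic_chisq_p(60, 130, 40, 160)
chisq_after, p_after = allelic_chisq_p(62, 138, 40, 160)
print(chisq_before, p_before)
print(chisq_after, p_after)
```

With these made-up counts the statistic gets smaller once the missing alleles are filled in, so the p-value becomes weaker. A dosage-based test is also a different test from the allelic one, which can shift the p-value further.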
