Ways to convert .impute2 to .mldose
3
4
10.5 years ago

Does anyone know of any software package that can convert .impute2 data to .mldose (i.e. imputed data from IMPUTE2 to imputed data from MACH)? I have tried impute2mach in GenABEL, but it has repeatedly failed with a known error...

Direct conversion preferred, but I'm happy with indirect as long as it works!

SNP Imputation • 9.3k views
0

I have the same problem. Could you please tell me how to write the R script to make the dosage file?

6
10.5 years ago
zx8754 12k

IMPUTE2 outputs posterior genotype probabilities, three columns per SNP for each individual, e.g. 0.1 0.1 0.8, corresponding to AA, AB and BB. Here the genotype is most likely BB.

MACH outputs a dosage, one column per SNP for each individual, e.g. 1.7 on a scale where 0=AA, 1=AB and 2=BB, so 1.7 is closest to BB.

Depending on the size of the data, you can write a quick script in R that would convert posterior probabilities to dosage:

0.1*0 + 0.1*1 + 0.8*2 = 1.7
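
As a minimal sketch of the calculation above (in Python rather than R), the snippet below converts one line of IMPUTE2 genotype output into dosages. It assumes the usual IMPUTE2 layout of five leading columns (SNP ID, rsID, position, allele A, allele B) followed by three probabilities per individual; the function names and the example line are made up for illustration. It counts the B allele, following the convention in this answer, so check which allele your downstream tool expects.

```python
def dosage(p_aa, p_ab, p_bb):
    """Expected B-allele count: 0*P(AA) + 1*P(AB) + 2*P(BB)."""
    return 0.0 * p_aa + 1.0 * p_ab + 2.0 * p_bb


def gen_line_to_dosages(line, n_leading=5):
    """Split one IMPUTE2 line into (leading columns, per-individual dosages).

    n_leading=5 is an assumption about the file layout:
    snp_id, rs_id, position, allele A, allele B.
    """
    fields = line.split()
    leading = fields[:n_leading]
    probs = [float(x) for x in fields[n_leading:]]
    if len(probs) % 3 != 0:
        raise ValueError("expected three probabilities per individual")
    doses = [dosage(*probs[i:i + 3]) for i in range(0, len(probs), 3)]
    return leading, doses


# Hypothetical line with two individuals.
example = "--- rs1 1000 A G 0.1 0.1 0.8 1 0 0"
leading, doses = gen_line_to_dosages(example)
print(leading[1], [round(d, 3) for d in doses])
```

Note that IMPUTE2 files hold one SNP per row, whereas an .mldose file is organised with one individual per row, so the per-SNP dosages would still need to be transposed before writing the final file.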
1

Thanks - that's useful, I'll give it a try!

5
10.4 years ago

I have written a brief cookbook to perform this conversion in UNIX:

http://openwetware.org/wiki/User:Jonathan_R._I._Coleman/Notebook/Notes_and_Protocols/2014/06/27

1

The awk code for this has now been cleaned to make it more efficient (credit: Tommy Carstensen).

Alternatively, the University of Washington has an R package, GWASTools, for post-IMPUTE2 conversions: http://www.bioconductor.org/packages/release/bioc/manuals/GWASTools/man/GWASTools.pdf

1

How does your code deal with "0 0 0", i.e. a no-call? Does it convert it to the reference homozygote (since 0*0 + 0*1 + 0*2 = 0) or to NA?

0

At the time you accessed it, my code converted 0 0 0 to a dosage of 0, which is wrong! I think I assumed a no-call would default to 0.33 0.33 0.33, but it doesn't. This is easily fixable (set 0 0 0 to NA) and I will implement a patch. Thanks for pointing this out!
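
A minimal sketch of that fix in Python (not the patched awk code itself): a probability triplet that sums to zero is treated as a no-call and reported as missing instead of as dosage 0. The function names, the tolerance and the "NA" string are illustrative assumptions; whether a downstream tool accepts "NA" in a dosage file would need checking.

```python
def dosage_or_missing(p_aa, p_ab, p_bb, tol=1e-9):
    """Return the expected B-allele count, or None for a no-call.

    A triplet such as "0 0 0" carries no genotype information, so it is
    reported as missing rather than silently becoming a dosage of 0.
    """
    if p_aa + p_ab + p_bb <= tol:
        return None
    return p_ab + 2.0 * p_bb


def format_dosage(value, missing="NA"):
    """Render a dosage with three decimals, or the missing-value code."""
    return missing if value is None else f"{value:.3f}"


triplets = [(0.1, 0.1, 0.8), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(" ".join(format_dosage(dosage_or_missing(*t)) for t in triplets))
```

The third triplet shows why the distinction matters: a confident AA call also has dosage 0, so without the check it would be indistinguishable from a no-call.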

0

Thanks for the script for the mldose format. Can you please tell me how to get the mlinfo file, i.e. how to convert impute2_info into mlinfo format? Thanks! Best Wishes, Meraj

1

This is less straightforward - the information metrics produced by these programs are not totally equivalent, although highly correlated (see Marchini and Howie, 2010 www.nature.com/nrg/journal/v11/n7/extref/nrg2796-s3.pdf). The files themselves are also structured differently:

mlinfo:

SNP     Al1     Al2     Freq1   MAF     Quality Rsq
rs11089130      C       G       0.3362  0.3362  0.4776  0.0160

.impute2_info

snp_id rs_id position exp_freq_a1 info certainty type info_type0 concord_type0 r2_type0
--- rs9628072 50000058 0.033 0.626 0.969 0 -1 -1 -1

It would be possible to convert these with a bit of additional information (the alleles of each variant, which should be obtainable from the reference panel used for imputation). However, I'm not sure the conversion is necessary: you could simply filter on the impute2 info metric to obtain a list of SNPs to retain for analysis.
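
The filtering suggested above could be sketched in Python as follows. It assumes a whitespace-delimited .impute2_info file with the header shown earlier in this reply; the function name and the thresholds are hypothetical choices for illustration, not recommendations.

```python
import io


def snps_passing_info(handle, min_info=0.8):
    """Return rs_ids whose IMPUTE2 'info' value is at least min_info.

    Column positions are looked up by name from the header line, so the
    function does not depend on the exact column order.
    """
    header = handle.readline().split()
    rs_col = header.index("rs_id")
    info_col = header.index("info")
    keep = []
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if float(fields[info_col]) >= min_info:
            keep.append(fields[rs_col])
    return keep


# The example record quoted above, used as in-memory test data.
SAMPLE = (
    "snp_id rs_id position exp_freq_a1 info certainty type "
    "info_type0 concord_type0 r2_type0\n"
    "--- rs9628072 50000058 0.033 0.626 0.969 0 -1 -1 -1\n"
)
print(snps_passing_info(io.StringIO(SAMPLE), min_info=0.5))
```

With a real file, the same function can be called on an open file handle, and the resulting list used to subset the dosage data.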

2
10.5 years ago
Joey ▴ 430

You can try a program called fcgene. Look into Chapter 7 of the manual which accompanies the tool.

Thanks,

-Joey

0

Thanks Joey - I've already looked into this programme. Unfortunately, I think it only converts impute2 files to MACH input files (.dat and .ped), whereas I am looking for the output files.
