Post-imputation steps (cleaning and converting to plink)
4.4 years ago
kl ▴ 10

Hi,

I have just finished imputing my genetic data on the Michigan Imputation Server (filtered at r2 = 0.3) and I want to convert the output to plink format. What are the next steps, such as cleaning and converting to plink? Does anyone have useful scripts to do this across all chromosomes?

Any advice would be much appreciated!

Thanks
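
For the "all chromosomes" part of the question, one possible sketch is a loop that builds one plink conversion command per autosome. The plink flags are the ones used later in this thread; the file naming (chr<N>.dose.vcf.gz) and the helper names plink_cmd and all_plink_cmds are assumptions for illustration, so adjust them to match the downloaded files.

```shell
# Sketch: build one plink conversion command per autosome.
# ASSUMPTION: imputed files are named chr<N>.dose.vcf.gz in the current directory.

# Print the plink command for a single chromosome (hypothetical helper).
plink_cmd() {
    printf 'plink --vcf chr%s.dose.vcf.gz --keep-allele-order --double-id --make-bed --out chr%s\n' "$1" "$1"
}

# Print the commands for chromosomes 1 to 22, one per line.
all_plink_cmds() {
    chr=1
    while [ "$chr" -le 22 ]; do
        plink_cmd "$chr"
        chr=$((chr + 1))
    done
}

# Dry run: only prints the commands so they can be checked first.
all_plink_cmds
```

Once the printed commands look right, they could be run with `all_plink_cmds | sh`.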

SNP MICHIGAN IMPUTATION SERVER • 3.3k views

Hmmm.. I've found some commands to annotate and convert to plink. However, the output is definitely not a correct plink .bim file! I also tried converting to plink without annotating (in case that caused it), but the same thing happens. The second column is chr:pos:A1:A2;snp (with annotation) or chr:pos:A1:A2 (without the annotation step). Do you have any idea why this may happen?

plink --vcf chr22_rs.vcf --keep-allele-order --double-id --make-bed --out test

22      22:16053843:G:A;rs181029838     0       16053843        A       G
22      22:16054839:G:A;rs73877820      0       16054839        A       G
22      22:16055207:C:T;rs7291810       0       16055207        T       C
22      22:16055230:T:C;rs140593956     0       16055230        C       T
22      22:16055965:G:T;rs587706951     0       16055965        T       G

Thanks
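
A note on the output above: plink copies the VCF ID column into the second .bim column, so IDs of the form chr:pos:ref:alt come straight from the imputed VCF. If plain rsIDs are wanted, one possible post-processing sketch is shown here. The helper name fix_bim_ids is hypothetical, and it assumes the annotated IDs look exactly like the ones printed above (chr:pos:ref:alt;rsID). IDs without an rs suffix are left unchanged.

```shell
# Sketch: keep only the rsID when a .bim variant ID has the form
# "chr:pos:ref:alt;rsID"; any other ID is left as it is.
# ASSUMPTION: the ID is in column 2 and the rsID follows a single ";".
fix_bim_ids() {
    awk 'BEGIN { OFS = "\t" }
         {
             n = split($2, parts, ";")
             if (n == 2 && parts[2] ~ /^rs[0-9]+$/) $2 = parts[2]
             $1 = $1   # rebuild the line so the output is always tab-delimited
             print
         }' "$@"
}

# Example on two lines: one annotated ID, one unannotated ID.
printf '22\t22:16053843:G:A;rs181029838\t0\t16053843\tA\tG\n22\t22:16054839:G:A\t0\t16054839\tA\tG\n' | fix_bim_ids
```

To apply it to a real file, write to a new file (for example `fix_bim_ids test.bim > test.fixed.bim`) and check it before replacing the original, since the .bim, .bed and .fam files must stay in the same variant order.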


The files downloaded from the Michigan server are v4.1 using minimac4.


Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. SUBMIT ANSWER is for new answers to original question.


I think I've tried that before but couldn't get it to work, as it needs a C++ library which I couldn't seem to install. I'll try to figure it out if it's the only way to convert! Thanks for the advice.


Please post the error about the C library, if possible. Generally, working with these genetics programs is a tiring affair.


Yes, I'm not tech-savvy, so it makes it even more tiring! So, I've cloned it into my directory using Git, and the next step is installing cget. I do this in the DosageConvertor directory (not sure if this is what is needed, as it is unclear):

pip install cget

Requirement already satisfied: cget in /home/kl/.local/lib/python3.7/site-packages (0.1.9)
Requirement already satisfied: six>=1.10 in /cm/shared/languages/anaconda3-2019.10/lib/python3.7/site-packages (from cget) (1.12.0)
Requirement already satisfied: click>=6.6 in /cm/shared/languages/anaconda3-2019.10/lib/python3.7/site-packages (from cget) (7.0)

Then I type bash install.sh and I get the following message:

/cm/shared/languages/anaconda3-2019.10/bin:/cm/shared/apps/Python-Meep-1.3/bin:/cm/shared/apps/openmpi/gcc/64/1.6.5/bin:/cm/shared/languages/Python-2.7.6/bin:/cm/shared/languages/GCC-4.8.4/bin:/cm/shared/tools/git-2.22.0/bin:/cm/shared/languages/GCC-7.1.0/bin:/cm/shared/apps/moab/7.2.9/sbin:/cm/shared/apps/moab/7.2.9/bin:/cm/shared/languages/GCC-6.1/bin:/cm/shared/languages/R-3.6.2/bin:/cm/shared/apps/Java-JDK-11.0.3/jdk-11.0.3/bin:/cm/shared/languages/GCC-9.1.0/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/sbin:/usr/sbin:/opt/dell/srvadmin/bin:/cm/shared/apps/Plink2:/cm/shared/apps/Bolt-LMM-2.3/BOLT-LMM_v2.3:/cm/shared/apps/bcftools-1.8/bcftools:/cm/shared/apps/bcftools-1.8/htslib-1.8:/cm/shared/apps/Tabix-0.2.6/tabix-0.2.6:/sbin:/usr/sbin:.:/cm/shared/apps/torque/4.2.4.1/bin:/cm/shared/apps/torque/4.2.4.1/sbin:/cm/shared/apps/hdf5/1.6.10/bin:/cm/shared/libraries/gnu_builds/gsl-1.16/bin)
Error: cget not installed. Please run 'pip install --user cget'

So, I am not sure what I've done wrong!

Thanks for your help!
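
One possible cause, not confirmed anywhere in this thread: pip reports cget under /home/kl/.local, but the PATH printed above contains no /home/kl/.local/bin entry, so the cget executable may simply not be found. A minimal sketch for checking this follows. The helper name path_has_dir is hypothetical, and it assumes that `pip install --user` placed the cget script in $HOME/.local/bin, which is the usual location on Linux.

```shell
# Return 0 if directory $2 is one of the colon-separated entries of $1.
path_has_dir() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# ASSUMPTION: `pip install --user` put the cget script in this directory.
user_bin="$HOME/.local/bin"

if path_has_dir "$PATH" "$user_bin"; then
    echo "$user_bin is already on PATH"
else
    echo "$user_bin is not on PATH; adding it for this session"
    PATH="$user_bin:$PATH"
    export PATH
fi
```

If the directory had to be added, running `bash install.sh` again from the same shell session would show whether this was the problem.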


Hmm, I wonder, could you try to do all of this in a dedicated conda environment? This usually overcomes issues like this. To install cget, it would then be: https://anaconda.org/compbiocore/cget

4.4 years ago

Surely Michigan Imputation Server has information on this? In which format do they produce data?

I have an entire pre-phasing and imputation workflow here (but not via Michigan): C: Phasing with SHAPEIT

Kevin


Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` becomes text), or select a chunk of text and use the highlighted button to format it as a code block. I've done it for you this time.

Hi, there is information here about how to do it: https://genome.sph.umich.edu/wiki/DosageConvertor#Convert_to_PLINK_Files
