Convert Plink's ped file into compound genotype format and 0/1/2 format
8.6 years ago
PAn ▴ 20

Hello everyone,

I have a regular PLINK .ped file and I need to convert it to SNP-by-sample format. I did that using the .tped format:

1 snp1 0 5000650 A A A C C C A C C C C C
1 snp2 0 5000830 G T G T G G T T G T T T

Is there any way I can convert it into: 1) compound genotype format

1 snp1 0 5000650 AA AC CC AC CC CC
1 snp2 0 5000830 GT GT GG TT GT TT

2) a single-coded (0/1/2) format

1 snp1 0 5000650 0 1 2 1 2 2
1 snp2 0 5000830 1 1 2 0 1 0

Thanks a lot!

plink • 6.2k views

Add the --recodeA option to your command line.

http://pngu.mgh.harvard.edu/~purcell/plink/dataman.shtml

Thanks a lot, that works! I tried --recodeAD and didn't quite understand the results, I guess. Edit: I am getting the .raw file in sample-by-SNP format:

FID IID PAT MAT SEX Phenotype SNP1 SNP2
Sample1
Sample2

Is there any way to get it in SNP-by-sample format? Thanks! I can try to transpose it in bash, but the file is too big, and it would be nice to know if there is a built-in function.

How many samples and SNPs do you have?

It's around 1600 samples and 23 million SNPs. I am trying awk to transpose, but the file is huge, so it's taking forever!
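
Since the .tped file is already in SNP-by-sample orientation, one way to sidestep the transpose is to compute the 0/1/2 coding straight from it, one line at a time. This is only a rough Python sketch, not a PLINK feature: the function names are made up for illustration, and it assumes "0" marks a missing allele, writes "NA" for missing genotypes, and counts the less frequent allele unless `counted_allele` is given.

```python
from collections import Counter

MISSING = "0"  # assumption: PLINK's default missing-allele code


def tped_line_to_dosage(line, counted_allele=None):
    """Turn one .tped line into its 4 map columns plus one 0/1/2 code per sample."""
    fields = line.split()
    meta, alleles = fields[:4], fields[4:]
    if len(alleles) % 2:
        raise ValueError("expected two allele columns per sample")
    if counted_allele is None:
        # Assumption: count the less frequent allele on this line
        # (ties are broken alphabetically so the result is deterministic).
        counts = Counter(a for a in alleles if a != MISSING)
        counted_allele = (
            min(counts, key=lambda a: (counts[a], a)) if counts else MISSING
        )
    coded = []
    # Consecutive allele columns belong to the same sample.
    for a1, a2 in zip(alleles[0::2], alleles[1::2]):
        if MISSING in (a1, a2):
            coded.append("NA")
        else:
            coded.append(str((a1 == counted_allele) + (a2 == counted_allele)))
    return " ".join(meta + coded)


def convert_tped(in_path, out_path):
    """Stream a .tped file line by line so memory use stays small."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.strip():
                dst.write(tped_line_to_dosage(line) + "\n")
```

Because each SNP is handled independently, the file never has to be held in memory, which matters with millions of SNPs. Note that which allele gets counted is a per-line choice here; pass `counted_allele` explicitly if you need a fixed reference.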

First, extract each line into its own file:

sed '1q;d' FILE.txt > 1.txt
sed '2q;d' FILE.txt > 2.txt
sed '3q;d' FILE.txt > 3.txt
etc.

Then turn each line into a column:

tr ' ' '\n' < 1.txt > 1_t.txt
tr ' ' '\n' < 2.txt > 2_t.txt
tr ' ' '\n' < 3.txt > 3_t.txt
etc.

Or do both steps at once for each line:

sed '1q;d' FILE.txt | tr ' ' '\n' > 1_t.txt
etc.

And then finally paste the columns together:

paste 1_t.txt 2_t.txt 3_t.txt etc > final_file
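
For a file that fits in memory, the same row-to-column idea can be sketched in a few lines of Python. This is a minimal illustration, not something PLINK provides; `transpose_table` is a hypothetical helper name, and it assumes whitespace-separated fields with the same number of columns on every line. It loads the whole table at once, so it is unlikely to be practical for 1600 samples by 23 million SNPs.

```python
def transpose_table(in_path, out_path):
    """Write the transpose of a whitespace-delimited table, tab-separated."""
    with open(in_path) as src:
        rows = [line.split() for line in src if line.strip()]
    # zip() would silently truncate ragged rows, so check the widths first.
    if len({len(row) for row in rows}) > 1:
        raise ValueError("all lines must have the same number of fields")
    with open(out_path, "w") as dst:
        # Each input column becomes one output line.
        for column in zip(*rows):
            dst.write("\t".join(column) + "\n")
```

The output is tab-separated, matching what paste produces by default.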

7.4 years ago
grum.gebre ▴ 10

You can use PLINK's compound-genotypes modifier together with the --recode flag to convert the data to compound genotype format:

plink --file <input_prefix> --recode compound-genotypes --out <output_prefix>
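
If you want the compound genotypes in the SNP-by-sample layout shown in the question, a small script over the .tped file may also do. The following is only a sketch under that assumption; `tped_line_to_compound` is an illustrative name, not part of PLINK, and it simply joins each sample's two allele columns.

```python
def tped_line_to_compound(line):
    """Join each sample's two alleles on a .tped line into one genotype token."""
    fields = line.split()
    meta, alleles = fields[:4], fields[4:]
    if len(alleles) % 2:
        raise ValueError("expected two allele columns per sample")
    # Columns 5-6 belong to sample 1, columns 7-8 to sample 2, and so on.
    genotypes = [a1 + a2 for a1, a2 in zip(alleles[0::2], alleles[1::2])]
    return " ".join(meta + genotypes)


if __name__ == "__main__":
    print(tped_line_to_compound("1 snp1 0 5000650 A A A C C C A C C C C C"))
    # prints: 1 snp1 0 5000650 AA AC CC AC CC CC
```

Applied line by line, this keeps memory use flat regardless of the number of SNPs.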
