Convert Genotype SNP Matrix Into Plink Format, allelic form
6.7 years ago
Kian ▴ 50

I have a genotype matrix (nearly 3,000 animals, with 50,000 SNPs in columns). It is coded as 0/1/2 or NA. I want to convert it into PLINK's allelic format, for example 0 to 0 0, 1 to 1 1, and 2 to 2 2, so that I can use PLINK for quality control of my data. What's the best way to do this in R?

SNP plink format allelic R • 4.3k views

What data format do you have: dosage, ped, or something else?


Thanks for the response. I have a ped file, but with the codes 0, 1 and 2. PLINK needs the genotypes in allelic format, so, for example, 0 should be 0 0, 1 should be 1 1, and 2 should be 2 2. I don't have access to the allelic format, and my question is how I can prepare this ped file.

6.7 years ago

Why should 1 be 11 and 2 be 22? You currently have the data in 012 format, which relates to:

  • 0 (zero) minor alleles (ref)
  • 1 minor allele (het)
  • 2 minor alleles (hom)
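
To make the 012 coding above concrete, here is a minimal Python sketch that counts copies of the minor allele in a two-allele genotype. The function name and the example SNP (major allele A, minor allele G) are hypothetical and only serve as illustration; this is not PLINK's own code.

```python
def minor_allele_count(allele_pair, minor_allele):
    """Return how many of the two alleles equal the minor allele (0, 1 or 2)."""
    first, second = allele_pair
    return int(first == minor_allele) + int(second == minor_allele)


# Hypothetical SNP: major allele A, minor allele G
print(minor_allele_count(("A", "A"), "G"))  # 0: no minor alleles (ref)
print(minor_allele_count(("A", "G"), "G"))  # 1: one minor allele (het)
print(minor_allele_count(("G", "G"), "G"))  # 2: two minor alleles (hom)
```

Note that the count only makes sense once you know which allele is the minor one for that SNP.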

Data in 012 format is produced by recoding with the 012 flag, i.e., within PLINK itself. So, where did you get the file from? You (or the source from which you got it) should already have the data in the format that you require.

In PLINK PED format, genotypes can be encoded numerically or as characters, as follows:

  • A=1
  • C=2
  • G=3
  • T=4

-----------------------------

So, as you can see, in order to connect the 012 format to the original PED format, you need mapping information in order to understand which allele (ACGT or 1234) was the minor allele and which was the major. Without that mapping, you cannot convert back. You need that extra information.
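
As a rough illustration of why the mapping is needed, the sketch below converts 0/1/2 codes back into PED-style allele pairs, assuming you do have a per-SNP table of major and minor alleles. The SNP names, the allele map and the function names are hypothetical; the six leading columns (family ID, individual ID, paternal ID, maternal ID, sex, phenotype) and the `0 0` missing genotype follow the standard PED layout, with placeholder values filled in for fields the 012 file does not contain.

```python
MISSING = ("0", "0")  # default missing-genotype code in a PED file


def dosage_to_alleles(dosage, major, minor):
    """Map a 0/1/2 minor-allele count to a pair of alleles; None means missing."""
    if dosage is None:
        return MISSING
    table = {0: (major, major), 1: (major, minor), 2: (minor, minor)}
    return table[dosage]


def ped_line(sample_id, dosages, snp_ids, allele_map):
    """Build one PED line; family ID, parents, sex and phenotype are placeholders."""
    fields = [sample_id, sample_id, "0", "0", "0", "-9"]
    for snp, dosage in zip(snp_ids, dosages):
        major, minor = allele_map[snp]
        fields.extend(dosage_to_alleles(dosage, major, minor))
    return " ".join(fields)


# Hypothetical mapping: SNP name -> (major allele, minor allele)
allele_map = {"snp1": ("A", "G"), "snp2": ("C", "T")}
print(ped_line("0200s1", [1, 2], ["snp1", "snp2"], allele_map))
# 0200s1 0200s1 0 0 0 -9 A G T T
```

Without `allele_map` there is nothing to put in place of `major` and `minor`, which is exactly the information a bare 012 matrix lacks.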

...of course, as I have already mentioned, 012 format is produced from PED (or BED) in Plink itself. So, either you or your source has the original file that you need.

Kevin


Thanks, dear Kevin, for the response. This is an example of the file that I have; I think the markers should be converted to the allelic format required by PLINK.

             id rs147433 rs146888 rs146888 rs146888 rs146887
    1    0200s1       -9        1        2        1        2
    2 0200s1005       -9        0        1        2        2
    3 0200s1021       -9        1        1        1        0
    4 0200s1028       -9        0        1        1        2
    5  0200s103       -9        0        1        1        1