Hi everyone,
I have a binary PLINK fileset in which alleles are coded 1/2 and failed calls are denoted by 0. The 1/2 format is based on the A/B format, where 1 = A and 2 = B. I want to recode the alleles as ACGT. Ideally I would like to do this with PLINK itself, since editing only the .bim file by hand is not advised.
As far as I understand, none of the PLINK --recode options will work for this problem.
In general, the 1/2 allele codes can be converted to ACGT allele codes as follows:
SNP [T/G]: 1 = T, 2 = G;
SNP [A/G]: 1 = A, 2 = G;
SNP [T/C]: 1 = T, 2 = C;
SNP [A/C]: 1 = A, 2 = C.
However, A/T and C/G SNPs do not follow this convention. For these SNPs the strand must be known, and the conversion is as follows:
SNP [A/T]: 1 = A, 2 = T, when strand +,
SNP [A/T]: 1 = T, 2 = A, when strand -,
SNP [C/G]: 1 = C, 2 = G, when strand +,
SNP [C/G]: 1 = G, 2 = C, when strand -.
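
To make the rules above concrete, here is a minimal Python sketch of the mapping. The function names (ab_to_acgt, update_alleles_line) and the example SNP ID are made up for illustration, and it assumes the two alleles and the strand of each SNP are available from an array annotation file. The second helper writes one line in the five-column layout read by PLINK's --update-alleles option (variant ID, old allele 1, old allele 2, new allele 1, new allele 2); whether that option is the right tool for this fileset is an assumption on my part, not something I have verified.

```python
# Fixed 1/2 assignments for the four strand-unambiguous SNP types.
UNAMBIGUOUS = {
    frozenset("TG"): ("T", "G"),
    frozenset("AG"): ("A", "G"),
    frozenset("TC"): ("T", "C"),
    frozenset("AC"): ("A", "C"),
}

# Plus-strand assignments for the strand-ambiguous SNP types;
# on the minus strand the two codes are swapped.
AMBIGUOUS_PLUS = {
    frozenset("AT"): ("A", "T"),
    frozenset("CG"): ("C", "G"),
}


def ab_to_acgt(allele_x, allele_y, strand=None):
    """Return (base for code 1, base for code 2) for one SNP.

    allele_x / allele_y are the two ACGT alleles in any order;
    strand ("+" or "-") is only needed for A/T and C/G SNPs.
    """
    key = frozenset((allele_x.upper(), allele_y.upper()))
    if key in UNAMBIGUOUS:
        return UNAMBIGUOUS[key]
    if key in AMBIGUOUS_PLUS:
        if strand not in ("+", "-"):
            raise ValueError("strand must be '+' or '-' for A/T and C/G SNPs")
        one, two = AMBIGUOUS_PLUS[key]
        return (one, two) if strand == "+" else (two, one)
    raise ValueError(f"not a valid biallelic SNP: {allele_x}/{allele_y}")


def update_alleles_line(snp_id, allele_x, allele_y, strand=None):
    """Build one tab-separated line: ID, old 1, old 2, new 1, new 2."""
    one, two = ab_to_acgt(allele_x, allele_y, strand)
    return f"{snp_id}\t1\t2\t{one}\t{two}"


if __name__ == "__main__":
    # Hypothetical SNPs, for illustration only.
    print(update_alleles_line("snp_example_1", "G", "T"))
    print(update_alleles_line("snp_example_2", "A", "T", strand="-"))
```

Writing one such line per SNP to a text file would give a file that could be passed to PLINK together with --bfile and --make-bed, so that the recoding is done by PLINK rather than by hand-editing the .bim file.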