8.5 years ago
userT
Hello everyone,
I have data in the format shown below, where ?_? denotes a missing genotype. Is there an R package that can convert this format to PLINK (.map and .ped)?
Sample ,SNP1, SNP2, SNP3, SNP4
H1 , A_B, ?_?, B_B, A_A
H2 , A_B, ?_?, A_B, A_A
H3 , A_B, ?_?, B_B, ?_?
From a file like this it is not really possible to make a proper map and ped since you are missing information for the map file:
map file: 1 rs123456 0 1234555 = chromosome - SNP name - 0 - bp position
ped file: H1 H2 0 0 1 1 = Family ID - Individual ID - Paternal ID - Maternal ID - Sex (1=male; 2=female; other=unknown) - Phenotype
But to make the best of it, I would save the data as a plain-text file and recode the genotypes with sed:
sed -i -e 's/?_?/0 0/g' filename
sed -i -e 's/A_B/A G/g' filename
sed -i -e 's/B_B/G G/g' filename
sed -i -e 's/A_A/A A/g' filename
Then manually adjust the first few columns of the ped file and create the map file by hand.
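If the manual steps become tedious, the whole conversion could be scripted. The following is only a minimal Python sketch, not a tested pipeline: the function name `convert` is made up for illustration, the A -> A / B -> G recoding is the same arbitrary choice as in the sed commands, and every value the input file does not provide (chromosome, positions, family ID, parents, sex, phenotype) is filled with a placeholder that you would need to replace with real information.

```python
import csv
import io

# Arbitrary allele recoding, as in the sed commands: A -> A, B -> G, ? -> 0 (missing).
ALLELE_CODES = {"A": "A", "B": "G", "?": "0"}


def convert(csv_text):
    """Return (ped_lines, map_lines) for the comma-separated genotype table."""
    rows = [[cell.strip() for cell in row]
            for row in csv.reader(io.StringIO(csv_text)) if row]
    header, samples = rows[0], rows[1:]
    snp_names = header[1:]

    # .map columns: chromosome, SNP name, genetic distance, bp position.
    # Chromosome and position are unknown here, so 0 is only a placeholder.
    map_lines = [f"0 {snp} 0 0" for snp in snp_names]

    ped_lines = []
    for sample, *genotypes in samples:
        alleles = []
        for genotype in genotypes:
            first, second = genotype.split("_")
            alleles += [ALLELE_CODES[first], ALLELE_CODES[second]]
        # Placeholders: family ID = sample ID, no parents (0 0),
        # unknown sex (0), missing phenotype (-9).
        ped_lines.append(" ".join([sample, sample, "0", "0", "0", "-9"] + alleles))
    return ped_lines, map_lines


example = """Sample ,SNP1, SNP2, SNP3, SNP4
H1 , A_B, ?_?, B_B, A_A
H2 , A_B, ?_?, A_B, A_A
H3 , A_B, ?_?, B_B, ?_?
"""
ped_lines, map_lines = convert(example)
print("\n".join(ped_lines))
print("\n".join(map_lines))
```

Writing `ped_lines` and `map_lines` to two files with the same base name (one ending in .ped, one in .map) gives a file pair in the layout described above, but the map file still needs real chromosome and position values before the data is usable for anything position-dependent.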