Dear All,
Does anyone know a tool or R function that converts PLINK files into MACH format?
Thanks a lot.
Hey,
In your shell, to go from PLINK's ped to MACH's impute ped you need to remove the phenotype column:
gawk '{ $6 = ""; gsub(FS "+", FS) }1' data.ped > data2.ped
Then create the dat file from PLINK's map file:
gawk '{print "M",$2}' data.map > data.dat
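To see what the two commands do, here is a small sketch on toy data. The file names (toy.ped, toy.map) and their contents are made up for illustration, and it uses plain awk instead of gawk on the assumption that gawk may not be installed everywhere.

```shell
# Toy PLINK files (hypothetical contents): 2 samples, 2 SNPs, space-delimited.
printf 'FAM1 ID1 0 0 1 2 A A A C\nFAM2 ID2 0 0 2 1 C C A A\n' > toy.ped
printf '1 rs1 0 1000\n1 rs2 0 2000\n' > toy.map

# Blank the phenotype (column 6), then collapse the doubled separator it leaves.
awk '{ $6 = ""; gsub(FS "+", FS) }1' toy.ped > toy2.ped

# One "M <marker name>" line per SNP, taken from column 2 of the map file.
awk '{ print "M", $2 }' toy.map > toy.dat

cat toy2.ped
cat toy.dat
```

Each line of toy2.ped should have one column fewer than toy.ped, and toy.dat should have one line per SNP in the map file.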
Hope that helps.
DAT file:
awk '{$1="M";print $1,$2}' file.map > file.dat
SNP file:
cut -f2 file.map > file.snp
MAP file:
cut -f 1,2,4 file.map > file.V2.map
PED file:
If you want your ped file in this format (FAM1001 ID1234 0 0 M A/A A/C C/C), first drop the phenotype column (column 6), then join each pair of alleles with a slash:
cut -d ' ' -f1-5,7- file.ped > temp
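awk 'BEGIN {
# End every print with a space so a whole sample stays on one line
ORS = " "
} {
# Start a new output line for each sample
print "\n"
print $1,$2,$3,$4,$5;
# Alleles start at column 6 of temp; join each pair as A/B
for(i=6; i<NF; i+=2)
{
print $i "/" $(i+1)
}
}' temp | cut -f2- -d ' ' | tail -n+2 > newfile.ped
EDIT
Sorry, there was an error: the for loop originally had a ? where the < belongs. It is corrected above and should now work.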
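If the temp file and the cut/tail clean-up are inconvenient, the same reshaping can probably be done in one awk pass. This is only a sketch under assumptions: the file names pairs.ped and pairs.new.ped and the one-line toy input are made up, the ped file is whitespace-delimited with the phenotype in column 6, and the sex column is passed through exactly as PLINK coded it.

```shell
# Toy PLINK ped (hypothetical contents): 1 sample, 3 SNPs.
printf 'FAM1001 ID1234 0 0 1 2 A A A C C C\n' > pairs.ped

awk '{
    # Keep the first five columns; column 6 (phenotype) is skipped.
    printf "%s %s %s %s %s", $1, $2, $3, $4, $5
    # Alleles start at column 7 of the original ped; join each pair as A/B.
    for (i = 7; i < NF; i += 2)
        printf " %s/%s", $i, $(i + 1)
    printf "\n"
}' pairs.ped > pairs.new.ped

cat pairs.new.ped
```

Using printf avoids relying on ORS and writes each sample as it is read, so no intermediate file is needed.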
I used your code and it worked fine until I tried to create the ped file. The error output was:
"awk: line 3: syntax error at or near print"
Any suggestions what the problem might be?
My ped file contains about 300000 SNPs and 2800 samples.
Sorry, there was an error.
Replace the ? by < in the for loop and this should work.
Thanks! Your solution works perfectly! I already had my .ped and .dat files, but MACH didn't recognise them correctly and reported that the markers were not bi-allelic.
Many thanks!!!