9.4 years ago
devenvyas
I downloaded a data set in Eigensoft's PACKEDANCESTRYMAP format (http://genetics.med.harvard.edu/reich/Reich_Lab/Datasets_files/EuropeFullyPublic.tar.gz). I used convertf to convert the files to map/ped files.
The files used multiple spaces instead of tabs, so I did some find/replaces in shell to get the files into proper format.
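For anyone who would rather script that cleanup than do find/replaces by hand, here is a minimal Python sketch. The function names and file paths are hypothetical, and it assumes every field is free of internal whitespace, which holds for the map and ped lines shown in this post.

```python
def spaces_to_tabs(line: str) -> str:
    """Collapse every run of whitespace into one tab and trim the ends."""
    return "\t".join(line.split())


def convert_file(src_path: str, dst_path: str) -> None:
    """Rewrite a whitespace-delimited file as a tab-delimited one.

    src_path and dst_path are placeholder names, not files from the post.
    """
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            if line.strip():  # skip blank lines
                dst.write(spaces_to_tabs(line) + "\n")
```

Calling `convert_file("in.map", "out.map")` (placeholder names) would write the cleaned copy to a new file and leave the original untouched.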
Each line of the map file looks like this:
1 Affx-13943225 0.020130 752566 G A
1 Affx-14804912 0.022518 842013 T G
1 Affx-15453076 0.024116 891021 G A
1 Affx-15485777 0.024457 903426 C T
1 Affx-15871758 0.025727 949654 A G
1 Affx-3942202 0.026288 1018704 G A
1 Affx-50004516 0.026665 1045331 G A
1 Affx-3979904 0.026674 1048955 A G
1 Affx-3995844 0.026711 1061166 C T
1 Affx-4055225 0.028311 1108637 G A
The last two columns are the two alleles at each site.
1 SA062 0 0 1 1 1 1 4 4 1 1 2 4 1 3 3 3 3 3 1 1 4 4 3 3 3 3 2 2 4 4 3 3 2 2 2 2 1 1 2 4 2 2 3 3 3 3 4 4 4 4 3 3 1 3 1 1 4 4 2 2 3 3 3 3 2 2 3 3 2 2 3 3 1 1 3 3 2 2 3 3 3 3 1 1 4 4 2 4 2 2 3 3 1 1 3 3 3 3 2 2 2 2 1 1 3 3 3
That is a bit of the first line of the ped file.
I wish to convert the data into a standard ACGT-coded Plink dataset. I was wondering if you know how this can be implemented.
--make-bed --alleleACGT
http://pngu.mgh.harvard.edu/~purcell/plink/dataman.shtml
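To illustrate what the recoding would do to a ped line, here is a rough Python sketch. It assumes the numeric codes map as 1=A, 2=C, 3=G, 4=T, which is my understanding of Plink's 1234/ACGT convention and is consistent with the sample above (the fourth marker lists C and T, and the fourth genotype is 2 4). It also assumes the first six columns are the usual family ID, individual ID, parents, sex and phenotype. The function name is hypothetical; this is not Plink's own code.

```python
# Assumed mapping between numeric allele codes and bases.
CODE_TO_BASE = {"1": "A", "2": "C", "3": "G", "4": "T"}


def recode_ped_line(line: str) -> str:
    """Recode the genotype columns of one ped line from 1234 to ACGT."""
    fields = line.split()
    # The first six columns are sample metadata and are left untouched.
    meta, genotypes = fields[:6], fields[6:]
    # Any code not in the mapping (e.g. 0 for missing) is passed through as-is.
    recoded = [CODE_TO_BASE.get(code, code) for code in genotypes]
    return " ".join(meta + recoded)
```

Applied to the start of the ped line above, `1 SA062 0 0 1 1 1 1 4 4 1 1 2 4` becomes `1 SA062 0 0 1 1 A A T T A A C T`.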
Getting the files into Plink is the first problem, though.
When I tried
The result was
Any idea how to get Plink to accept the map?
A map file does not have allelic information.
You could:
1) Remove the alleles from the map file
2) Rewrite the file in binary format: --make-bed
3) Change the alleles in the bim file or use the --update-alleles command
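Step 1 could be scripted roughly as follows. A standard map file has four columns (chromosome, marker ID, genetic distance, base-pair position), so this sketch just keeps the first four fields of each line and drops the two allele columns. The function names and paths are hypothetical.

```python
def strip_alleles(map_line: str) -> str:
    """Keep only the four standard map columns, dropping trailing alleles."""
    fields = map_line.split()
    return "\t".join(fields[:4])


def trim_map_file(src_path: str, dst_path: str) -> None:
    """Write a four-column copy of a six-column map file.

    src_path and dst_path are placeholder names.
    """
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            if line.strip():  # skip blank lines
                dst.write(strip_alleles(line) + "\n")
```

If you want to use --update-alleles later, keep the original six-column file around, since it is the only place the allele pairs are recorded.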