Question

PED&MAP in Plink

1

9.8 years ago

shangzhener ▴ 10

Hi, thanks for taking the time. I am learning PLINK these days and have a question. I downloaded the example dataset (called hapmap) from the website and found that the number of genotypes in the PED file differs from the number of SNPs in the MAP file. Does anyone know the reason? Thanks a lot.

SNP software error • 4.1k views

ADD COMMENT • link updated 2.7 years ago by Ram 45k • written 9.8 years ago by shangzhener ▴ 10

Ram · Accepted Answer · 2015-09-01

2

9.8 years ago

Philipp Bayer 8.8k

How did you count the genotype data in the PED?

The PED file has one row per individual and two columns per SNP (one per allele, i.e., A A instead of AA), while the MAP file has one row per SNP. Furthermore, the first six columns in a PED file are not genotypes. They are:

Family ID
Individual ID
Paternal ID
Maternal ID
Sex (1=male; 2=female; other=unknown)
Phenotype

Source

Therefore,

wc -l your_map_file.map

should be identical to

awk '{print (NF - 6)/2}' your_ped_file.ped | head -1

The wc command prints the number of lines in the MAP file (= the number of SNPs in the MAP). The awk command prints the number of columns (NF) in the first PED row minus the 6 leading non-genotype columns, divided by 2 since each SNP always takes two columns.

If your numbers are actually different, try re-downloading the files.
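The two commands above only look at the first PED row. As a rough sketch, the same comparison can be wrapped in a small shell function that checks every row. The function name check_ped_map is made up for illustration and is not part of PLINK, and the sketch assumes a standard whitespace-delimited PED file with all six leading columns present:

```shell
#!/bin/sh
# Sketch: compare the SNP count implied by every PED row with the MAP file.
# check_ped_map is a hypothetical helper name, not a PLINK command.
# Assumes a whitespace-delimited PED file with the 6 standard leading columns.
check_ped_map() {
    ped_file=$1
    map_file=$2

    # Count non-empty MAP lines. Unlike `wc -l`, this ignores blank lines
    # and still counts a last line that lacks a trailing newline.
    n_map=$(awk 'NF > 0 { n++ } END { print n + 0 }' "$map_file")

    # Count non-empty PED rows where (NF - 6) / 2 differs from n_map.
    n_bad=$(awk -v n="$n_map" '
        NF > 0 && (NF - 6) / 2 != n { bad++ }
        END { print bad + 0 }
    ' "$ped_file")

    if [ "$n_bad" -eq 0 ]; then
        echo "OK: $n_map SNPs in MAP, all PED rows match"
    else
        echo "MISMATCH: $n_bad PED row(s) do not have $n_map SNPs"
        return 1
    fi
}
```

After defining the function in your shell, it could be called as check_ped_map your_ped_file.ped your_map_file.map. Counting fields with awk rather than opening the file in a spreadsheet avoids miscounting the columns of very wide PED rows.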

ADD COMMENT • link updated 2.7 years ago by Ram 45k • written 9.8 years ago by Philipp Bayer 8.8k

0

Thank you so much for your answer. I read it and found that my earlier understanding was not wrong. However, when I checked again I found that I had made a stupid mistake: I overlooked how many columns there were in Excel and only read the first few. Thanks again.

I also wonder if you can help me with another question. I am using another, uncommon software package called AML. Its input file needs SNP information, and there are two encodings:

0=missing, 1 = common homozygote, 2 = heterozygote, 3 = rare homozygote.
ACGT format (N = missing).

I only know the first one, which looks like this:

patientid,status,snp1,snp2,snp3
p1,0,2,2,2
p2,0,2,2,3
p3,0,2,2,1

Can you give me some suggestions for the second format?

ADD REPLY • link updated 5.6 years ago by Ram 45k • written 9.8 years ago by shangzhener ▴ 10

0

Sorry, I have never heard of that software, and googling for it doesn't give me any results. Try opening another question; maybe someone else knows?

ADD REPLY • link 9.8 years ago by Philipp Bayer 8.8k

0

OK, thanks.

ADD REPLY • link updated 2.7 years ago by Ram 45k • written 9.8 years ago by shangzhener ▴ 10