Genotype frequency per sample
6.4 years ago
HG ▴ 30

Hi everyone, I have a PLINK file and I want to get the distribution of genotype calls (AA/Aa/aa) for each sample. How can I do this? Any help would be appreciated.


Hello,

could you please post an example of your file?

fin swimmer


Hi, my input files are PLINK .map and .ped files.

.ped file:

A text file with one line per sample and the following fields:

  1. Family ID
  2. Individual ID
  3. calls for each SNP

.map file:

A text file with no header line, and one line per variant with the following 3-4 fields:

  1. Chromosome code.
  2. Variant identifier
  3. Position in morgans or centimorgans (optional; also safe to use dummy value of '0')
  4. Base-pair coordinate
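
Given the file layout described above, the per-sample counts can also be tallied directly from the .ped file. The following is only a minimal sketch under stated assumptions: the function name `count_genotypes` and the `a_alleles` list (which allele counts as "a" for each SNP, in .map order) are hypothetical; alleles are assumed to be stored as two whitespace-separated columns per SNP with "0" meaning missing; and `n_leading_cols` defaults to 6 because a standard .ped has six columns before the calls. Adjust it if your file really has only the family and individual IDs in front.

```python
def count_genotypes(ped_lines, a_alleles, n_leading_cols=6, missing="0"):
    """Tally AA/Aa/aa/missing per sample from .ped lines.

    a_alleles: hypothetical input, the allele treated as "a" for each SNP,
               listed in the same order as the .map file.
    n_leading_cols: columns before the genotype calls (assumed 6).
    """
    result = {}
    for line in ped_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        calls = fields[n_leading_cols:]
        counts = {"AA": 0, "Aa": 0, "aa": 0, "missing": 0}
        # Each SNP occupies two consecutive allele columns.
        for i, a_allele in enumerate(a_alleles):
            first, second = calls[2 * i], calls[2 * i + 1]
            if missing in (first, second):
                counts["missing"] += 1
            elif first != second:
                counts["Aa"] += 1
            elif first == a_allele:
                counts["aa"] += 1
            else:
                counts["AA"] += 1
        # Key each sample by (family ID, individual ID).
        result[(fields[0], fields[1])] = counts
    return result


# Made-up two-sample, four-SNP example for illustration only.
example_ped = [
    "F1 S1 0 0 1 -9 A A A G G G 0 0",
    "F2 S2 0 0 2 -9 A G A A A A C T",
]
example_a_alleles = ["G", "G", "G", "T"]
for sample, counts in count_genotypes(example_ped, example_a_alleles).items():
    print(sample, counts)
```

For a real file, pass an open file handle as `ped_lines`. The heterozygote count does not depend on `a_alleles`; only the split of homozygotes into AA and aa does.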
6.4 years ago

"plink --file ... --het" will give you the number of Aa calls for each sample.

If you also need AA/aa, you'd need to define which alleles are A and which ones are a. Once you have, you can create a synthetic sample with all aa calls, use --merge to combine it with your real dataset, and then run "--genome full". Then look at the lines of the .genome file which include your artificial all-aa sample; the IBS0 and IBS2 columns of those lines will give you the additional counts you need (IBS0 = number of AA calls, IBS2 = number of aa calls for the real sample on that line).
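
As a rough sketch of the last step, the rows that pair each real sample with the synthetic one could be pulled out of the .genome file like this. The function name `counts_from_genome` and the sample ID `ALL_aa` are hypothetical, and the column names (FID1, IID1, FID2, IID2, IBS0, IBS1, IBS2) are assumed from the "--genome full" header, so check them against your own file. Reading IBS1 as the Aa count is an inference that follows from comparing against an all-aa sample; the header and rows in the example are made up.

```python
def counts_from_genome(genome_lines, synthetic_iid):
    """Map each real sample to AA/Aa/aa counts using its pair with the all-aa sample.

    Against an all-aa sample: IBS0 -> AA, IBS1 -> Aa, IBS2 -> aa.
    Column names are looked up in the header line (assumed names).
    """
    rows = iter(genome_lines)
    header = next(rows).split()
    col = {name: i for i, name in enumerate(header)}
    counts = {}
    for line in rows:
        f = line.split()
        if not f:
            continue  # skip blank lines
        iid1, iid2 = f[col["IID1"]], f[col["IID2"]]
        # The synthetic sample can appear on either side of the pair.
        if iid1 == synthetic_iid:
            real = (f[col["FID2"]], iid2)
        elif iid2 == synthetic_iid:
            real = (f[col["FID1"]], iid1)
        else:
            continue  # pair of two real samples, not needed here
        counts[real] = {
            "AA": int(f[col["IBS0"]]),
            "Aa": int(f[col["IBS1"]]),
            "aa": int(f[col["IBS2"]]),
        }
    return counts


# Made-up rows for illustration only (reduced to the columns used here).
example_genome = [
    "FID1 IID1 FID2 IID2 IBS0 IBS1 IBS2",
    "F1 S1 F2 S2 10 20 70",
    "F1 S1 SYN ALL_aa 55 30 15",
    "F2 S2 SYN ALL_aa 60 25 15",
]
for sample, c in counts_from_genome(example_genome, "ALL_aa").items():
    print(sample, c)
```

Looking columns up by header name rather than by position keeps the sketch usable whether or not extra columns are present.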


Thanks for your comments. Yes, I need AA/Aa/aa. --het returns a text file with the following columns: 1. Family ID, 2. Within-family ID, 3. Observed number of homozygotes, 4. Expected number of homozygotes, 5. Number of non-missing autosomal genotypes, 6. Method-of-moments F coefficient estimate. There is no column for heterozygotes (Aa)! Also, how can I perform what you explained in the second part of your comment? Are there any scripts in R? Thanks
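
Although the .het output has no explicit heterozygote column, the Aa count follows from two of the columns listed above: non-missing autosomal genotypes minus observed homozygotes. A minimal sketch in Python (rather than R), assuming the header names FID, IID, O(HOM) and N(NM) and a whitespace-delimited file; the function name `het_counts` is hypothetical and the example rows are made up:

```python
def het_counts(het_lines):
    """Derive heterozygote counts per sample from .het lines.

    Aa count = N(NM) - O(HOM), i.e. non-missing genotypes minus homozygotes.
    Column names are looked up in the header line (assumed names).
    """
    rows = iter(het_lines)
    header = next(rows).split()
    col = {name: i for i, name in enumerate(header)}
    out = {}
    for line in rows:
        f = line.split()
        if not f:
            continue  # skip blank lines
        hom = int(f[col["O(HOM)"]])
        non_missing = int(f[col["N(NM)"]])
        out[(f[col["FID"]], f[col["IID"]])] = {
            "hom": hom,
            "het": non_missing - hom,
        }
    return out


# Made-up rows for illustration only.
example_het = [
    "FID IID O(HOM) E(HOM) N(NM) F",
    "F1 S1 700 6.9e+02 1000 0.03226",
    "F2 S2 650 6.9e+02 1000 -0.129",
]
for sample, c in het_counts(example_het).items():
    print(sample, c)
```

This gives only the combined homozygote count; splitting it into AA and aa still requires deciding which allele is "a" at each SNP.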
