GATK: .ped vs .fam and missing values
6.7 years ago
firestar ★ 1.6k

In the GATK pipeline, it seems I need a .ped file for CalculateGenotypePosteriors and a .fam file for VariantsToBinaryPed. What is the difference between a .ped file and a .fam file? And how do I specify missing parents in each of them? Some options I've seen are zero (0), NO_PARENTS, -9, etc. I want to be sure about this because I don't want the tool to interpret 0, NO_PARENTS, etc. as an actual parent ID.

My file looks like this now:

#family_id      individual_id   paternal_id     maternal_id     sex     phenotype 
20  20-01  m20  f20  1  1
20  20-02  m20  f20  1  1
20  20-03  m20  f20  1  1
21  21-01  m21  f21  1  1
21  21-02  m21  f21  1  1
21  21-03  m21  f21  1  1
20  m20              1  0
20  f20              2  0
21  m21              1  0
21  f21              2  0
gatk SNP variant-calling • 2.1k views

You can use 0 for missing parents.
See http://www.gwaspi.org/?page_id=145 for more information.
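
As a rough illustration of what that means for the file in the question, here is a minimal Python sketch that fills the blank parent columns with 0. It assumes that a whitespace-delimited row with only four fields (family, individual, sex, phenotype) is a founder whose two parent columns were left empty. The function name `fill_missing_parents` is hypothetical, and dropping the `#` header line is a choice made here, not something either tool is known to require.

```python
def fill_missing_parents(lines, missing="0"):
    """Return pedigree rows with blank parent IDs replaced by `missing`."""
    fixed = []
    for line in lines:
        # Assumption: blank lines and the '#' header comment are not wanted in the output.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) == 4:
            # Assumption: four fields means both parent columns were left blank.
            family, individual, sex, phenotype = fields
            fields = [family, individual, missing, missing, sex, phenotype]
        elif len(fields) != 6:
            raise ValueError(f"expected 4 or 6 columns, got {len(fields)}: {line!r}")
        fixed.append("\t".join(fields))
    return fixed


rows = [
    "#family_id individual_id paternal_id maternal_id sex phenotype",
    "20  20-01  m20  f20  1  1",
    "20  m20              1  0",
    "20  f20              2  0",
]
fixed_rows = fill_missing_parents(rows)
print("\n".join(fixed_rows))
```

Rows that already have six fields pass through unchanged, so the children keep `m20` and `f20` as parent IDs while the founders get `0 0`.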

6.7 years ago

The PED and FAM file formats come from the eminent program PLINK.

For a description of PED files, including information on how to encode missing values, please go here: PED files (note that the binary version of a PED file is called BED)

For a description of FAM files, see here: .fam (PLINK sample information file)
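
In PLINK's formats, a .fam file holds six sample-information columns (family ID, individual ID, paternal ID, maternal ID, sex, phenotype), and a text .ped file starts each row with the same six columns followed by genotype columns. A minimal Python sketch of that relationship, using a made-up example row and a hypothetical helper name `ped_line_to_fam`:

```python
def ped_line_to_fam(line):
    """Keep only the six leading sample-information columns of a PLINK-style .ped row."""
    fields = line.split()
    if len(fields) < 6:
        raise ValueError(f"expected at least 6 columns, got {len(fields)}: {line!r}")
    return " ".join(fields[:6])


# Made-up row: six sample columns followed by two genotypes (four alleles).
ped_row = "20 20-01 m20 f20 1 1 A A G T"
fam_row = ped_line_to_fam(ped_row)
print(fam_row)
```

A pedigree file with no genotype columns, like the one in the question, already has the six-column layout, so in this sketch it would come out unchanged.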

Kevin
