In the GATK pipeline, it seems like I need to use a .ped file for CalculateGenotypePosteriors and a .fam file for VariantsToBinaryPed. What is the difference between a .ped and a .fam file? And how do I specify missing parents in both of them? Some options I've seen are zero (0), NO_PARENTS, -9, etc. I want to be sure about this because I don't want the tool to think that 0, NO_PARENTS, etc. is the actual ID of the parent.
My file looks like this now:
#family_id individual_id paternal_id maternal_id sex phenotype
20 20-01 m20 f20 1 1
20 20-02 m20 f20 1 1
20 20-03 m20 f20 1 1
21 21-01 m21 f21 1 1
21 21-02 m21 f21 1 1
21 21-03 m21 f21 1 1
20 m20 1 0
20 f20 2 0
21 m21 1 0
21 f21 2 0
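You can use 0 for missing parents: put 0 in both the paternal_id and maternal_id columns of each founder row, so that every row has all six columns.
See http://www.gwaspi.org/?page_id=145 for more information.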
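
A minimal sketch of that fix in Python, assuming the four-column rows in the file above are family_id, individual_id, sex and phenotype with the two parent columns left out. The function name fill_missing_parents is made up for illustration and is not part of GATK or PLINK.

```python
def fill_missing_parents(lines, missing="0"):
    """Pad four-column founder rows to six columns, using `missing` for both parents.

    Assumption: a four-column row is family_id, individual_id, sex, phenotype.
    """
    fixed = []
    for line in lines:
        # Keep blank lines and the '#' header line unchanged.
        if not line.strip() or line.startswith("#"):
            fixed.append(line.rstrip("\n"))
            continue
        fields = line.split()
        if len(fields) == 4:
            family_id, individual_id, sex, phenotype = fields
            fields = [family_id, individual_id, missing, missing, sex, phenotype]
        elif len(fields) != 6:
            raise ValueError(f"unexpected number of columns: {line!r}")
        fixed.append(" ".join(fields))
    return fixed


rows = [
    "20 20-01 m20 f20 1 1",
    "20 m20 1 0",
    "20 f20 2 0",
]
for row in fill_missing_parents(rows):
    print(row)
```

With these three rows, the child row is printed unchanged and the two parent rows become `20 m20 0 0 1 0` and `20 f20 0 0 2 0`.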