I'm creating a .ped file for Merlin to run a linkage analysis on SNP array data.
I have a family of 12 members, but for some siblings only one parent is available, so in the .ped file the father ID might be '8' while the mother ID is '0' for unknown.
When I do this, pedstats gives me an error: Parent named 0 for Person 5 in Family 1 is missing
You have to include an entry for each parent in the ped file, even if they are missing. Also, if only one parent is missing, the missing parent still needs a non-zero ID. So if the father is missing in family 1, the line currently looks like this:
Family1 Sib1 Father0 Mother8 1
And you need to change it such that the missing father for the group still gets a unique ID:
Family1 Sib1 Father7 Mother8 1
In addition, you need to add the father to your pedigree file:
Family1 Father7 0 0 1
Notice that both parents are missing for Father7, so you can set both to 0; in that case there is no expectation that they will be listed somewhere in the ped file.
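If there are many such siblings, the fix described above could be scripted. Below is a minimal Python sketch, not anything Merlin provides: the function name add_missing_parents and the placeholder IDs P1, P2, ... are made up for illustration, and it assumes 5-column rows (family, person, father, mother, sex) with sex coded 1 = male, 2 = female. It also assumes that children sharing the known parent are full siblings, so they receive the same placeholder for the missing parent; change that if you have half-siblings.

```python
def add_missing_parents(rows):
    """Give half-missing parents a placeholder ID and add founder rows for them.

    rows: list of [family, person, father, mother, sex] lists of strings.
    Assumption: children who share the known parent are full siblings,
    so they get the same placeholder for the missing parent.
    """
    known = {(r[0], r[1]) for r in rows}   # IDs already present per family
    placeholder = {}                       # (family, known parent, sex) -> new ID
    out = []

    def new_id(fam, partner, sex):
        key = (fam, partner, sex)
        if key not in placeholder:
            n = 1
            while (fam, f"P{n}") in known:  # hypothetical ID scheme: P1, P2, ...
                n += 1
            placeholder[key] = f"P{n}"
            known.add((fam, f"P{n}"))
            # Founder row: both parents of the placeholder are unknown (0 0).
            out.append([fam, f"P{n}", "0", "0", sex])
        return placeholder[key]

    for fam, person, father, mother, sex in rows:
        if father == "0" and mother != "0":
            father = new_id(fam, mother, "1")
        elif mother == "0" and father != "0":
            mother = new_id(fam, father, "2")
        out.append([fam, person, father, mother, sex])
    return out
```

Rows where both parents are 0 (founders) are left untouched, which matches the Father7 case above.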
But my .ped file, used as input for Merlin, also contains the marker genotypes (from the SNP array), and I have no data for Father7. Can I just add the first 5 columns for him and leave the rest empty?
You can provide a "linking file" that lists the full set of subjects like mentioned above with the Ped file that has your genotypes. In merlin you would do this by separating them with a comma:
updated 4.9 years ago by Ram • written 8.9 years ago by dhibar
I'm sorry, but that's not completely clear to me.
So the pheno.ped should be my original file with the individuals for whom I have marker data?
And then the familystructure.ped should be a .ped file with only 5 columns in which there are lines for all individuals even those without marker data?
pheno.dat is then my original .dat file. But what needs to be inside familystructure.dat?
Problem solved:
pheno.ped = all genotyped individuals (6 columns + marker data)
familystructure.ped = all individuals (6 columns)
familystructure.dat = only one line, declaring the disease (affection) status column: A condition
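For anyone who wants to generate the two structure files from the genotype file, here is a rough Python sketch of the layout above. The function name structure_files and its arguments are made up for illustration. It assumes whitespace-separated ped lines whose first 6 columns are family, person, father, mother, sex and affection status, and that ungenotyped relatives are supplied by hand as 6-column rows (with 0 for unknown affection status).

```python
def structure_files(pheno_lines, extra_rows, trait="condition"):
    """Build the text of familystructure.ped and familystructure.dat.

    pheno_lines: lines of pheno.ped (6 columns + marker data).
    extra_rows:  6-column rows for individuals without marker data.
    trait:       name used on the single 'A' line of the .dat file.
    """
    # Keep only the first 6 columns of every genotyped individual.
    ped = [" ".join(line.split()[:6]) for line in pheno_lines if line.strip()]
    # Append the individuals that have no genotypes at all.
    ped += [" ".join(row) for row in extra_rows]
    dat = f"A {trait}\n"
    return "\n".join(ped) + "\n", dat


ped_text, dat_text = structure_files(
    ["1 5 7 8 1 2 A C G G", "1 8 0 0 2 1 A A G T"],   # made-up genotype lines
    [["1", "7", "0", "0", "1", "0"]],                 # ungenotyped father
)
```

The two returned strings can then be written to familystructure.ped and familystructure.dat.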