Hi, I have some monozygotic twin samples (and a technical replicate: the same sample sequenced twice), and I am running GATK GenotypeGVCFs on their GATK HaplotypeCaller-called variants (single-sample HaplotypeCaller --> GenomicsDBImport data aggregation --> GenotypeGVCFs, as per the GATK joint calling guidelines). This last step adds an ExcessHet annotation that is later used in VQSR filtering, but the documentation specifies that:
If samples are known to be related, a pedigree file can be provided so that the calculation is only performed on founders and offspring are excluded.
I have no other relatives in the dataset, only pairs of monozygotic twins (all offspring). How do I specify that in a .ped file? (I cannot declare them as the same sample, because they have different names in GenomicsDBImport, and if I just give them the same family ID, the result is exactly the same as when I do not provide a .ped file; I tried.)
Does anybody have a clue how to work around this?
Thank you very much in advance for any help!
Monozygotic twins would have the same FamilyID, FatherID and MotherID. However, given that you don't have the father or the mother in the dataset, I don't see the benefit of using pedigree-based genotyping. You could pass unknown values for the father and mother IDs (PID and MID), use a shared family ID (FID) for each twin pair, and see if that makes a difference.
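As a rough illustration of that suggestion, here is a minimal Python sketch that builds such pedigree rows. It assumes the usual six-column, tab-separated PED layout (family ID, individual ID, father ID, mother ID, sex, phenotype), with "0" for an unknown parent, "0" for unknown sex and "-9" for a missing phenotype. The family and sample names are hypothetical placeholders and would need to match the sample names used in GenomicsDBImport.

```python
# Hypothetical family and sample names, for illustration only.
twin_pairs = {
    "FAM1": ["twinA_1", "twinA_2"],
    "FAM2": ["twinB_1", "twinB_2"],
}


def ped_lines(pairs, sex="0", phenotype="-9"):
    """Build tab-separated PED rows for twin pairs with unknown parents.

    Assumed columns: family ID, individual ID, father ID, mother ID,
    sex, phenotype.
    """
    rows = []
    for family_id, samples in pairs.items():
        for sample in samples:
            # "0" in the father/mother columns marks a parent that is
            # unknown or absent from the dataset.
            rows.append("\t".join([family_id, sample, "0", "0", sex, phenotype]))
    return rows


lines = ped_lines(twin_pairs)
print("\n".join(lines))
```

The printed rows can be saved to a file (for example `twins.ped`, a name chosen here only for illustration) and passed to the joint-genotyping step.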
I tried, but it makes absolutely no difference compared with the version without the .ped file.
Then there's nothing you can gain by adding pedigree information. Plus, I read your post again, and the exception seems to be for non-founders, which I'm guessing is not applicable to your dataset.
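To make the founder point concrete, here is a small Python sketch that lists which samples in a pedigree would count as founders. It assumes a founder is any individual whose father and mother IDs are both "0" (unknown); this is a simplification for illustration and may not match GATK's internal definition exactly. The pedigree text is a hypothetical example.

```python
# Hypothetical pedigree: two twin pairs, no parents in the dataset.
example_ped = (
    "FAM1\ttwinA_1\t0\t0\t0\t-9\n"
    "FAM1\ttwinA_2\t0\t0\t0\t-9\n"
    "FAM2\ttwinB_1\t0\t0\t0\t-9\n"
    "FAM2\ttwinB_2\t0\t0\t0\t-9\n"
)


def founders(ped_text):
    """Return sample IDs whose father and mother IDs are both "0".

    Assumption: "no listed parents" is what makes a sample a founder.
    """
    result = []
    for line in ped_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _family_id, sample, father, mother = fields[:4]
        if father == "0" and mother == "0":
            result.append(sample)
    return result


founder_ids = founders(example_ped)
print(founder_ids)
```

Under that assumption every twin is a founder, so no sample would be excluded from the calculation, which would be consistent with the output being identical with and without the .ped file.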