I have 8 supposedly unrelated individuals with a shared disease (mode of inheritance unknown). I am concerned that I am getting false positives while gene hunting because of ethnic stratification. The individuals' ethnicities are self-reported. I have exome data and understand that this isn't ideal for analyzing Shared Genomic Segments (SGS). Can anyone suggest some analyses I could do while we wait for chip genotyping to come back?
@Zev, I'm assuming you have found some nonsynonymous variants, but suspect they may really be undocumented benign polymorphisms because your subject(s) may be from ethnic groups not well represented in the publicly available control exome data?
I think you are asking if there is a technique to determine whether these variants are within blocks that are identical by descent? If they are, then they are likely to be benign polymorphisms common to that ethnic population, and not likely to be disease-causing variants.
I've had to deal with this situation in a pedigree, and found this paper helpful:
Browning & Browning, 2011
However, this technique was designed for SNP array data, and I'm not sure whether one could feed it a modified VCF file. Their software package is posted here. If you contact the authors, they may have experience doing this analysis with exome sequence data.
Do you have parental exome data as well? You say the mode of inheritance is unknown, but you may be able to make a reasonable guess. Are there similar diseases where the mode of inheritance is known, and to which your phenotype is suspected of being related in some way? Similar genetic diseases are often caused by mutations with similar inheritance patterns, so you can form a hypothesis by analogy. Has there EVER been a reported recurrence within a family? If not, it may well be new dominant (sporadic) inheritance, and you will need parental exome data to filter the variants. This may end up being the critical step in eliminating false positives (whether you solve the shared genomic segments issue or not): in a sporadic pedigree, variants shared by the proband and a parent are likely false positives.
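To make the parental filtering step concrete, here is a minimal sketch in Python. It assumes you have already pulled genotypes out of your VCFs into plain dictionaries keyed by (chrom, pos, ref, alt); the function names and that data layout are hypothetical, not part of any published package. It keeps only proband variants for which both parents have an explicit homozygous-reference call, and drops anything shared with a parent or ambiguous because a parental call is missing.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical layout: variant key -> genotype string such as "0/0", "0/1", "1/1", "./."
Variant = Tuple[str, int, str, str]  # (chrom, pos, ref, alt)
Genotypes = Dict[Variant, str]


def alleles(gt: str) -> List[str]:
    """Split a genotype string into alleles, treating phased and unphased alike."""
    return gt.replace("|", "/").split("/")


def carries_alt(gt: str) -> bool:
    """True if the genotype contains at least one called non-reference allele."""
    return any(a not in ("0", ".") for a in alleles(gt))


def is_hom_ref(gt: Optional[str]) -> bool:
    """True only for a fully called homozygous-reference genotype."""
    return gt is not None and all(a == "0" for a in alleles(gt))


def sporadic_candidates(proband: Genotypes,
                        mother: Genotypes,
                        father: Genotypes) -> List[Variant]:
    """Proband variants absent from both parents (assumed sporadic pedigree)."""
    kept = []
    for variant, gt in proband.items():
        if not carries_alt(gt):
            continue
        # Assumption: a missing or no-call parental genotype is ambiguous,
        # so the variant is dropped rather than treated as absent in the parent.
        if is_hom_ref(mother.get(variant)) and is_hom_ref(father.get(variant)):
            kept.append(variant)
    return sorted(kept)


if __name__ == "__main__":
    # Toy genotypes, made up purely to exercise the filter.
    proband = {("chr1", 100, "A", "G"): "0/1", ("chr2", 200, "C", "T"): "0/1"}
    mother = {("chr1", 100, "A", "G"): "0/0", ("chr2", 200, "C", "T"): "0/1"}
    father = {("chr1", 100, "A", "G"): "0/0", ("chr2", 200, "C", "T"): "0/0"}
    print(sporadic_candidates(proband, mother, father))
```

In practice you would also want to check read depth at the parental sites, since a low-coverage "0/0" call in a parent can make an inherited variant look new.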
Great Q. I am very much interested in reading the answers...