IBD calculation indicates cryptic relatedness of 1 Individual with all others within a sequencing cohort
6.6 years ago
maegsul ▴ 170

I know it is not ideal, but I am trying to detect any cryptic relatedness in my WES data (n ≈ 350 individuals). I first QC the data in PLINK with MAF > 0.01 and call rate > 98% filters, which leaves ~200K high-quality variants for analysis (again, not ideal for assessing relatedness, but I think it should still give a rough idea). I then calculate IBD estimates in PLINK with the --genome option.

We already identified a first-degree relative pair with this approach (PI-HAT = 0.58, confirmed by fingerprinting [i.e. high similarity between the lengths of STR markers across the genome, done using stock sample tubes] and clinical files), and new sequencing results indicate another pair (PI-HAT = 0.5). The interesting thing, however, is that I have two samples that appear to be related to every other individual in the cohort. I should point out that the cohort comes from multiple centers all over Europe and is presumably unrelated, so this observation must be a false positive.

Sample 1 -> Related to 346 others with 0.25 < PI-HAT < 0.38 scores.
Sample 2 -> Related to 346 others with 0.12 < PI-HAT < 0.22 scores.

All other pairs have PI-HAT < 0.1.
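
One way to make this pattern explicit is to count, for each sample, how many partners exceed a PI-HAT cutoff; a sample that exceeds it with nearly everyone stands out immediately. The snippet below is only a sketch: the function name, the 0.1 cutoff (taken from the observation above), and the toy pairs are illustrative assumptions, not actual cohort data. In practice the triples would presumably come from the IID1, IID2, and PI_HAT columns of the .genome file written by --genome.

```python
from collections import Counter


def count_related_partners(pairs, pi_hat_cutoff=0.1):
    """Count, per sample, how many other samples exceed the PI_HAT cutoff.

    `pairs` is an iterable of (iid1, iid2, pi_hat) triples.
    """
    counts = Counter()
    for iid1, iid2, pi_hat in pairs:
        if pi_hat > pi_hat_cutoff:
            # A pair above the cutoff counts once for each member.
            counts[iid1] += 1
            counts[iid2] += 1
    return counts


# Hypothetical toy pairs (IID1, IID2, PI_HAT); NOT real cohort data.
toy_pairs = [
    ("S1", "A", 0.30), ("S1", "B", 0.28), ("S1", "C", 0.33),
    ("A", "B", 0.02), ("A", "C", 0.58), ("B", "C", 0.01),
]

counts = count_related_partners(toy_pairs)
# Samples sorted by number of "related" partners, most suspicious first.
for sample, n_partners in counts.most_common():
    print(sample, n_partners)
```

A genuine relative pair (A and C in the toy data) adds one partner to each member, whereas a problematic sample (S1) accumulates partners across the whole cohort.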

What could be the reason for such an observation? The first thing that comes to mind is of course possible contamination, but I wouldn't expect that, since these samples were run in batches of 12 and their libraries were prepared independently as well.

TLDR: In my whole-exome sequencing data, I find one individual with cryptic relatedness to all 346 other participants; what could be the reason?

WES IBD Plink Relatedness • 2.5k views
6.6 years ago

In my experience, this is usually contamination or a similar sequencing problem. I'd recheck the relatedness estimates with plink 2.0 --make-king-table, and then throw out the samples (or resequence them if possible) if the pattern remains.


Thank you for your reply! Actually, I already checked with plink2 --make-king-table too, and I got these:

Kinship score between siblings (confirmed) : 0.29

Kinship score between possible new pair: 0.19

Kinship score between Sample1 (with respect to my original post) and all others: varies between ~0.04 and ~0.08

Kinship score between Sample2 (with respect to my original post) and all others: <~0.04

Kinship score between Sample 1 and Sample 2: 0.1682! (I hadn't mentioned it, but this pair had PI-HAT 0.38 before.)

I am not familiar with this KING format; how would you interpret the table? Does it tell us anything new? The pattern seems to remain.
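
For reference when reading such a table: the KING documentation gives kinship cutoffs of > 0.354 for duplicates/MZ twins, 0.177 to 0.354 for first-degree, 0.0884 to 0.177 for second-degree, and 0.0442 to 0.0884 for third-degree relatives, and the kinship coefficient is roughly half of PI-HAT. A small sketch that applies those cutoffs to the values listed above (the function name and label strings are hypothetical, and values near a boundary should be treated with caution):

```python
def king_degree(kinship):
    """Map a KING kinship coefficient to an inferred relationship class.

    Cutoffs follow the ranges quoted from the KING documentation.
    """
    if kinship > 0.354:
        return "duplicate/MZ twin"
    if kinship > 0.177:
        return "1st-degree"
    if kinship > 0.0884:
        return "2nd-degree"
    if kinship > 0.0442:
        return "3rd-degree"
    return "unrelated or more distant"


# Kinship values reported in this thread.
reported = {
    "confirmed siblings": 0.29,
    "possible new pair": 0.19,
    "Sample 1 vs Sample 2": 0.1682,
}
for label, kinship in reported.items():
    print(f"{label}: {kinship} -> {king_degree(kinship)}")
```

By the same cutoffs, the ~0.04 to ~0.08 range for Sample 1 against everyone else sits around the third-degree band, which is not plausible for an entire multi-center cohort and is consistent with the pattern persisting.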


Hi. Were you able to find out what was causing this? I have a similar issue. I have a total of 5 plates jointly called in GenomeStudio. 14 samples from one plate have a PIHAT of 0.3-0.5 with all samples in the cohort, and 4 samples from another plate give the same result. I'm trying to investigate the cause.
