Question

IBD calculation indicates cryptic relatedness of 1 Individual with all others within a sequencing cohort

2

6.6 years ago

maegsul ▴ 170

I know it is not ideal, but I am trying to detect any cryptic relatedness in my WES data (n ≈ 350 individuals). To do so, I first QC the WES data in PLINK with MAF > 0.01 and call rate > 98% filters, which leaves ~200K high-quality variants for analysis (again, not ideal for assessing relatedness, but I think it should still give a rough idea). I then calculate IBD estimates in PLINK with the --genome option.
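For readers who want to reproduce the two steps, here is a minimal sketch that only assembles the PLINK 1.9 command lines. The file prefixes are hypothetical, and it assumes the 98% call rate is a per-variant filter (--geno 0.02); if it was applied per sample, --mind 0.02 would be the corresponding flag. Note that --maf 0.01 keeps variants with MAF >= 0.01, so it only approximates the "MAF > 0.01" filter.

```python
import shlex


def build_plink_commands(bfile, out_prefix, maf=0.01, max_missing=0.02):
    """Return (qc_cmd, ibd_cmd) as argument lists for PLINK 1.9.

    bfile and out_prefix are hypothetical file prefixes.
    max_missing=0.02 corresponds to a 98% per-variant call rate (assumption).
    """
    qc_prefix = f"{out_prefix}_qc"
    # Step 1: variant-level QC, written out as a new binary fileset.
    qc_cmd = [
        "plink", "--bfile", bfile,
        "--maf", str(maf),
        "--geno", str(max_missing),
        "--make-bed", "--out", qc_prefix,
    ]
    # Step 2: pairwise IBD estimation on the filtered fileset (writes a .genome file).
    ibd_cmd = [
        "plink", "--bfile", qc_prefix,
        "--genome", "--out", f"{out_prefix}_ibd",
    ]
    return qc_cmd, ibd_cmd


# Print the commands so they can be inspected before running them.
for cmd in build_plink_commands("wes_cohort", "wes_cohort"):
    print(shlex.join(cmd))
```

The lists can be passed to subprocess.run(cmd, check=True) once the prefixes point at real files.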

We already identified a first-degree relative pair with this approach (PI-HAT = 0.58, confirmed by fingerprinting [i.e. high similarity between the lengths of STR markers across the genome, done using stock sample tubes] and by clinical files), and new sequencing results indicate another pair (PI-HAT = 0.5). The interesting thing, however, is that I have two samples that appear to be related to every other individual in the cohort. I should point out that the cohort comes from multiple centers all over Europe and the individuals are presumably unrelated, so this observation must be a false positive.

Sample 1 -> Related to 346 others with 0.25 < PI-HAT < 0.38 scores.
Sample 2 -> Related to 346 others with 0.12 < PI-HAT < 0.22 scores.

All other pairs have PI-HAT < 0.1.
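A pattern like this can be pulled out of the .genome file programmatically. The sketch below counts, for each sample, how many partners it has at or above a PI_HAT cutoff and flags samples that appear related to most of the cohort. The function names and the cutoffs (0.1 and 50% of the cohort) are illustrative assumptions; the column names IID1, IID2 and PI_HAT are those of the PLINK 1.9 .genome header.

```python
from collections import defaultdict


def count_related_partners(genome_lines, pi_hat_min=0.1):
    """Count, per sample, the pairs with PI_HAT >= pi_hat_min.

    genome_lines: iterable of lines from a PLINK .genome file (header first).
    """
    lines = iter(genome_lines)
    header = next(lines).split()
    i1 = header.index("IID1")
    i2 = header.index("IID2")
    ip = header.index("PI_HAT")
    partners = defaultdict(int)
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if float(fields[ip]) >= pi_hat_min:
            partners[fields[i1]] += 1
            partners[fields[i2]] += 1
    return dict(partners)


def flag_related_to_most(partners, n_samples, min_fraction=0.5):
    """Return samples related to at least min_fraction of the other samples."""
    cutoff = min_fraction * (n_samples - 1)
    return sorted(s for s, n in partners.items() if n >= cutoff)
```

Typical use would be to open the .genome file, pass the file handle to count_related_partners, and then call flag_related_to_most with the cohort size. A sample that shows up here with hundreds of partners is far more likely to be a technical artifact than a true relative of everyone.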

What could be the reason for such an observation? The first thing that comes to mind is of course contamination, but I wouldn't expect that, since these samples were run in batches of 12 and their libraries were prepared independently as well.

TLDR: In my whole-exome sequencing data, I find one individual with cryptic relatedness to all other 346 participants; what could be the reason?

WES IBD Plink Relatedness • 2.5k views

score 3 · Answer 1 · 2018-05-08

3

6.6 years ago

chrchang523 11k

In my experience, this is usually contamination or a similar sequencing problem. I'd recheck the relatedness estimates with plink 2.0 --make-king-table, and then throw out (or resequence, if possible) the samples if the pattern remains.
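One complementary check for contamination, not mentioned above and offered here only as a sketch, is to look for excess heterozygosity: a sample mixed with DNA from another individual tends to show more heterozygous calls than expected, which appears as a strongly negative F in the output of PLINK 1.9 --het. The code below reads the IID and F columns of a .het file and flags low-F outliers using a median/MAD rule. The function names and the cutoff of 3 scaled MADs are assumptions, not established thresholds.

```python
import statistics


def parse_het(het_lines):
    """Map sample ID -> inbreeding coefficient F from PLINK .het lines."""
    lines = iter(het_lines)
    header = next(lines).split()
    i_iid = header.index("IID")
    i_f = header.index("F")
    f_by_sample = {}
    for line in lines:
        fields = line.split()
        if fields:
            f_by_sample[fields[i_iid]] = float(fields[i_f])
    return f_by_sample


def flag_excess_heterozygosity(f_by_sample, n_mad=3.0):
    """Flag samples whose F lies far below the cohort median.

    Uses the median absolute deviation (scaled by 1.4826) as a robust
    spread estimate; n_mad is an illustrative cutoff.
    """
    values = list(f_by_sample.values())
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return []  # no spread to measure against
    threshold = median - n_mad * 1.4826 * mad
    return sorted(s for s, f in f_by_sample.items() if f < threshold)
```

If the samples that look related to everyone are also the ones flagged here, contamination becomes the more plausible explanation.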

1

Thank you for your reply! Actually, I had already checked with plink2 --make-king-table too, and I got these:

Kinship score between siblings (confirmed) : 0.29

Kinship score between possible new pair: 0.19

Kinship score between Sample 1 (as in my original post) and all others: varies between ~0.04 and ~0.08

Kinship score between Sample 2 (as in my original post) and all others: <~0.04

Kinship score between Sample 1 and Sample 2: 0.1682! (I hadn't mentioned it, but this pair had PI-HAT 0.38 before.)

I am not familiar with the KING output format; how would you interpret the table? Does it tell us anything new? The pattern seems to remain.
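As a reading aid for these numbers: a kinship coefficient is, in expectation, half of PI-HAT, and the KING documentation suggests the cutoffs 0.354, 0.177, 0.0884 and 0.0442 to separate duplicates/MZ twins, 1st-, 2nd- and 3rd-degree relatives, and unrelated pairs. A minimal sketch of that mapping follows; the function name and labels are illustrative, and the cutoffs are conventional guides rather than hard boundaries.

```python
def kinship_to_degree(kinship):
    """Map a KING kinship coefficient to a conventional relationship class.

    Cutoffs follow the ranges suggested in the KING documentation.
    """
    if kinship > 0.354:
        return "duplicate/MZ twin"
    if kinship > 0.177:
        return "1st degree"
    if kinship > 0.0884:
        return "2nd degree"
    if kinship > 0.0442:
        return "3rd degree"
    return "unrelated"


# Values quoted in the post above.
for label, k in [("siblings", 0.29), ("new pair", 0.19),
                 ("Sample 1 vs Sample 2", 0.1682)]:
    print(label, k, kinship_to_degree(k))
```

On this scale the confirmed siblings (0.29, i.e. PI-HAT ≈ 0.58) and the new pair (0.19) fall in the 1st-degree range, Sample 1 vs Sample 2 (0.1682) falls in the 2nd-degree range, and Sample 1's values of ~0.04 to ~0.08 against everyone else sit around the 3rd-degree range, which is not plausible for a multi-center cohort and is consistent with the pattern still being present.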

ADD REPLY • link 6.6 years ago by maegsul ▴ 170

0

Hi. Were you able to find out what was causing this? I have a similar issue. I have a total of 5 plates jointly called in GenomeStudio. 14 samples from one plate have a PIHAT of 0.3-0.5 with all samples in the cohort, and 4 samples from another plate give the same result. I'm trying to investigate the cause.

ADD REPLY • link 3.8 years ago by samreen.jasvi • 0