Question

KING struggle: Relatedness

3.4 years ago

German.M.Demidov ★ 2.9k

I am struggling to infer relationships in a dataset of 20K exomes sequenced with tens of different kits.

First, I found a well-covered union of regions. Check.

Second, I merged the 20K VCFs into one and removed indels and multi-allelic variants. Check.
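To make the variant filter concrete, here is a minimal sketch in plain Python of "keep only biallelic SNPs" applied to VCF text lines. It is an illustration under assumptions, not the pipeline actually used for this dataset: the function names `is_biallelic_snp` and `filter_vcf_lines` are hypothetical, and a dedicated VCF tool would normally do this step.

```python
def is_biallelic_snp(vcf_line: str) -> bool:
    """Return True if a VCF data line has a single-base REF and one single-base ALT."""
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 5:
        return False  # not a well-formed data line
    ref, alt = fields[3], fields[4]
    bases = {"A", "C", "G", "T"}
    # A multi-allelic ALT such as "G,T" or an indel such as "AT" is not a single base.
    return ref.upper() in bases and alt.upper() in bases


def filter_vcf_lines(lines):
    """Yield header lines unchanged and only those records that are biallelic SNPs."""
    for line in lines:
        if line.startswith("#") or is_biallelic_snp(line):
            yield line


example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\n",
    "1\t100\t.\tA\tG\t.\n",    # biallelic SNP: kept
    "1\t200\t.\tA\tG,T\t.\n",  # multi-allelic: dropped
    "1\t300\t.\tAT\tA\t.\n",   # deletion: dropped
]
kept = list(filter_vcf_lines(example))
```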

Still, when I run KING with the --kinship option, it finds a lot of relatives. But I need KING with the --related option, with its IBD1 and IBD2 estimates. And there I get 0 first-degree and 0 second-degree relatives (though still some MZ pairs).
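For context on how the kinship-only calls are binned, the sketch below maps a kinship coefficient to a relationship degree using the cut-offs given in the KING documentation (0.354, 0.177, 0.0884, 0.0442). The function name `degree_from_kinship` is hypothetical, and treating each boundary as strictly greater-than is an assumption. It does not reproduce the extra IBD-segment criteria that --related applies.

```python
def degree_from_kinship(phi: float) -> str:
    """Bin a kinship coefficient into a relationship degree (kinship cut-offs only)."""
    if phi > 0.354:
        return "MZ/duplicate"
    if phi > 0.177:
        return "1st"
    if phi > 0.0884:
        return "2nd"
    if phi > 0.0442:
        return "3rd"
    return "OTHER"


# Theoretical kinship values: 0.5 (MZ), 0.25 (PO/FS), 0.125 (2nd), 0.0625 (3rd).
labels = [degree_from_kinship(p) for p in (0.5, 0.25, 0.125, 0.0625, 0.0)]
```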

This basically means that the IBD segments cannot be inferred, and I think this is due to samples that failed QC.

Is there a procedure for automated QC here? Or do I need to run a PCA and repeat a "remove outliers - rebuild PCA - remove outliers" loop until no outliers remain? Is there any other reason why KING might behave so badly for me?
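The iterative PCA loop described above could look roughly like the following NumPy sketch. Everything in it is an assumption made for illustration: the function name `iterative_pca_outliers`, the number of PCs, the z-score cut-off of 6, and the input being a complete samples-by-variants matrix of 0/1/2 genotypes with no missing values. It is not a validated QC procedure for exomes.

```python
import numpy as np


def iterative_pca_outliers(genotypes, n_pcs=4, z_cutoff=6.0, max_iter=10):
    """Return a boolean mask of samples kept after repeated PCA outlier removal."""
    genotypes = np.asarray(genotypes, dtype=float)
    keep = np.ones(genotypes.shape[0], dtype=bool)
    for _ in range(max_iter):
        sub = genotypes[keep]
        centered = sub - sub.mean(axis=0)          # center each variant
        u, s, _ = np.linalg.svd(centered, full_matrices=False)
        pcs = u[:, :n_pcs] * s[:n_pcs]             # sample scores on the top PCs
        sd = pcs.std(axis=0)
        sd[sd == 0] = 1.0                          # guard against constant PCs
        z = np.abs(pcs / sd)                       # scores already have zero mean
        outlier = (z > z_cutoff).any(axis=1)
        if not outlier.any():
            break                                  # converged: no outliers left
        kept_idx = np.flatnonzero(keep)
        keep[kept_idx[outlier]] = False            # drop outliers, then rebuild PCA
    return keep


# Simulated data: 200 samples, 50 variants, with sample 0 made artificially extreme.
rng = np.random.default_rng(0)
geno = rng.binomial(2, 0.1, size=(200, 50))
geno[0, :] = 2
mask = iterative_pca_outliers(geno)
```

The mask can then be used to write a list of sample IDs to exclude before rerunning the relatedness step.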

Some data to give an idea (toy dataset of 3K exomes):

KING with --related:

  Source        MZ      PO      FS      2nd     3rd     OTHER
  ===========================================================
  Pedigree      0       0       0       0       0       5512860
  Inference     30      0       0       3       8       5512819

KING with --kinship:

  Source        MZ      PO      FS      2nd     3rd     OTHER
  ===========================================================
  Pedigree      0       0       0       0       0       5512860
  Inference     30      58      463     19      895     5511395
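As a sanity check on the two summaries, the short sketch below confirms that both Inference rows sum to the same number of pairs and recovers the sample count implied by that total via n(n-1)/2. The helper name `samples_from_pairs` is hypothetical; the counts are copied from the tables above.

```python
from math import comb, isqrt


def samples_from_pairs(n_pairs: int) -> int:
    """Invert n_pairs = n * (n - 1) / 2 for a whole number of samples n."""
    n = (1 + isqrt(1 + 8 * n_pairs)) // 2
    if comb(n, 2) != n_pairs:
        raise ValueError("pair count is not of the form n*(n-1)/2")
    return n


total_pairs = 5512860
related_row = [30, 0, 0, 3, 8, 5512819]       # MZ, PO, FS, 2nd, 3rd, OTHER
kinship_row = [30, 58, 463, 19, 895, 5511395]

n_samples = samples_from_pairs(total_pairs)    # 3321 samples
related_total = sum(related_row)
kinship_total = sum(kinship_row)
```

Both rows account for every pair, so the two runs differ only in how pairs are classified, not in which samples were included.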

I was able to perform relatedness inference on a 10K dataset (a subset of this one) a year ago. I have no idea what is different now, except that this time no one filtered out the samples that failed QC. I simply ran the same makefile.

king relatedness • 766 views
