Selecting pedigree informative SNPs
5.9 years ago
efountain ▴ 20

Hi, I have a .vcf containing 1 million SNPs for a subset of a population (80 individuals). I would like to select SNPs that are informative for building a pedigree and then redesign a probe set to target those SNPs in the rest of the population.

I am not quite sure how to determine which of the 1 million SNPs are actually kinship-informative. I know the parent-offspring and full/half-sib relationships for this subset of individuals, but I will need to build a pedigree out to 4th-degree relatives for the entire population. Does anyone have suggestions for a good approach to selecting these SNPs? Would it be similar to how I would select ancestry-informative SNPs (which I have already mined from this dataset)?

Thanks!

5.9 years ago
Vitis ★ 2.6k

I'd suggest LD pruning or clumping (for example with PLINK) to retain approximately unlinked SNPs, followed by estimating IBD (identity by descent), also in PLINK. Pruning essentially picks representative SNPs from each LD block, so they capture the genotype information for the whole block while reducing the amount of variant data to analyze.

https://privefl.github.io/bigsnpr/articles/pruning-vs-clumping.html

https://www.cog-genomics.org/plink/1.9/ld#indep

https://www.cog-genomics.org/plink2/ibd
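
As a rough sketch of the two steps above, the snippet below only assembles the PLINK 1.9 command lines (it does not run them). The file prefix `mydata`, the output prefix `kin`, and the pruning thresholds (50-SNP window, step of 5, r² of 0.2) are assumptions for illustration, not recommendations for this dataset; `--file` expects `.ped`/`.map` input.

```python
import shlex


def build_plink_commands(prefix, out="kin", window=50, step=5, r2=0.2):
    """Return (prune_cmd, ibd_cmd) as argument lists for PLINK 1.9.

    prefix : basename of the .ped/.map pair (hypothetical name)
    out    : basename for PLINK output files (hypothetical name)
    window, step, r2 : --indep-pairwise settings (illustrative values)
    """
    # Step 1: LD pruning. PLINK writes the retained SNP IDs to <out>.prune.in
    prune_cmd = [
        "plink", "--file", prefix,
        "--indep-pairwise", str(window), str(step), str(r2),
        "--out", out,
    ]
    # Step 2: pairwise IBD estimation on the pruned SNPs only.
    # PLINK writes the results (Z0, Z1, Z2, PI_HAT, ...) to <out>.genome
    ibd_cmd = [
        "plink", "--file", prefix,
        "--extract", f"{out}.prune.in",
        "--genome",
        "--out", out,
    ]
    return prune_cmd, ibd_cmd


prune_cmd, ibd_cmd = build_plink_commands("mydata")
print(shlex.join(prune_cmd))
print(shlex.join(ibd_cmd))
```

The printed lines can be pasted into a shell, or each list can be passed to `subprocess.run` once PLINK is installed.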

This yields pairwise relatedness (kinship) estimates. See this related post:

How Large Would Inbreeding Coefficient Be To Be Anomalous?

Keep in mind that these relatedness estimates can differ from the pedigree/parentage relationships you're after, because they are also affected by population history, such as genetic drift in small populations and bottlenecks.
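
To make the link between a relatedness estimate and a pedigree degree concrete, here is a minimal sketch that bins a PI_HAT value (the proportion of the genome shared IBD) into a degree of relatedness. It assumes an outbred population, where the expected PI_HAT for degree d is 0.5^d, and it places the bin boundaries at the geometric midpoints between those expectations; both the function names and the cutoffs are illustrative assumptions. In an inbred population the estimates are likely to be inflated, so these bins would need recalibrating against the known relationships.

```python
def expected_pi_hat(degree):
    """Expected proportion IBD for a given degree in an outbred pedigree."""
    return 0.5 ** degree


def classify_degree(pi_hat, max_degree=4):
    """Bin a PI_HAT value into a relatedness degree.

    Returns 0 for duplicates/identical twins, 1..max_degree for relatives,
    and None when the pair looks unrelated at this resolution.
    Boundaries are geometric midpoints between expected values (assumption).
    """
    if pi_hat > 2 ** -0.5:           # closer to 1.0 than to 0.5
        return 0
    for degree in range(1, max_degree + 1):
        lower = 2 ** -(degree + 0.5)  # midpoint between degree and degree + 1
        if pi_hat > lower:
            return degree
    return None


for value in (0.5, 0.25, 0.125, 0.0625, 0.01):
    print(value, classify_degree(value))
```

Parent-offspring and full-sib pairs both fall into degree 1 here; separating them needs the Z0/Z1/Z2 columns as well, since parent-offspring pairs are expected to share exactly one allele IBD almost everywhere.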


Thank you, this is very helpful. I have the .ped and .map files already and will start to learn to do this in PLINK today.

One of my concerns is exactly what you mentioned: that the relatedness measures may differ from the true pedigree relationships. We know we are dealing with a high level of inbreeding, so it will be interesting to see what the PLINK analysis gives us.
