Question

IBD Calculation For Small Familial Pedigrees


12.7 years ago

michealsmith ▴ 800

Popular programs for IBD calculation from genotyping or sequence data include Plink, Beagle, Germline... I haven't read all of the papers and am a total newcomer to this field. It seems to me that most of these programs are designed for large-scale studies, say thousands of samples in a GWAS. I'm wondering whether they are suitable for genotyping/sequence data from small family pedigrees with, say, fewer than ten individuals.

For example, Beagle-fastIBD uses haplotype frequencies; does this mean it's only suitable for large-scale data?

thanks



score 1 · Answer 1 · 2012-11-20

IBD calculations that rely on allele or haplotype frequencies may not be accurate unless you have a large group, though I am not entirely sure how much this matters. With only a small pedigree, you can assume that most of the IBS (identity by state) is by descent. You can therefore use a visual tool like SNPduo to look at pairwise comparisons between your individuals.
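To make the pairwise IBS comparison concrete, here is a minimal Python sketch (not SNPduo's code). It assumes biallelic SNPs coded as 0/1/2 copies of one allele, with `None` for missing calls; the function name `ibs_proportions` and the toy trio genotypes are made up for illustration.

```python
from itertools import combinations


def ibs_proportions(g1, g2):
    """Return the fraction of SNPs at which two individuals share 0, 1 or 2 alleles IBS.

    Genotypes are assumed to be coded 0/1/2 (copies of one allele); None marks a missing call.
    """
    counts = [0, 0, 0]  # counts[k] = number of SNPs with k alleles shared by state
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue  # skip SNPs where either genotype is missing
        counts[2 - abs(a - b)] += 1
    total = sum(counts)
    if total == 0:
        return (0.0, 0.0, 0.0)
    return tuple(c / total for c in counts)


# Hypothetical toy trio (six SNPs, Mendelian-consistent), for illustration only.
genotypes = {
    "father": [0, 1, 2, 1, 0, 2],
    "mother": [2, 1, 0, 1, 1, 2],
    "child":  [1, 1, 1, 2, 0, 2],
}

for (name1, g1), (name2, g2) in combinations(genotypes.items(), 2):
    ibs0, ibs1, ibs2 = ibs_proportions(g1, g2)
    print(f"{name1}-{name2}: IBS0={ibs0:.2f} IBS1={ibs1:.2f} IBS2={ibs2:.2f}")
```

In this toy data the parent-child pairs never show IBS0, which is what you expect for a true parent-child relationship in the absence of genotyping errors, while the two (presumably unrelated) parents do.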

score 1 · Answer 2 · 2012-11-22

Hi

IBD estimation for small families can work very well IF the founders are genotyped. If they are typed, then you do not need frequencies from the general population. Moreover, keep in mind that most (not all) algorithms and programs assume linkage equilibrium between SNPs.

The programs you list are designed for datasets with no explicit familial relationships; in fact, I think most if not all of them do not use these relationships in their calculations. In this case, you need to estimate an average IBD over the whole population, and you are better off with a larger dataset because you can estimate allele frequencies, which in turn greatly improves IBD estimation.
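A rough Python sketch of why allele frequencies matter for population-based estimators: under Hardy-Weinberg equilibrium, two unrelated individuals are IBS0 at a biallelic SNP with allele frequency p with probability 2p²(1-p)², so a method-of-moments style estimate of the "sharing nothing IBD" proportion divides the observed IBS0 count by its expectation. This is a deliberately simplified illustration, not the exact estimator of any of the programs mentioned (real implementations add corrections and bound the result); the names `expected_ibs0_unrelated` and `naive_z0` and the toy genotypes are assumptions for the example.

```python
def expected_ibs0_unrelated(freqs):
    """Expected number of IBS0 SNPs for two unrelated individuals under HWE.

    At a biallelic SNP with allele frequency p, one individual is AA and the
    other BB (in either order) with probability 2 * p^2 * (1 - p)^2.
    """
    return sum(2 * p**2 * (1 - p) ** 2 for p in freqs)


def naive_z0(g1, g2, freqs):
    """Simplified estimate of P(IBD = 0): observed IBS0 count over its expectation.

    Genotypes are assumed coded 0/1/2; the ratio is left unbounded on purpose
    so the effect of the assumed frequencies is visible.
    """
    observed = sum(1 for a, b in zip(g1, g2) if abs(a - b) == 2)
    expected = expected_ibs0_unrelated(freqs)
    if expected == 0:
        raise ValueError("allele frequencies give zero expected IBS0")
    return observed / expected


# Hypothetical pair typed at 16 SNPs with exactly one opposite-homozygote site.
ind1 = [0] + [1] * 15
ind2 = [2] + [1] * 15

z0_true_freqs = naive_z0(ind1, ind2, [0.5] * 16)   # frequencies assumed correct
z0_wrong_freqs = naive_z0(ind1, ind2, [0.3] * 16)  # same data, misspecified frequencies
print(z0_true_freqs, z0_wrong_freqs)
```

The same pair of genotypes yields a different estimate depending only on the frequencies plugged in, which is why a handful of related individuals is a poor basis for estimating those frequencies.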

Programs like IBDLD can take training sets as input (typically 1000 Genomes individuals typed for the same SNPs as your families) to infer allele frequencies and LD and apply them to your data. This can be a solution, provided your 1000 Genomes individuals are ethnically close enough to your family members. The IBDLD method can use both explicit and estimated familial relationships. I am not advertising this method, but I find it to be pretty flexible.

If you would like a more detailed answer, could you please describe more fully what you'd like to do? Do you have genome-wide data? Do you want to use all the SNPs (including SNPs in LD)? Do you want to estimate IBD within families, or also between individuals from different families?

Ram · Answer 3 · 2015-05-14


10.3 years ago

piper.below ▴ 10

https://primus.gs.washington.edu/primusweb/

PRIMUS might be helpful: it uses PCA to assign reference population allele frequencies before calculating IBD proportions for you in PLINK. It then reconstructs all possible pedigrees that match your data.
