2.4 years ago
Pac314
Should multi-allelic sites be removed from VCF files before performing IBD detection with unphased IBD detection tools such as truffle/IBIS?
That depends. Are multi-allelic sites irrelevant to your research question?
Will they make the IBD calculation harder to implement? For some tools, yes. But should you remove them? That depends on your research question.
If you are positive they don't add information to the analysis, you can drop them.
If they might add information, you have to weigh how important they are against how much they will complicate things. If you remove them, you have to state that in any publication.
My general view is that bioinformatics should adapt to biology; biology should not be truncated for bioinformatics.
So I would favor finding an IBD metric that can account for these (and copy number variation!), not removing your own data.
Thank you for your reply. I am not interested in multi-allelic sites specifically; I am trying to identify IBD regions in familial variant data and use those regions to screen for rare causal variants. When running IBIS, I get multiple warnings about identical SNP positions, which correspond to multi-allelic sites, so I was wondering whether to exclude these SNPs before running IBIS.
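If the decision is to exclude them, here is a minimal sketch of what that filtering could look like, assuming an uncompressed, tab-delimited VCF. The function name `drop_multiallelic` and the toy records are made up for illustration. It drops records with more than one ALT allele, and also every record whose CHROM/POS occurs more than once, on the assumption that those are multi-allelic sites that were split into separate lines (this would also drop, for example, a SNP and an indel that happen to share a position).

```python
from collections import Counter


def drop_multiallelic(vcf_lines):
    """Return VCF lines without multi-allelic or duplicate-position records.

    Hypothetical helper for illustration; header lines are kept unchanged.
    """
    lines = [ln.rstrip("\n") for ln in vcf_lines]

    # First pass: count how many records share each (CHROM, POS).
    counts = Counter()
    for ln in lines:
        if ln and not ln.startswith("#"):
            chrom, pos = ln.split("\t", 2)[:2]
            counts[(chrom, pos)] += 1

    # Second pass: keep headers and records that are unique and biallelic.
    kept = []
    for ln in lines:
        if not ln:
            continue
        if ln.startswith("#"):
            kept.append(ln)
            continue
        fields = ln.split("\t")
        chrom, pos, alt = fields[0], fields[1], fields[4]
        if "," in alt:
            continue  # several ALT alleles on one line
        if counts[(chrom, pos)] > 1:
            continue  # position repeated (assumed split multi-allelic site)
        kept.append(ln)
    return kept


# Toy input: position 200 is multi-allelic, position 300 appears twice.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1",
    "1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0/1",
    "1\t200\t.\tC\tT,G\t.\tPASS\t.\tGT\t1/2",
    "1\t300\t.\tG\tA\t.\tPASS\t.\tGT\t0/1",
    "1\t300\t.\tG\tC\t.\tPASS\t.\tGT\t0/0",
    "1\t400\t.\tT\tC\t.\tPASS\t.\tGT\t1/1",
]
filtered = drop_multiallelic(example)
kept_positions = [ln.split("\t")[1] for ln in filtered if not ln.startswith("#")]
```

With a real file you would pass the open file handle and write the returned lines back out before converting to the input format your IBD tool expects. As noted above, the removal should be reported in any publication.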