Hi all,
I have a dataset of around 20,000 exomes and I need a fast, efficient tool to check for relatedness across all of my samples. I have tried "relatedness2" from vcftools, but it is very slow. It is also pairwise, so it does not scale well to this many samples.
We use multi-sample vcf.gz files. Does anyone know of software (possibly C++-based) to do this? According to 23andMe, only around 1000 SNVs are needed to find related samples efficiently. We are looking for a tool that lets us find, in one run if possible, any samples related up to second degree as well as unknown duplicate samples. I have done a pretty thorough search but have not found anything.
Can anyone help?
Many thanks in advance
Hi Shicheng Guo, Where can I get these tag-SNPs for both WES and WGS? I presume it would be based on LD scores < 0.2 or something like that? Thanks for your help.