Hello!
I have a regular multi-sample VCF from which I am looking to find individuals that have >1 HET call in the same gene. The VCF is chunked, so it is relatively small (the gene of interest is within the chunk).
Does anyone have a one-liner for such a task with bcftools or even R?
Many thanks
VCF format does not work that way. How did you get to this from a real valid VCF?
Yup - it's a multi-sample VCF chunked by region of the genome.
Yes to what, exactly?
That is not the same thing as saying "multiple individuals merged into one". Please be clear - is it a multi-sample VCF or some sort of mutated file with multiple GTs per sample?
Dear RamRS,
It is a multi-sample VCF. Whilst it may contain mutations, I can assure you it has not been mutated.
Thank you for clarifying that. Your initial statement "I have a vcf with multiple individuals merged into one across a single gene" is quite confusing as individuals are not "merged", they are merely part of the VCF, and I'm also assuming loci are not merged (the "across a single gene" part). What you have is a regular VCF from which you are looking to find individuals that have >1 HET call in the same gene.
Correct! Apologies if that was too vague. I have amended my question accordingly.
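Now that the task is pinned down (individuals with >1 HET call in the same gene), here is a minimal sketch in plain Python rather than bcftools or R. It is not a one-liner and makes a few assumptions: the chunk contains only the gene of interest (otherwise subset to the gene's coordinates first), a call counts as HET when it has at least two called alleles that differ (so `0/1`, `0|1` and `1/2` count, while `./1` and `./.` do not), and no filtering on FILTER or quality is applied. The function names `is_het`, `count_hets` and `samples_with_multiple_hets` are made up for illustration, and the demo records are invented.

```python
from collections import Counter


def is_het(gt):
    """Return True if a GT string has >= 2 called alleles that are not all equal."""
    alleles = gt.replace("|", "/").split("/")
    called = [a for a in alleles if a != "."]
    return len(called) >= 2 and len(set(called)) > 1


def count_hets(lines):
    """Count HET calls per sample over an iterable of VCF text lines."""
    samples = []
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information headers
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names start at column 10
            continue
        fmt = fields[8].split(":")
        if "GT" not in fmt:
            continue  # record carries no genotype field
        gt_idx = fmt.index("GT")
        for sample, call in zip(samples, fields[9:]):
            parts = call.split(":")
            if gt_idx < len(parts) and is_het(parts[gt_idx]):
                counts[sample] += 1
    return counts


def samples_with_multiple_hets(lines, min_hets=2):
    """Return sorted sample names with at least min_hets HET calls."""
    counts = count_hets(lines)
    return sorted(s for s, n in counts.items() if n >= min_hets)


# Invented three-sample example, purely for demonstration.
DEMO_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3",
    "1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0/1\t0/0\t1/1",
    "1\t200\t.\tC\tT\t.\tPASS\t.\tGT\t0|1\t0/1\t./.",
    "1\t300\t.\tG\tA,C\t.\tPASS\t.\tGT:DP\t1/2:10\t0/0:8\t0/1:5",
]

print(samples_with_multiple_hets(DEMO_VCF))  # ['S1']
```

On a real file you would pass an open file handle instead of the demo list, e.g. `gzip.open("chunk.vcf.gz", "rt")` for a compressed chunk (the file name is a placeholder). Note that this only counts HET sites per sample; it says nothing about whether the variants are in trans, which would need phased genotypes or parental data.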