Hi all,
I want to filter a multi-sample VCF based on the Allele Balance (AB) value of heterozygous calls only. The goal is to keep heterozygous variants whose AB value is between 0.25 and 0.75. Homozygous calls should also be kept in the VCF, with no filtering applied to them based on any tag.
I don't know how to filter a VCF based on the criteria above.
For the sake of brevity, I have posted just one example line from my input VCF.
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample1 Sample2
chr1 87083 . G T 69.7403 . AB=0.235294;ABP=13.3567;AC=1; GT:DP:RO:QR:AO:QA:GL 0/0:16:16:582:0:0:0,-4.81648,-52.7438 0/0:26:26:953:0:0:0,-7.82678,-86.1365
I'll be thankful for any help in this regard.
Aisha
Try the GATK VariantFiltration walker; an example of filtering by AB is provided on its manual page (https://software.broadinstitute.org/gatk/documentation/tooldocs/3.8-0/org_broadinstitute_gatk_tools_walkers_filters_VariantFiltration.php). You can also use SnpSift or bcftools to filter on FORMAT fields. What are the criteria for homozygous and heterozygous calls in your VCF?
Yes, I've tried bcftools filtering on Allele Balance (AB), but I'm unable to get the desired output described above. Calls with GT 0/0 or 1/1 are homozygous, while those with GT 0/1 or 1/0 are heterozygous.
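In case it helps, here is a minimal Python sketch of one possible reading of these criteria. It is not an established tool, and all function names (`is_het`, `sample_allele_balance`, `keep_record`, `filter_lines`) are made up for illustration. It assumes a tab-delimited VCF with the `RO` and `AO` FORMAT fields shown in the example line, biallelic sites, and a per-sample allele balance defined as AO / (RO + AO). Because the 0.25 to 0.75 window is symmetric around 0.5, using the reference fraction instead would give the same result at biallelic sites. It also assumes that a site is dropped as soon as one heterozygous call falls outside the window, which is only one of several reasonable policies.

```python
def is_het(gt):
    """True for a diploid heterozygous genotype such as 0/1 or 1|0."""
    alleles = gt.replace("|", "/").split("/")
    return len(alleles) == 2 and "." not in alleles and alleles[0] != alleles[1]


def sample_allele_balance(fmt_keys, sample_field):
    """Per-sample balance AO / (RO + AO); None if it cannot be computed."""
    values = dict(zip(fmt_keys, sample_field.split(":")))
    try:
        ro = int(values["RO"])
        # AO may be comma-separated at multi-allelic sites; sum as a simplification.
        ao = sum(int(x) for x in values["AO"].split(","))
    except (KeyError, ValueError):
        return None
    total = ro + ao
    return ao / total if total > 0 else None


def keep_record(line, low=0.25, high=0.75):
    """Keep a site unless a heterozygous call has balance outside [low, high]."""
    fields = line.rstrip("\n").split("\t")
    fmt_keys = fields[8].split(":")
    for sample_field in fields[9:]:
        gt = sample_field.split(":")[0]  # GT is the first FORMAT key when present
        if not is_het(gt):
            continue  # homozygous or missing calls are never filtered
        ab = sample_allele_balance(fmt_keys, sample_field)
        if ab is None or not (low <= ab <= high):
            return False
    return True


def filter_lines(lines, low=0.25, high=0.75):
    """Yield header lines unchanged and only the records that pass keep_record."""
    for line in lines:
        if line.startswith("#") or keep_record(line, low, high):
            yield line
```

With this sketch, the example line from the question is kept, because both samples are 0/0 and homozygous calls are never tested. To run it over a file, pass an open file handle to `filter_lines` and write out the lines it yields.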
duplicate: Allele Depth (AD) / Allele Balance (AB) Filtering in GATK 4
Hi @Pierre!
In that post, the solution you mention is based on AD, not AB. That's why I had to open another question.
I see. Your question is not clear to me. You have an AB value in the INFO column, but I don't understand why you say: "homozygous calls will also be kept in vcf"
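To make the ambiguity concrete: since AB sits in the INFO column, it is a single site-level value rather than one value per sample. One possible reading of the request, sketched below in Python, is "keep a site if none of its calls is heterozygous; otherwise keep it only if the INFO AB is between 0.25 and 0.75". This is an assumption about the intent, not a confirmed answer, and the names `info_value` and `keep_site` are hypothetical. The sketch also assumes a tab-delimited VCF and drops sites that have a heterozygous call but no AB tag.

```python
def info_value(info, key):
    """Return the string value of KEY=... in a VCF INFO column, or None."""
    for entry in info.split(";"):
        name, _, value = entry.partition("=")
        if name == key:
            return value
    return None


def keep_site(line, low=0.25, high=0.75):
    """Site-level filter: test INFO AB only when some sample is heterozygous."""
    fields = line.rstrip("\n").split("\t")
    genotypes = [s.split(":")[0].replace("|", "/") for s in fields[9:]]
    has_het = any(
        "." not in gt and len(set(gt.split("/"))) > 1 for gt in genotypes
    )
    if not has_het:
        return True  # only homozygous (or missing) calls: no AB filtering
    raw = info_value(fields[7], "AB")
    if raw is None:
        return False  # assumption: a het site without AB cannot be verified
    try:
        # AB may hold one value per ALT allele; require all of them in range.
        values = [float(x) for x in raw.split(",")]
    except ValueError:
        return False
    return all(low <= v <= high for v in values)
```

Under this reading, the example line would be kept even though its AB is 0.235294, because both samples are 0/0. The same line with a 0/1 call would be dropped.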