I have a multisample VCF. Here is an example line:
1 14464 . A T . . ECNT=1;PON;DP=67;MBQ=0,36;MFRL=0,278;MMQ=60,28;MPOS=23;POPAF=0.69;TLOD=29.47 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:0,17:0.947:17:0,9:0,8:0,0,14,3 ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. 0/1:1,25:0.929:26:1,14:0,10:1,0,17,8 ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. 0/1:1,12:0.866:13:0,5:1,6:1,0,5,7 ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. 0/1:0,9:0.912:9:0,4:0,5:0,0,7,2 ./.:.:.:.:.:.:.
What I need is to filter samples based on their alt allele depth (the second value of AD), removing samples with alt AD < 10. In the example above, this would mean removing the 4th called sample (alt AD = 9) and keeping the first 3, giving something like this:
1 14464 . A T . . ECNT=1;PON;DP=67;MBQ=0,36;MFRL=0,278;MMQ=60,28;MPOS=23;POPAF=0.69;TLOD=29.47 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:0,17:0.947:17:0,9:0,8:0,0,14,3 ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. 0/1:1,25:0.929:26:1,14:0,10:1,0,17,8 ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. 0/1:1,12:0.866:13:0,5:1,6:1,0,5,7 ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:. ./.:.:.:.:.:.:.
Is there any tool available for that? I saw vcffilterjs based on this post, but it works differently: it removes the whole line if no sample passes the filter and keeps it if at least one passes.
Thanks a lot in advance for any help!
How could you remove one or more genotypes while keeping the structure of the VCF?
Yes, I mean: is there no way to remove genotype entries while keeping the structure of the VCF (dropping the line if no entries remain)?
Well, you can reset the genotype to './.', but you cannot remove a genotype: the VCF header with the sample names would become meaningless and broken.
Of course, not removing, sorry; I meant setting it to ./. instead. Is there any tool for that?
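In case it helps, the behaviour discussed in this thread (reset failing genotypes to ./. and drop the line when no called sample is left) can be sketched in a few lines of plain Python. This is only a minimal sketch, not an existing tool: the function names `mask_low_alt_ad` and `filter_vcf` are made up for illustration, and it assumes an uncompressed, tab-delimited VCF whose FORMAT column contains AD with the ref depth first. For multiallelic sites it sums the alt depths, which is my assumption and may not be what you want.

```python
def mask_low_alt_ad(line, min_alt_ad=10):
    """Reset samples with alt AD below min_alt_ad to missing.

    Returns the rewritten data line (without newline), or None when
    no sample passes. Hypothetical helper, not part of any library.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return "\t".join(fields)  # no sample columns, nothing to do
    keys = fields[8].split(":")
    if "AD" not in keys:
        return "\t".join(fields)  # no AD in FORMAT, leave line untouched
    ad_idx = keys.index("AD")
    # Missing entry matching the FORMAT layout, e.g. ./.:.:.:.:.:.:.
    missing = ":".join("./." if k == "GT" else "." for k in keys)

    kept = 0
    for i in range(9, len(fields)):
        values = fields[i].split(":")
        ad = values[ad_idx] if ad_idx < len(values) else "."
        # AD is "ref,alt1[,alt2...]"; skip the ref depth.
        alt_depths = [int(x) for x in ad.split(",")[1:] if x.isdigit()]
        if alt_depths and sum(alt_depths) >= min_alt_ad:
            kept += 1
        else:
            fields[i] = missing
    return "\t".join(fields) if kept else None


def filter_vcf(lines, min_alt_ad=10):
    """Yield header lines unchanged and filtered data lines."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        result = mask_low_alt_ad(line, min_alt_ad)
        if result is not None:
            yield result + "\n"
```

To run it on a file, something like `sys.stdout.writelines(filter_vcf(sys.stdin))` (after `import sys`) should work, with the VCF piped in on standard input. Note that INFO fields such as DP are not recalculated after samples are reset.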