Hello, I have VEP-annotated VCF files with the following content:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT file
chr1 183937 . G A 58.9 PASS CSQ=||||||||||||MODIFIER|FO538757.1|ENSG00000279928|ENST00000624431|unprocessed_pseudogene||4/4|||||;AC=1;AN=2 GT:GQ:DP:AD:VAF:PL 0/1:51:26:15,11:0.423077:58,0,51
chr1 601436 . C T 4.9 PASS CSQ=||||||||||||MODIFIER|AL669831.3|ENSG00000230021|ENST00000634337|processed_transcript|4/5||404||||,||||||||||||MODIFIER|AL669831.3|ENSG00000230021|ENST00000634833|processed_transcript|3/6||317||||;AC=1;AN=2 GT:GQ:DP:AD:VAF:PL 0/1:5:26:19,7:0.269231:3,0,17
I would like to filter out protein-coding variants, but I get the following errors:
bcftools view -f "protein_coding" file > out
[E::bcf_write] Broken VCF record, the number of columns at chrX:152737049 does not match the number of samples (0 vs 1)
[main_vcfview] Error: cannot write to (null)
bcftools filter -i 'BIOTYPE="protein_coding"' file > aaa
[filter.c:2491 filters_init1] Error: the tag "BIOTYPE" is not defined in the VCF header
How should I filter such variants when the biotype is stored inside the pipe-delimited CSQ field rather than in its own INFO tag?
Thank you!
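For context on the two errors: `bcftools view -f` selects on the FILTER column (PASS and so on), not on annotations, and `bcftools filter` only sees INFO tags declared in the header, so a CSQ subfield such as BIOTYPE is invisible to it. One possible workaround is to parse CSQ directly. The sketch below is a minimal, unofficial illustration in Python. It assumes the file still carries the usual VEP header line (`##INFO=<ID=CSQ,... Format: ...|BIOTYPE|...">`), that the subfield is really named `BIOTYPE`, and that columns are tab-separated as in a real VCF. The function names are made up for this example.

```python
def csq_field_names(header_lines):
    """Return the CSQ subfield names declared in the VEP header line."""
    for line in header_lines:
        if line.startswith("##INFO=<ID=CSQ") and "Format: " in line:
            fmt = line.split("Format: ", 1)[1]
            # Drop the closing quote and angle bracket of the header line.
            fmt = fmt.split('"', 1)[0]
            return fmt.strip().split("|")
    raise ValueError("no CSQ Format description found in the header")


def record_biotypes(info, biotype_index):
    """Return the set of biotypes over all CSQ entries in one INFO column."""
    for entry in info.split(";"):
        if entry.startswith("CSQ="):
            biotypes = set()
            # One comma-separated annotation per transcript.
            for annotation in entry[4:].split(","):
                fields = annotation.split("|")
                if biotype_index < len(fields):
                    biotypes.add(fields[biotype_index])
            return biotypes
    return set()


def filter_vcf(lines, wanted="protein_coding"):
    """Yield header lines, plus records with at least one wanted biotype."""
    header = []
    biotype_index = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            header.append(line)
            yield line
            continue
        if biotype_index is None:
            # Assumption: the subfield is called BIOTYPE in the header.
            biotype_index = csq_field_names(header).index("BIOTYPE")
        info = line.split("\t")[7]  # INFO is the 8th VCF column
        if wanted in record_biotypes(info, biotype_index):
            yield line
```

To use it, pass an open file handle to `filter_vcf` and write each yielded line to the output. To drop protein-coding records instead of keeping them, invert the final test. If the installed bcftools ships the `split-vep` plugin, it is designed to expose CSQ subfields such as BIOTYPE to filtering expressions and may be the simpler route; check its documentation for the exact options.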
Hello @storm1907, you have asked many questions in this community, which is totally fine. However, almost none of the answers to those questions have received an upvote or been marked as accepted. Please take the time to acknowledge the effort users have invested by upvoting helpful answers and comments. If an answer solved the issue, please accept it.