4.9 years ago
Stephane Plaisance
Below is an excerpt of my VCF. I am trying to extract the rows with SUPPORT>10 but cannot get the filter expression right.
I tried: bcftools filter -sFilterName -e 'INFO/SUPPORT>10' variants.vcf
(typo edited)
The result is no filtering: all of the input goes through.
I could not find a page with working examples.
I am using bcftools 1.9.
##fileformat=VCFv4.2
##fileDate=2020-06-26|04:40:51PM|CEST|+0200
##source=SVIM-v1.4.0
##contig=<ID=CA_Cp,length=155185>
##ALT=<ID=DEL,Description="Deletion">
##ALT=<ID=INV,Description="Inversion">
##ALT=<ID=DUP,Description="Duplication">
##ALT=<ID=DUP:TANDEM,Description="Tandem Duplication">
##ALT=<ID=DUP:INT,Description="Interspersed Duplication">
##ALT=<ID=INS,Description="Insertion">
##ALT=<ID=BND,Description="Breakend">
##INFO=<ID=SVTYPE,Number=1,Type=String,Description="Type of structural variant">
##INFO=<ID=CUTPASTE,Number=0,Type=Flag,Description="Genomic origin of interspersed duplication seems to be deleted">
##INFO=<ID=END,Number=1,Type=Integer,Description="End position of the variant described in this record">
##INFO=<ID=SVLEN,Number=1,Type=Integer,Description="Difference in length between REF and ALT alleles">
##INFO=<ID=SUPPORT,Number=1,Type=Integer,Description="Number of reads supporting this variant">
##INFO=<ID=STD_SPAN,Number=1,Type=Float,Description="Standard deviation in span of merged SV signatures">
##INFO=<ID=STD_POS,Number=1,Type=Float,Description="Standard deviation in position of merged SV signatures">
##INFO=<ID=STD_POS1,Number=1,Type=Float,Description="Standard deviation of breakend 1 position">
##INFO=<ID=STD_POS2,Number=1,Type=Float,Description="Standard deviation of breakend 2 position">
##FILTER=<ID=hom_ref,Description="Genotype is homozygous reference">
##FILTER=<ID=not_fully_covered,Description="Tandem duplication is not fully covered by a single read">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read depth">
##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Read depth for each allele">
##FORMAT=<ID=CN,Number=1,Type=Integer,Description="Copy number of tandem duplication (e.g. 2 for one additional copy)">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample
CA_Cp 0 svim.BND.1 N ]CA_Cp:150583]N 1 PASS SVTYPE=BND;SUPPORT=1;STD_POS1=.;STD_POS2=. GT:DP:AD ./.:.:.,.
CA_Cp 2 svim.BND.2 N ]CA_Cp:152088]N 3 PASS SVTYPE=BND;SUPPORT=3;STD_POS1=1;STD_POS2=500 GT:DP:AD ./.:.:.,.
CA_Cp 3 svim.INV.1 N <INV> 0 PASS SVTYPE=INV;END=85158;SUPPORT=95;STD_SPAN=3.49;STD_POS=1.44 GT:DP:AD ./.:.:.,.
CA_Cp 3 svim.BND.3 N ]CA_Cp:153260]N 4 PASS SVTYPE=BND;SUPPORT=4;STD_POS1=2;STD_POS2=418 GT:DP:AD ./.:.:.,.
CA_Cp 7 svim.BND.4 N ]CA_Cp:154652]N 29 PASS SVTYPE=BND;SUPPORT=27;STD_POS1=26;STD_POS2=309 GT:DP:AD ./.:.:.,.
CA_Cp 7 svim.BND.5 N ]CA_Cp:155004]N 89 PASS SVTYPE=BND;SUPPORT=87;STD_POS1=17;STD_POS2=268 GT:DP:AD ./.:.:.,.
CA_Cp 7 svim.DUP_TANDEM.1 N <DUP:TANDEM> 1 not_fully_covered SVTYPE=DUP:TANDEM;END=87179;SVLEN=87172;SUPPORT=1;STD_SPAN=.;STD_POS=. GT:CN:DP:AD ./.:2:.:.,.
CA_Cp 8 svim.BND.6 N ]CA_Cp:154580]N 27 PASS SVTYPE=BND;SUPPORT=25;STD_POS1=26;STD_POS2=309 GT:DP:AD ./.:.:.,.
CA_Cp 8 svim.BND.7 N ]CA_Cp:155166]N 94 PASS SVTYPE=BND;SUPPORT=985;STD_POS1=17;STD_POS2=88 GT:DP:AD ./.:.:.,.
CA_Cp 11 svim.BND.8 N ]CA_Cp:154272]N 15 PASS SVTYPE=BND;SUPPORT=14;STD_POS1=35;STD_POS2=259 GT:DP:AD ./.:.:.,.
CA_Cp 13 svim.BND.9 N ]CA_Cp:154140]N 13 PASS SVTYPE=BND;SUPPORT=12;STD_POS1=38;STD_POS2=326 GT:DP:AD ./.:.:.,.
CA_Cp 17 svim.BND.10 N ]CA_Cp:153931]N 9 PASS SVTYPE=BND;SUPPORT=9;STD_POS1=44;STD_POS2=367 GT:DP:AD ./.:.:.,.
CA_Cp 122 svim.DEL.1 N <DEL> 1 PASS SVTYPE=DEL;END=165;SVLEN=-43;SUPPORT=1;STD_SPAN=.;STD_POS=. GT:DP:AD ./.:.:.,.
CA_Cp 368 svim.DEL.2 N <DEL> 1 PASS SVTYPE=DEL;END=424;SVLEN=-56;SUPPORT=1;STD_SPAN=.;STD_POS=. GT:DP:AD ./.:.:.,.
CA_Cp 699 svim.BND.11 N ]CA_Cp:153209]N 1 PASS SVTYPE=BND;SUPPORT=1;STD_POS1=.;STD_POS2=. GT:DP:AD ./.:.:.,.
CA_Cp 910 svim.DEL.3 N <DEL> 1 PASS SVTYPE=DEL;END=970;SVLEN=-60;SUPPORT=1;STD_SPAN=.;STD_POS=. GT:DP:AD ./.:.:.,.
CA_Cp 1346 svim.DEL.4 N <DEL> 1 PASS SVTYPE=DEL;END=1397;SVLEN=-51;SUPPORT=1;STD_SPAN=.;STD_POS=. GT:DP:AD ./.:.:.,.
CA_Cp 1547 svim.INS.1 N <INS> 1 PASS SVTYPE=INS;END=1547;SVLEN=56;SUPPORT=1;STD_SPAN=.;STD_POS=. GT:DP:AD ./.:.:.,.
Isn't it INFO/ and not INFO\? Maybe that's why the filter doesn't work well.
You are right, only '/' is valid, but it still does not work for me. Should the VCF be bgzipped and indexed, or does this normally work on plain text?
It was one of those bad days: I used -e (exclude) instead of -i (include). Sorry about this! It now works (of course).
Glad you found the problem. In the future, please use Add Comment when you're adding a comment or Add Reply when you're replying to a comment. Only use Add Answer when you're answering the top-level question.
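For anyone landing here later, this is a minimal sketch of the fix discussed in this thread, not a tested recipe for every bcftools version. The function names keep_supported and tag_low_support and the label LowSupport are made up for illustration; the input is assumed to be a plain-text VCF like the one in the question. The point it illustrates: -i keeps the records that match the expression, -e drops them, and adding -s turns either one into a soft filter that writes every record and only labels the failing ones in the FILTER column, which would explain why all of the input went through.

```shell
# Hard filter: write only the records whose INFO/SUPPORT exceeds the threshold.
# $1 = VCF file, $2 = threshold (defaults to 10).
keep_supported() {
    bcftools filter -i "INFO/SUPPORT>${2:-10}" "$1"
}

# Soft filter: every record is still written, but records matching the
# -e expression get the (hypothetical) label LowSupport in the FILTER column.
tag_low_support() {
    bcftools filter -s LowSupport -e "INFO/SUPPORT<=${2:-10}" "$1"
}
```

Usage would look like keep_supported variants.vcf 10 > support_gt10.vcf. In the excerpt above, svim.INV.1 (SUPPORT=95) should be kept and svim.BND.1 (SUPPORT=1) dropped.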