I am trying to filter by allele frequency using bcftools view -q but the output does not give me any sites, all it gives me is the vcf header. I am using a cancer cell line and I am trying to use an allele frequency cut-off to make sure that there would be at least one copy per cell in the population. I chose 0.1 based on my lit review, does this seem reasonable.
Here is the command I used
bcftools view -q 0.1 "my_file.vcf"
Output was the vcf header. The whole output is pasted at the bottom of the page
Any idea how I can return AF > 0.1. and yes I did make sure that there were AF > 0.1 in my data.
##fileformat=VCFv4.0
##FILTER=<ID=PASS,Description="All filters passed">
##fileDate=20220429
##source=lofreq call --verbose --ref /cvmfs/data.galaxyproject.org/byhand/hg38/sam_index/hg38.fa --call-indels --min-cov 30 --max-depth 1000000 --use-orphan --min-bq 30 --min-alt-bq 30 --min-mq 20 --max-mq 255 --min-jq 0 --min-alt-jq 0 --def-alt-jq 0 --sig 0.01 --bonf dynamic --no-default-filter -r chr1:1-124478211 -o /jetstream/scratch0/main/jobs/42329427/_job_tmp/lofreq2_call_parallelp63o5c08/0.vcf.gz reads.bam
##reference=/cvmfs/data.galaxyproject.org/byhand/hg38/sam_index/hg38.fa
##INFO=<ID=DP,Number=1,Type=Integer,Description="Raw Depth">
##INFO=<ID=AF,Number=1,Type=Float,Description="Allele Frequency">
##INFO=<ID=SB,Number=1,Type=Integer,Description="Phred-scaled strand bias at this position">
##INFO=<ID=DP4,Number=4,Type=Integer,Description="Counts for ref-forward bases, ref-reverse, alt-forward and alt-reverse bases">
##INFO=<ID=INDEL,Number=0,Type=Flag,Description="Indicates that the variant is an INDEL.">
##INFO=<ID=CONSVAR,Number=0,Type=Flag,Description="Indicates that the variant is a consensus variant (as opposed to a low frequency variant).">
##INFO=<ID=HRUN,Number=1,Type=Integer,Description="Homopolymer length to the right of report indel position">
##FILTER=<ID=min_snvqual_78,Description="Minimum SNV Quality (Phred) 78">
##FILTER=<ID=min_indelqual_63,Description="Minimum Indel Quality (Phred) 63">
##FILTER=<ID=min_dp_10,Description="Minimum Coverage 10">
##FILTER=<ID=sb_fdr,Description="Strand-Bias Multiple Testing Correction: fdr corr. pvalue > 0.001000">
##FILTER=<ID=min_snvqual_93,Description="Minimum SNV Quality (Phred) 93">
##FILTER=<ID=min_indelqual_78,Description="Minimum Indel Quality (Phred) 78">
##SnpEffVersion="4.3t (build 2017-11-24 10:18), by Pablo Cingolani"
##SnpEffCmd="SnpEff -i vcf -o vcf -stats /corral4/main/objects/7/3/3/dataset_733993dd-8ceb-4959-b46f-53737d762bcd.dat -csvStats /corral4/main/objects/e/b/8/dataset_eb87c49d-8609-46f0-b181-4fcdf08aed44.dat GRCh38.p7.RefSeq /corral4/main/objects/3/f/7/dataset_3f74fe07-e0ec-4ce9-bf2a-89b8806d8ca6.dat "
##INFO=<ID=ANN,Number=.,Type=String,Description="Functional annotations: 'Allele | Annotation | Annotation_Impact | Gene_Name | Gene_ID | Feature_Type | Feature_ID | Transcript_BioType | Rank | HGVS.c | HGVS.p | cDNA.pos / cDNA.length | CDS.pos / CDS.length | AA.pos / AA.length | Distance | ERRORS / WARNINGS / INFO'">
##INFO=<ID=LOF,Number=.,Type=String,Description="Predicted loss of function effects for this variant. Format: 'Gene_Name | Gene_ID | Number_of_transcripts_in_gene | Percent_of_transcripts_affected'">
##INFO=<ID=NMD,Number=.,Type=String,Description="Predicted nonsense mediated decay effects for this variant. Format: 'Gene_Name | Gene_ID | Number_of_transcripts_in_gene | Percent_of_transcripts_affected'">
##SnpSiftVersion="SnpSift 4.3t (build 2017-11-24 10:18), by Pablo Cingolani"
##SnpSiftCmd="SnpSift Filter -f /corral4/main/objects/6/d/a/dataset_6da04259-64b4-4853-82a2-43ce88b651db.dat -e /corral4/main/jobs/042/523/42523967/configs/tmpo4_pwef0 --inverse"
##FILTER=<ID=SnpSift,Description="SnpSift 4.3t (build 2017-11-24 10:18), by Pablo Cingolani, Expression used: ( EFF[*].IMPACT = 'LOW')">
##SnpSiftCmd="SnpSift Filter -f /corral4/main/objects/8/3/5/dataset_835a16c9-7404-4049-a838-f5867ab3d8d3.dat -e /corral4/main/jobs/042/523/42523974/configs/tmpcitxn7nn --inverse"
##contig=<ID=chr1>
##contig=<ID=chr2>
##contig=<ID=chr3>
##contig=<ID=chr4>
##contig=<ID=chr5>
##contig=<ID=chr6>
##contig=<ID=chr7>
##contig=<ID=chr8>
##contig=<ID=chr9>
##contig=<ID=chr10>
##contig=<ID=chr11>
##contig=<ID=chr12>
##contig=<ID=chr13>
##contig=<ID=chr14>
##contig=<ID=chr15>
##contig=<ID=chr16>
##contig=<ID=chr17>
##contig=<ID=chr18>
##contig=<ID=chr19>
##contig=<ID=chr20>
##contig=<ID=chr21>
##contig=<ID=chr22>
##contig=<ID=chrX>
##contig=<ID=chrY>
##bcftools_isecVersion=1.11+htslib-1.11
##bcftools_isecCommand=isec -p dir -n -1 -c all /u/home/k/kshaffma/vcf_only_med_and_high/revert_only_med_and_high.vcf.gz /u/home/k/kshaffma/vcf_only_med_and_high/parent_only_med_and_high.vcf.gz; Date=Wed May 25 13:16:57 2022
##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count in genotypes">
##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in called genotypes">
##bcftools_viewVersion=1.11+htslib-1.11
##bcftools_viewCommand=view -q 0.1 /u/home/k/kshaffma/Uniq/parents_uniq_with_annotations.vcf; Date=Mon Jun 6 16:53:51 2022
#CHROM POS ID REF ALT QUAL FILTER INFO
Hi, thank you. This fixed my problem. So is what you are saying @cfos4698 that the -q tag is incompatible with the loqreq alignment tool?
Feel free to mark my answer as accepted ;)
I think
-q
is just designed with the wholebcftools
ecosystem in mind, i.e. that the output ofbcftools call
will be piped tobcftools view
. The-q
flag doesn't seem to work well withlofreq
's default output from my experiences, and might not work with other tools very well either, unless you reformat the INFO columns to mimicbcftools call
's output.