How does FreeBayes evaluate variants that do not meet the --min-alternate-count (-C) and --min-alternate-fraction (-F) criteria?
9.5 years ago
Joanne Lim ▴ 20

Hi all,

I am currently using FreeBayes (v0.9.14-18-g36789d8-dirty) to call SNPs from a merged BAM file of 16 diploid plant samples. I would like FreeBayes to consider only variants that are supported by at least 5 alternate allele observations in a single sample (--min-alternate-count 5) and by at least 20% of the reads from a single sample (--min-alternate-fraction 0.2). What does FreeBayes do with variants that do not meet the --min-alternate-count and --min-alternate-fraction criteria?

Command

freebayes \
  -b BWT2_16Samples.merged.bam \
  -f chr00.fasta \
  -v BWT2_16Samples.merged.vcf \
  --ploidy 2 \
  --min-alternate-count 5 \
  --min-alternate-fraction 0.2 \
  --no-population-priors \
  --min-mapping-quality 0

VCF output

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT s1.sorted s2.sorted s3.sorted s4.sorted s5.sorted s6.sorted s7.sorted s8.sorted s9.sorted s10.sorted s11.sorted s12.sorted s13.sorted s14.sorted s15.sorted s16.sorted
chr00   822     .       T       A       878.663 .   AB=0.857143;ABP=10.7656;AC=23;AF=0.884615;AN=26;AO=28;CIGAR=1X;DP=31;DPB=31;DPRA=2.5;EPP=14.1779;EPPR=3.0103;GTI=0;LEN=1;MEANALT=1.08333;MQM=2.89286;MQMR=3;NS=13;NUMALT=1;ODDS=0.312408;PAIRED=0.964286;PAIREDR=0.5;PAO=0;PQA=0;PQR=0;PRO=0;QA=1015;QR=69;RO=2;RPP=7.97367;RPPR=7.35324;RUN=1;SAF=22;SAP=22.8638;SAR=6;SRF=1;SRP=3.0103;SRR=1;TYPE=snp   GT:DP:RO:QR:AO:QA:GL    .       1/1:1:0:0:1:39:-3.9,-0.30103,0  1/1:3:0:0:2:75:-6.77333,-0.60206,0      1/1:2:0:0:2:71:-6.745,-0.60206,0        .       1/1:1:0:0:1:41:-4.1,-0.30103,0       .       0/1:7:1:38:6:201:-10,0,-2.53789 1/1:1:0:0:1:34:-3.4,-0.30103,0  1/1:1:0:0:1:38:-3.8,-0.30103,0  1/1:2:0:0:2:76:-7.22,-0.60206,0 0/0:1:1:31:0:0:0,-0.30103,-3.1       1/1:4:0:0:4:150:-10,-1.20412,0  1/1:3:0:0:3:107:-9.98667,-0.90309,0     1/1:2:0:0:2:69:-6.555,-0.60206,0        1/1:3:0:0:3:114:-10,-0.90309,0

From the VCF output above, it seems that the only sample that fulfils the --min-alternate-count criterion is s8.sorted, which has 6 alternate allele observations (AO=6, i.e. >=5). Other samples such as s2.sorted and s3.sorted have only 1 and 2 alternate allele observations respectively, yet they are still printed in the output. Shouldn't they be filtered out?

Thanks for your help in advance.

freebayes SNP • 5.4k views
9.3 years ago
SNPsaurus ▴ 50

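The --min-alternate-count flag sets the threshold an allele must reach within a single sample in order to be evaluated in the population; it is not a per-genotype filter. So as long as at least one sample meets the threshold (and the allele also passes --min-alternate-total across all samples), the allele is used at that site and every sample is genotyped for it. It looks like you want to filter away individual genotype calls that don't meet the depth threshold of 5. You can do that after the VCF is made, for example with vcffilter -g "DP > 5" (note that the strict ">" drops genotypes with a depth of exactly 5; use "DP > 4" to keep them).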
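
If you would rather apply the count and fraction thresholds per genotype yourself, here is a minimal Python sketch of that post-hoc masking. It is not what FreeBayes does internally. The function names (sample_passes, mask_record) and the two constants are made up for illustration, and it assumes a biallelic site with DP and AO present in the FORMAT column, as in the record from the question.

```python
# Post-hoc per-genotype mask; hypothetical helper names, not FreeBayes' own logic.
MIN_ALT_COUNT = 5       # mirrors --min-alternate-count 5
MIN_ALT_FRACTION = 0.2  # mirrors --min-alternate-fraction 0.2


def sample_passes(format_keys, sample_field):
    """Return True if one sample column meets both per-sample thresholds."""
    if sample_field == ".":          # sample has no call at this site
        return False
    values = dict(zip(format_keys, sample_field.split(":")))
    depth = int(values["DP"])
    alt_obs = int(values["AO"])      # assumption: a single ALT allele
    if depth == 0:
        return False
    return alt_obs >= MIN_ALT_COUNT and alt_obs / depth >= MIN_ALT_FRACTION


def mask_record(line):
    """Replace sample columns that fail the thresholds with '.' (missing)."""
    fields = line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")   # FORMAT is the 9th VCF column
    for i in range(9, len(fields)):
        if not sample_passes(format_keys, fields[i]):
            fields[i] = "."
    return "\t".join(fields)


# Shortened record from the question: s2 (AO=1), s3 (AO=2), s8 (AO=6).
record = "\t".join([
    "chr00", "822", ".", "T", "A", "878.663", ".", "TYPE=snp",
    "GT:DP:RO:QR:AO:QA:GL",
    "1/1:1:0:0:1:39:-3.9,-0.30103,0",
    "1/1:3:0:0:2:75:-6.77333,-0.60206,0",
    "0/1:7:1:38:6:201:-10,0,-2.53789",
])
masked = mask_record(record).split("\t")
print(masked[9:])
```

On this shortened record only the s8 column survives; the other two are set to missing. Be aware that this also masks homozygous-reference genotypes (AO=0), and that site-level INFO fields such as AC and AF are not recomputed, so adjust the rule if you need either.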
