Fill in missing VCF information with bcftools or vcftools or Vcf.pm
10.1 years ago
Lee Katz ★ 3.2k

Hi Biostars, I was wondering how I might backfill some information into a VCF file. Here are the header line and the first three records of a pooled VCF file I created with bcftools merge.

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample1 2:Sample1 3:Sample1 4:Sample1 5:Sample1 6:Sample1 7:Sample
NC001416 1 . G . 0 PASS ADP=3;WT=0;HET=0;HOM=0;NC=1 GT:GQ:SDP:DP:RD:AD:RBQ:ABQ:RDF:RDR:ADF:ADR ./.:.:3:.:.:.:.:.:.:.:.:
NC001416 2 . G . 0 PASS ADP=3;WT=0;HET=0;HOM=0;NC=1 GT:GQ:SDP:DP:RD:AD:RBQ:ABQ:RDF:RDR:ADF:ADR ./.:.:3:.:.:.:.:.:.:.:.:
NC001416 3 . G . 0 PASS ADP=6;WT=0;HET=0;HOM=0;NC=1 GT:GQ:SDP:DP:RD:AD:RBQ:ABQ:RDF:RDR:ADF:ADR ./.:.:6:.:.:.:.:.:.:.:.:

One thing I would want to backfill is the DP attribute for each sample. Would I use bcftools annotate? How? Is there an example or tutorial out there I could follow? I have all the original bam files, so I can supply that information to a given script.

vcftools annotation vcf bcftools Vcf.pm • 3.8k views

Have you tried the Squaring off utility of bcbio.variation.recall?


I think that bcbio.variation.recall does the same thing that bcftools merge does. Right? I am not sure if I can use it to backfill information such as DP (depth).

10.1 years ago
Lee Katz ★ 3.2k

I've worked out at least a partial answer using Vcf.pm. In this example I set the depth (DP) of Sample1 to 5 on every line.

perl -MVcf -e '$vcf=Vcf->new(file=>"sample2.fastq.gz-lambda_virus.vcf.gz"); $vcf->parse_header(); while(my $x=$vcf->next_data_hash()){$$x{gtypes}{Sample1}{DP}=5; print $vcf->format_line($x);}' | head
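
For anyone who wants a real per-position depth instead of a constant, here is a rough Python sketch of the same idea. It is not a tested pipeline. It assumes the depths were already exported as three-column text (chromosome, position, depth), which is what `samtools depth` prints for a single BAM, and that DP is already listed in the FORMAT column. The function names `parse_depth` and `fill_dp` and the sample name `Sample2` are made up for illustration.

```python
def parse_depth(text):
    """Parse 'chrom<TAB>pos<TAB>depth' lines into {(chrom, pos): depth}."""
    depths = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        chrom, pos, depth = line.split("\t")[:3]
        depths[(chrom, int(pos))] = int(depth)
    return depths


def fill_dp(vcf_line, sample_names, depth_by_sample):
    """Return a VCF record with DP set for every sample that has a depth.

    sample_names: sample column names in header order (hypothetical input).
    depth_by_sample: {sample_name: depth} for this record's position.
    """
    cols = vcf_line.rstrip("\n").split("\t")
    keys = cols[8].split(":")          # FORMAT column, e.g. GT:GQ:SDP:DP
    if "DP" not in keys:
        return "\t".join(cols)         # nothing to backfill into
    dp_idx = keys.index("DP")
    for offset, name in enumerate(sample_names):
        depth = depth_by_sample.get(name)
        if depth is None:
            continue                   # leave this sample untouched
        vals = cols[9 + offset].split(":")
        # VCF allows trailing subfields to be dropped, so pad with "."
        vals += ["."] * (len(keys) - len(vals))
        vals[dp_idx] = str(depth)
        cols[9 + offset] = ":".join(vals)
    return "\t".join(cols)


# Shortened record (fewer FORMAT keys than in the question) for illustration.
record = "NC001416\t1\t.\tG\t.\t0\tPASS\tADP=3\tGT:GQ:SDP:DP\t./.:.:3:.\t./."
depths = parse_depth("NC001416\t1\t3\n")
filled = fill_dp(record, ["Sample1", "Sample2"],
                 {"Sample1": depths[("NC001416", 1)], "Sample2": 7})
print(filled)
```

Looking depths up by (chromosome, position) keeps the VCF pass to a single loop. For a large genome the dictionary could get big, so streaming both files in sorted order may be a better fit.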