Entering edit mode
11.0 years ago
Adrian Pelin
★
2.6k
Hello,
I mapped my reads to an annotated reference, exported to bam, than used freebayes to call variants. I'm now trying to use vcf-annotated to see how many variants fall within coding regions and what their effect is on the protein.
I ran vcf-annotate as follows:
bgzip Sample1_Genome_noHET_noCo.sort.bed
tabix -s 1 -b 2 -e 3 Sample1_Genome_noHET_noCo.sort.bed.gz
cat GDR-18.sort.grp.bam.vcf | vcf-annotate -a Sample1_Genome_noHET_noCo.sort.bed.gz -d key=INFO,ID=ANN,Number=1,Type=Integer,Description='My custom annotation' -c CHROM,FROM,TO,INFO/ANN > out.vcf
my input vcf looks like this:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT whatever
Sample1_Genome_noHET 989297 . ATAAATTATAAATAAC GTAAATTATAAATAAT,TTAATCTATAAATAAT,ATAAACTATAAATAAT 1906.94 . AB=0.253165,0.316456,0.21519;ABP=44.818,26.1269,58.6715;AC=1,1,1;AF=0.25,0.25,0.25;AN=4;AO=20,25,17;CIGAR=1X14M1X,1X3M2X9M1X,5M1X9M1X;DP=79;DPB=97.9375;DPRA=0,0,0;EPP=6.91895,3.09716,9.26925;EPPR=5.18177;GTI=0;HWE=-0;LEN=16,16,16;MEANALT=4,4,4;MQM=255,255,255;MQMR=255;NS=1;NUMALT=3;ODDS=120.598;PAIRED=1,1,1;PAIREDR=1;PAO=16.5833,3.58333,5.08333;PQA=632.083,138.083,195.083;PQR=297.75;PRO=7.75;QA=768,947,658;QR=631;RO=16;RPP=6.91895,57.2971,9.26925;RPPR=22.5536;RUN=1,1,1;SAP=4.74748,3.09716,6.20364;SRP=11.6962;TYPE=complex,complex,complex;XAI=0.00326384,0.0015,0.00163399;XAM=0.0370321,0.0015,0.0319516;XAS=0.0337682,0,0.0303176;XRI=0;XRM=0.000726744;XRS=0.000726744;technology.illumina=1,1,1;BVAR GT:DP:RO:QR:AO:QA 0/1/2/3:79:16:631:20,25,17:768,947,658
and my .bed looks like this:
#CHR FROM TO ANNOTATION
Sample1_Genome_noHET 101 563 hypothetical protein NCER_102224 CDS 0 +
Sample1_Genome_noHET 1293 1728 hypothetical protein NCER_102355 CDS 0 -
The problem is that the out.vcf file is only 300 bites bigger than the original GDR-18.sort.grp.bam.vcf. And I cannot find any difference between the two. What am I doing wrong and what do I expect to see if it is done right?
Annotation of VCF files has been covered many times on Biostar. If your question is not adequately addressed by threads like the following (or any others you may find by searching), feel free to leave another comment requesting that this be reopened. (annotate a VCF file, Functional annotation of variant calls (vcf files), SNP-annotation and effect prediction, Standard post Variant Call (VCF) analysis that work out of the box)
I suppose I can find alternatives to what I am trying to do.
I know I started my question with a general "question" that has been already addressed, but the rest of my question deal with me trying a specific tool that was not working for me, and was wondering what I was doing wrong.
Fair enough - I'll reopen, and add a couple of edits to make the question more clear.
Oh thank you this looks much better!
Check if the tabix index works, incorrect format of the bed file is the most frequent source of errors. For example, this should print the 101-563 line: