Hello,
I am trying to use Annovar (build hg38) to annotate SVs called by NanoVar. Here are the commands I used to run Annovar.
annotate_variation.pl -buildver hg38 -downdb -webfrom annovar refGene humandb/
annotate_variation.pl -buildver hg38 -downdb cytoBand humandb/
annotate_variation.pl -buildver hg38 -downdb -webfrom annovar exac03 humandb/
annotate_variation.pl -buildver hg38 -downdb -webfrom annovar avsnp147 humandb/
annotate_variation.pl -buildver hg38 -downdb -webfrom annovar dbnsfp30a humandb/
table_annovar.pl /scratch/pipe_try/nanovar/trimmed-hg38.pass.nanovar.vcf /scratch/RESOURCES/hg38/annovar_humandb -buildver hg38 -out /scratch/pipe_try/annotation/annotation.hg38.vcf -remove -protocol refGene,cytoBand,exac03,avsnp147,dbnsfp30a -operation gx,r,f,f,f -nastring . -vcfinput -polish -xref gene_fullxref.txt
Although the annotations from the xref file are populated in the final annotated VCF, the ExAC_ALL scores are just "." throughout. Here is a sample of the VCF input (minus the headers) to Annovar:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT trimmed
chr1 350186 nv_SV18238-03RC4 N <INV> 4.2 PASS SVTYPE=INV;END=248942250;SVLEN=248592064;SR=2;NN=0.622 GT:DP:AD 1/1:2:0,2
chr1 475602 nv_SV10120-J3V6N N ]chr1:710590]N 2.6 PASS SVTYPE=BND;END=.;SVLEN=.;SR=2;NN=0.448;SV2=TLO GT:DP:AD 0/1:4:2,2
chr1 710580 nv_SV11577-6U3XJ N <INS> 6.9 PASS SVTYPE=INS;END=710581;SVLEN=285;SR=3;NN=0.796 GT:DP:AD 1/1:3:0,3
chr1 710590 nv_SV10120-J3V6N N N[chr1:475602[ 2.6 PASS SVTYPE=BND;END=.;SVLEN=.;SR=2;NN=0.448;SV2=TLO GT:DP:AD 0/1:4:2,2
chr1 908929 nv_SV18386-NW18S N <DEL> 4.4 PASS SVTYPE=DEL;END=909323;SVLEN=-393;SR=2;NN=0.639 GT:DP:AD 1/1:2:0,2
chr1 1993702 nv_SV20252-FSNKQ N <DUP> 2.7 PASS SVTYPE=DUP;END=1993884;SVLEN=182;SR=2;NN=0.462 GT:DP:AD 1/1:2:0,2
chr1 3016310 nv_SV8827-Q7X8T N <DUP> 4.4 PASS SVTYPE=DUP;END=3016710;SVLEN=400;SR=2;NN=0.637 GT:DP:AD 1/1:2:0,2
chr1 3016310 nv_SV8828-UPN0K N <INS> 3.7 PASS SVTYPE=INS;END=3016311;SVLEN=606;SR=2;NN=0.572 GT:DP:AD 0/1:3:1,2
chr1 3501983 nv_SV10305-3X4TN N <DUP> 4.0 PASS SVTYPE=DUP;END=3502114;SVLEN=131;SR=2;NN=0.601 GT:DP:AD 1/1:2:0,2
Here is a portion of the output generated by Annovar:
1 475602 nv_SV10120-J3V6N N ]1:710590]N 2.6 PASS SVTYPE=BND;END=.;SVLEN=.;SR=2;NN=0.448;SV2=TLO;ANNOVAR_DATE=2018-04-16;Func.refGene=intergenic;Gene.refGene=OR4F3\x3bLOC100132287;GeneDetail.refGene=dist\x3d23924\x3bdist\x3d15154;ExonicFunc.refGene=.;AAChange.refGene=.;pLi.refGene=.;pRec.refGene=.;pNull.refGene=.;Gene_full_name.refGene=.;Function_description.refGene=.;Disease_description.refGene=.;Tissue_specificity(Uniprot).refGene=.;Expression(egenetics).refGene=.;Expression(GNF/Atlas).refGene=.;P(HI).refGene=.;P(rec).refGene=.;RVIS.refGene=.;RVIS_percentile.refGene=.;GDI.refGene=.;GDI-Phred.refGene=.;cytoBand=1p36.33;ExAC_ALL=.;ExAC_AFR=.;ExAC_AMR=.;ExAC_EAS=.;ExAC_FIN=.;ExAC_NFE=.;ExAC_OTH=.;ExAC_SAS=.;avsnp147=.;SIFT_score=.;SIFT_pred=.;Polyphen2_HDIV_score=.;Polyphen2_HDIV_pred=.;Polyphen2_HVAR_score=.;Polyphen2_HVAR_pred=.;LRT_score=.;LRT_pred=.;MutationTaster_score=.;MutationTaster_pred=.;MutationAssessor_score=.;MutationAssessor_pred=.;FATHMM_score=.;FATHMM_pred=.;PROVEAN_score=.;PROVEAN_pred=.;VEST3_score=.;CADD_raw=.;CADD_phred=.;DANN_score=.;fathmm-MKL_coding_score=.;fathmm-MKL_coding_pred=.;MetaSVM_score=.;MetaSVM_pred=.;MetaLR_score=.;MetaLR_pred=.;integrated_fitCons_score=.;integrated_confidence_value=.;GERP++_RS=.;phyloP7way_vertebrate=.;phyloP20way_mammalian=.;phastCons7way_vertebrate=.;phastCons20way_mammalian=.;SiPhy_29way_logOdds=.;ALLELE_END GT:DP:AD 0/1:4:2,2
1 3501983 nv_SV10305-3X4TN N <DUP> 4.0 PASS SVTYPE=DUP;END=3502114;SVLEN=131;SR=2;NN=0.601;ANNOVAR_DATE=2018-04-16;Func.refGene=intronic;Gene.refGene=MEGF6;GeneDetail.refGene=.;ExonicFunc.refGene=.;AAChange.refGene=.;pLi.refGene=4.42421031130217e-15;pRec.refGene=0.994620434269407;pNull.refGene=0.00537956573058837;Gene_full_name.refGene=multiple_EGF_like_domains_6;Function_description.refGene=.;Disease_description.refGene=.;Tissue_specificity(Uniprot).refGene=.;Expression(egenetics).refGene=unclassifiable_(Anatomical_System)\x3bmyocardium\x3bheart\x3bovary\x3burinary\x3bcolon\x3bparathyroid\x3bfovea_centralis\x3bchoroid\x3blens\x3bskin\x3bretina\x3bprostate\x3boptic_nerve\x3blung\x3bplacenta\x3bmacula_lutea\x3btestis\x3bkidney\x3bbrain\x3b;Expression(GNF/Atlas).refGene=.;P(HI).refGene=0.11042;P(rec).refGene=0.09881;RVIS.refGene=2.072624965;RVIS_percentile.refGene=97.81198396;GDI.refGene=6730.46373;GDI-Phred.refGene=17.39874;cytoBand=1p36.32;ExAC_ALL=.;ExAC_AFR=.;ExAC_AMR=.;ExAC_EAS=.;ExAC_FIN=.;ExAC_NFE=.;ExAC_OTH=.;ExAC_SAS=.;avsnp147=.;SIFT_score=.;SIFT_pred=.;Polyphen2_HDIV_score=.;Polyphen2_HDIV_pred=.;Polyphen2_HVAR_score=.;Polyphen2_HVAR_pred=.;LRT_score=.;LRT_pred=.;MutationTaster_score=.;MutationTaster_pred=.;MutationAssessor_score=.;MutationAssessor_pred=.;FATHMM_score=.;FATHMM_pred=.;PROVEAN_score=.;PROVEAN_pred=.;VEST3_score=.;CADD_raw=.;CADD_phred=.;DANN_score=.;fathmm-MKL_coding_score=.;fathmm-MKL_coding_pred=.;MetaSVM_score=.;MetaSVM_pred=.;MetaLR_score=.;MetaLR_pred=.;integrated_fitCons_score=.;integrated_confidence_value=.;GERP++_RS=.;phyloP7way_vertebrate=.;phyloP20way_mammalian=.;phastCons7way_vertebrate=.;phastCons20way_mammalian=.;SiPhy_29way_logOdds=.;ALLELE_END GT:DP:AD 1/1:2:0,2
I am wondering why the ExAC_ALL scores do not populate at all. Is some preprocessing required before running Annovar on a NanoVar-generated VCF file? Secondly, I noticed that NanoVar, like other callers such as Sniffles, does not give nucleotides in the REF and ALT columns; instead, those columns contain N and <sv_type>. Does this hinder Annovar from annotating properly?
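In case it helps with diagnosing this, here is a minimal Python sketch of how the ALT column could be tallied by allele type (symbolic such as <DEL>, breakend notation, or explicit nucleotides). The function names classify_alt and tally_alt_types are illustrative helpers of my own, not part of Annovar or NanoVar, and the sketch assumes whitespace-delimited records with ALT in the fifth column.

```python
import re
from collections import Counter


def classify_alt(alt: str) -> str:
    """Classify one VCF ALT allele as symbolic, breakend, sequence, or other."""
    if alt.startswith("<") and alt.endswith(">"):
        return "symbolic"   # e.g. <DEL>, <INS>, <DUP>, <INV>
    if "[" in alt or "]" in alt:
        return "breakend"   # e.g. ]chr1:710590]N
    if re.fullmatch(r"[ACGTNacgtn]+", alt):
        return "sequence"   # explicit nucleotides
    return "other"


def tally_alt_types(lines):
    """Count ALT allele types over an iterable of VCF text lines."""
    counts = Counter()
    for line in lines:
        # Skip blank lines and header lines (## meta lines and #CHROM).
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 5:
            continue
        # ALT may hold several comma-separated alleles.
        for allele in fields[4].split(","):
            counts[classify_alt(allele)] += 1
    return counts


# Two records copied from the sample above, used only as a demonstration.
sample = [
    "#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT trimmed",
    "chr1 350186 nv_SV18238-03RC4 N <INV> 4.2 PASS "
    "SVTYPE=INV;END=248942250;SVLEN=248592064;SR=2;NN=0.622 GT:DP:AD 1/1:2:0,2",
    "chr1 475602 nv_SV10120-J3V6N N ]chr1:710590]N 2.6 PASS "
    "SVTYPE=BND;END=.;SVLEN=.;SR=2;NN=0.448;SV2=TLO GT:DP:AD 0/1:4:2,2",
]

if __name__ == "__main__":
    print(tally_alt_types(sample))
```

On the full file, the lines could be read with open(path) and passed to tally_alt_types in the same way, to see whether any record carries explicit nucleotides at all.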
Thank you, Asma