20 months ago
ttom
I am trying to annotate a VCF file with ANNOVAR using the following command:
perl /annovar/table_annovar.pl chr21.vcf.gz annovar/humandb/ -buildver hg38 -out chr21 -remove -protocol refGene,ensGene,esp6500siv2_aa,esp6500siv2_ea,esp6500siv2_all -operation g,g,r,r,r -nastring . -vcfinput --nopolish
I am not getting the output in VCF format, and I do not understand the error.
Error and log
NOTICE: Running with system command <annotate_variation.pl -geneanno -buildver hg38 -dbtype refGene -outfile chr21.refGene -exonsort -nofirstcodondel chr21.avinput annovar/humandb/>
NOTICE: Output files are written to chr21.refGene.variant_function, chr21.refGene.exonic_variant_function
NOTICE: Reading gene annotation from annovar/humandb/hg38_refGene.txt ... Done with 88819 transcripts (including 21511 without coding sequence annotation) for 28307 unique genes
NOTICE: Processing next batch with 359444 unique variants in 359444 input lines
NOTICE: Reading FASTA sequences from annovar/humandb/hg38_refGeneMrna.fa ... Done with 647 sequences
WARNING: A total of 606 sequences will be ignored due to lack of correct ORF annotation
-----------------------------------------------------------------
NOTICE: Processing operation=g protocol=ensGene
NOTICE: Running with system command <annotate_variation.pl -geneanno -buildver hg38 -dbtype ensGene -outfile chr21.ensGene -exonsort -nofirstcodondel chr21.avinput annovar/humandb/>
NOTICE: Output files are written to chr21.ensGene.variant_function, chr21.ensGene.exonic_variant_function
NOTICE: Reading gene annotation from annovar/humandb/hg38_ensGene.txt ... Done with 111108 transcripts (including 39529 without coding sequence annotation) for 47298 unique genes
NOTICE: Processing next batch with 359444 unique variants in 359444 input lines
NOTICE: Reading FASTA sequences from annovar/humandb/hg38_ensGeneMrna.fa ... Done with 626 sequences
WARNING: A total of 415 sequences will be ignored due to lack of correct ORF annotation
-----------------------------------------------------------------
NOTICE: Processing operation=r protocol=esp6500siv2_aa
NOTICE: Running with system command <annotate_variation.pl -regionanno -dbtype esp6500siv2_aa -buildver hg38 -outfile chr21 chr21.avinput annovar/humandb/>
NOTICE: Output file is written to chr21.hg38_esp6500siv2_aa
NOTICE: Reading annotation database annovar/humandb/hg38_esp6500siv2_aa.txt ... Error: invalid record found in region annotation database: <1 69428 69428 T G 0.0037 rs140739101>
Error running system command: <annotate_variation.pl -regionanno -dbtype esp6500siv2_aa -buildver hg38 -outfile chr21 chr21.avinput annovar/humandb/>
Error running system command: <annovar/table_annovar.pl chr21.avinput annovar/humandb/ -buildver hg38 -outfile chr21 -remove -protocol refGene,ensGene,esp6500siv2_aa,esp6500siv2_ea,esp6500siv2_all -operation g,g,r,r,r -nastring . --nopolish -otherinfo>
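The first error says:
Error: invalid record found in region annotation database: <1 69428 69428 T G 0.0037 rs140739101>
Are you using the same chromosome format for your input VCF and the annotation database? For example, if your VCF says chr1, make sure the annotation database also uses the chr1 format instead of 1.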
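One way to check the chromosome naming on both sides is to collect the distinct values of the first column of each file and compare their style. The following is only a minimal sketch: the function names (open_text, chrom_names, uses_chr_prefix, same_chrom_style) are made up for illustration, and it assumes that the chromosome is the first whitespace-separated field of the database file, as the record in the error message suggests.

```python
import gzip
from itertools import islice


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def chrom_names(path, max_records=100000):
    """Return the distinct first-column values, skipping blank and '#' lines.

    Only the first max_records data lines are read, so this is a quick
    sanity check rather than a full scan of a large file.
    """
    names = set()
    with open_text(path) as handle:
        records = (line for line in handle
                   if line.strip() and not line.startswith("#"))
        for line in islice(records, max_records):
            names.add(line.split(None, 1)[0])
    return names


def uses_chr_prefix(names):
    """True if every name starts with 'chr', False if none does,
    None if the set is empty or mixes both styles."""
    flags = {name.startswith("chr") for name in names}
    return flags.pop() if len(flags) == 1 else None


def same_chrom_style(vcf_path, db_path):
    """True only if both files consistently use the same naming style."""
    vcf_style = uses_chr_prefix(chrom_names(vcf_path))
    db_style = uses_chr_prefix(chrom_names(db_path))
    return vcf_style is not None and vcf_style == db_style
```

With the paths from the question, the call would be same_chrom_style("chr21.vcf.gz", "annovar/humandb/hg38_esp6500siv2_aa.txt"). A False result means the two files name chromosomes differently (or one of them mixes styles), which is the mismatch described above.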