Question

Annovar Annotation Difficulties

7.4 years ago

hdtms ▴ 20

Hi

I split my initial indel VCF by sample and then converted each per-sample indel VCF into an Annovar input file. Several sequences were thrown out because of ORF errors. My first question is whether this is caused by the file conversion or by errors already present in my file. Next, I filtered using the gene-based annotation and compared the number of occurrences for each sample, and found that they were the same for all samples. As I want to compare the variants between samples, this is a problem. Is there a reason for this?

annotation Indels • 3.1k views

ADD COMMENT • link updated 7.4 years ago by Biostar 20 • written 7.4 years ago by hdtms ▴ 20

Can you show some of the calls that were thrown out? Also, which exact command did you use to convert from VCF to Annovar format?

ADD REPLY • link 7.4 years ago by Kevin Blighe 89k

I do not know how to see which calls were thrown out, but this is what I get in the log file:

NOTICE: Done with 63481 transcripts (including 15216 without coding sequence annotation) for 27720 unique genes
NOTICE: Processing next batch with 15922 unique variants in 15922 input lines
WARNING: A total of 405 sequences will be ignored due to lack of correct ORF annotation

And the command I used to convert from VCF to Annovar format was:

annotate_variation.pl --geneanno -build hg19 -out Sample1 -build hg19 -dbtype refGene Sample1.avinput humandb/

ADD REPLY • link 7.4 years ago by hdtms ▴ 20

Okay, but how did you execute the convert2annovar.pl script, i.e., prior to executing annotate_variation.pl?

ADD REPLY • link 7.4 years ago by Kevin Blighe 89k

You have not responded, but you should also normalise your VCFs prior to converting them to Annovar format, by left-aligning indels and splitting multi-allelic calls via bcftools norm -m-any (left-alignment additionally needs the reference FASTA, supplied with -f).
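The normalisation step could be sketched roughly as below. This is only an illustration under assumptions: bcftools is on the PATH, hg19.fa stands in for whatever reference FASTA the VCF was called against, and the function name normalise_vcf and the output file names are made up for the example.

```shell
# Sketch: left-align indels and split multi-allelic records before conversion.
# ASSUMPTIONS: bcftools is installed; the reference FASTA matches the build
# of the VCF; normalise_vcf and all file names are illustrative only.
normalise_vcf() {
    in_vcf="$1"
    ref_fa="$2"
    out_vcf="$3"
    # -f gives the reference needed for left-alignment;
    # -m-any splits multi-allelic sites into one record per ALT allele.
    bcftools norm -f "$ref_fa" -m-any -o "$out_vcf" "$in_vcf"
}

# Example (uncomment to run):
# normalise_vcf Am01.vcf hg19.fa Am01.norm.vcf
# convert2annovar.pl -format vcf4 Am01.norm.vcf -outfile Am01.avinput
```

The normalised VCF, rather than the original one, would then be passed to convert2annovar.pl.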

Note also that there is a bug in Annovar, of which I am aware, when converting from VCF to Annovar format. After you try to annotate, variants that cannot be interpreted will be written to a file with an extension something like .invalid_input. Check them there and then paste them here, if you wish.
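A quick way to look for such a file might be the following sketch. It assumes the rejected lines end up in a file named after the -out prefix plus .invalid_input (e.g. Sample1.invalid_input for -out Sample1); the helper name report_invalid is made up for the example.

```shell
# Sketch: report how many variants were rejected for a given -out prefix.
# ASSUMPTION: rejected lines are written to <prefix>.invalid_input;
# report_invalid is an illustrative helper, not part of Annovar.
report_invalid() {
    prefix="$1"
    f="${prefix}.invalid_input"
    if [ -s "$f" ]; then
        # Count the lines and show the first few rejected records
        n=$(wc -l < "$f" | tr -d ' ')
        echo "$n rejected variant(s) in $f"
        head -n 5 "$f"
    else
        echo "no rejected variants found for $prefix"
    fi
}

# Example (uncomment to run):
# report_invalid Sample1
```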

Also, take a look at another answer that I posted here: A: Difference between gene-based, region-based and filter-based annotation in ANNOV

ADD REPLY • link 7.4 years ago by Kevin Blighe 89k

Sorry for answering so late. I do the conversion with the following command: convert2annovar.pl -format vcf4 Am01.vcf -outfile Am01.avinput

And I don't get any .invalid_input files

ADD REPLY • link 7.3 years ago by hdtms ▴ 20

No, no. Is there an .invalid_input file with entries after you annotate your data?

ADD REPLY • link 7.3 years ago by Kevin Blighe 89k
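
On the second part of the original question (identical counts for every sample), one way to check whether the per-sample inputs really hold the same variants is to compare them directly. The sketch below assumes the .avinput files are tab-delimited with chromosome, start, end, ref and alt in the first five columns; Am02.avinput and the function names are hypothetical.

```shell
# Sketch: count variants shared by, and unique to, two .avinput files.
# ASSUMPTIONS: tab-delimited input whose first five columns
# (chr, start, end, ref, alt) identify a variant; names are illustrative.
variant_keys() {
    cut -f1-5 "$1" | LC_ALL=C sort -u
}

compare_avinput() {
    a=$(mktemp)
    b=$(mktemp)
    variant_keys "$1" > "$a"
    variant_keys "$2" > "$b"
    # comm needs sorted input; the -23 / -13 / -12 flags select which columns to keep
    echo "only in $1: $(LC_ALL=C comm -23 "$a" "$b" | wc -l | tr -d ' ')"
    echo "only in $2: $(LC_ALL=C comm -13 "$a" "$b" | wc -l | tr -d ' ')"
    echo "shared: $(LC_ALL=C comm -12 "$a" "$b" | wc -l | tr -d ' ')"
    rm -f "$a" "$b"
}

# Example (uncomment to run):
# compare_avinput Am01.avinput Am02.avinput
```

If both "only in" counts are zero for every pair, the per-sample files contain identical variant sets, which would point at the splitting or conversion step rather than at the annotation.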