6.9 years ago
hdtms
Hi
I split my initial indel VCF by sample and then converted each per-sample indel VCF into an Annovar input file. Several sequences were thrown out because of ORF errors. So my first question is: is this caused by the file conversion, or were the errors already in my file? Next, I filtered using the gene-based annotation and compared the number of occurrences for each sample, and found that they were the same for all samples. Since I want to compare the variants between samples, this is a problem. Is there a reason for it?
Can you show some of the calls that were thrown out? Also, which command did you use, exactly, to convert from VCF to Annovar format?
I do not know how I can see which calls were thrown out, but this is what I get in the log file:
NOTICE: Done with 63481 transcripts (including 15216 without coding sequence annotation) for 27720 unique genes
NOTICE: Processing next batch with 15922 unique variants in 15922 input lines
WARNING: A total of 405 sequences will be ignored due to lack of correct ORF annotation
And the command I used to convert from VCF to Annovar format was:
Okay, but how do you execute the convert2annovar.pl script, i.e., prior to executing annotate_variation.pl?
You have not responded, but you should also normalise your VCFs prior to converting them to Annovar format by left-aligning indels and splitting multi-allelic calls via bcftools norm -m-any
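As a rough sketch of that normalisation step, something like the following could work. Note the assumptions: the function name normalise_vcf and the file names in the usage comment (reference.fa, Am01.norm.vcf) are made up for illustration, and the reference FASTA must be the same genome build that the VCF was called against. Left-alignment needs the reference passed via -f; -m-any only does the multi-allelic splitting.

```shell
# Sketch: left-align indels and split multi-allelic records with bcftools.
# Arguments (all hypothetical names, adjust to your own files):
#   $1 = input VCF, $2 = reference FASTA, $3 = output VCF
normalise_vcf() {
  # -m-any : split multi-allelic sites into one record per ALT allele
  # -f     : reference FASTA, required for left-aligning indels
  # -Ov -o : write uncompressed VCF to the given output file
  bcftools norm -m-any -f "$2" -Ov -o "$3" "$1"
}

# Example usage (not run here):
#   normalise_vcf Am01.vcf reference.fa Am01.norm.vcf
#   convert2annovar.pl -format vcf4 Am01.norm.vcf -outfile Am01.avinput
```

The normalised file then goes through convert2annovar.pl in place of the original VCF.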
Note also that there is a bug in Annovar when converting from VCF to Annovar format, of which I am aware. After you try to annotate, variants that cannot be interpreted will be output to a file with an extension something like .invalid_input. Check them there and then paste them here, if you wish.
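A quick way to check for those files after annotation might look like this. It is only a sketch: the function name find_invalid_inputs is made up, and the '*invalid_input*' pattern is an assumption based on the extension mentioned above, so adjust it if your output files are named differently.

```shell
# Sketch: list rejected-variant files that actually contain entries.
# $1 = directory holding the annotate_variation.pl output (default: current dir)
find_invalid_inputs() {
  # -size +0c keeps only files larger than 0 bytes, i.e. non-empty ones
  find "${1:-.}" -type f -name '*invalid_input*' -size +0c
}

# Example usage (not run here): show the first few rejected variants per file
#   for f in $(find_invalid_inputs .); do echo "== $f"; head -n 5 "$f"; done
```

If nothing is listed, either no such file was written or the files are empty.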
Also, take a look at another answer that I posted here: A: Difference between gene-based, region-based and filter-based annotation in ANNOV
Sorry for answering so late. I do the conversion with the following command: convert2annovar.pl -format vcf4 Am01.vcf -outfile Am01.avinput
And I don't get any .invalid_input files
No, no. Is there an .invalid_input file with entries after you annotate your data?