Hi,
I am working with whole-exome sequencing data and have been using wANNOVAR to annotate the resulting VCF files. However, we will soon be receiving around 1000 unannotated VCF files from our collaborator, and the disadvantage of wANNOVAR is that only two files can be submitted at a time from one IP address. I was therefore wondering whether I can annotate them offline, and as I understand it, bedtools is the answer.
My questions are:
What additional bed files will I need to annotate the variants (I am aligning my reads to hg38)?
Where can I find those files?
Is it possible for me to generate the files using hg38.fa? If yes, how?
Thanks
Hi, with that many VCF files to annotate, it would be best to have a local installation of a variant annotation program. I haven't used wANNOVAR, but VEP and SnpEff are both extensively documented. If you install VEP or SnpEff, the easiest route is the (bio)conda package repository, which takes care of the dependencies. A good variant annotation program draws on numerous datasets to annotate at the gene, promoter/enhancer and population-frequency levels, so it is not sensible to try to generate such files on your own.
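To give an idea of what batch annotation with a local install could look like, here is a minimal Python sketch that only builds one VEP command per VCF file and does not run anything. It assumes `vep` is on your PATH and that a GRCh38 cache has already been downloaded; the function name `build_vep_commands`, the `.vep.vcf` output naming and the directory names are made up for illustration.

```python
from pathlib import Path


def build_vep_commands(vcf_paths, out_dir, assembly="GRCh38", forks=4):
    """Return one VEP argument list per input VCF (hypothetical helper).

    Assumes an offline VEP cache for the given assembly is already installed.
    """
    out_dir = Path(out_dir)
    commands = []
    # Sort so the batch order is reproducible between runs.
    for vcf in sorted(Path(p) for p in vcf_paths):
        # Accept plain and bgzipped VCFs; skip index files and anything else.
        for suffix in (".vcf.gz", ".vcf"):
            if vcf.name.endswith(suffix):
                sample = vcf.name[: -len(suffix)]
                break
        else:
            continue
        commands.append([
            "vep",
            "--input_file", str(vcf),
            "--output_file", str(out_dir / f"{sample}.vep.vcf"),
            "--vcf",              # write the annotated result as VCF
            "--cache",            # use the locally installed cache
            "--offline",          # never contact the Ensembl servers
            "--assembly", assembly,
            "--fork", str(forks),
            "--force_overwrite",
        ])
    return commands
```

You could collect the inputs with something like `Path("vcfs").iterdir()`, print each command with `shlex.join(cmd)` to check it, and then run it with `subprocess.run(cmd, check=True)`.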
Compared to wANNOVAR (web-based), ANNOVAR (command-line) is a good option to go for.
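For the command-line ANNOVAR route, a rough sketch of how a `table_annovar.pl` call for one VCF might be put together from Python is below. It assumes ANNOVAR is installed, that the hg38 databases have already been downloaded into a `humandb` directory, and that `refGene` is among them; the helper name `build_annovar_command` and the default output prefix are my own choices, not part of ANNOVAR.

```python
from pathlib import Path


def build_annovar_command(vcf, db_dir="humandb", out_prefix=None,
                          protocols=(("refGene", "g"),)):
    """Return a table_annovar.pl argument list for one VCF (hypothetical helper).

    protocols is a sequence of (database name, operation) pairs, where the
    operation is g (gene-based), r (region-based) or f (filter-based).
    """
    vcf = Path(vcf)
    if out_prefix is None:
        # Default prefix: the file name without its .vcf or .vcf.gz ending.
        out_prefix = vcf.name
        for suffix in (".vcf.gz", ".vcf"):
            if out_prefix.endswith(suffix):
                out_prefix = out_prefix[: -len(suffix)]
                break
    names = ",".join(name for name, _ in protocols)
    operations = ",".join(op for _, op in protocols)
    return [
        "table_annovar.pl", str(vcf), str(db_dir),
        "-buildver", "hg38",       # reads were aligned to hg38
        "-out", str(out_prefix),
        "-remove",                 # delete intermediate files
        "-protocol", names,
        "-operation", operations,
        "-nastring", ".",          # placeholder for missing annotations
        "-vcfinput",               # input is VCF, so output is VCF as well
    ]
```

Looping this over the 1000 files and passing each list to `subprocess.run(cmd, check=True)` would avoid the two-files-at-a-time limit of the web version. Any extra databases you add to `protocols` need to be downloaded for hg38 first.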
Thank you both for the suggestions. I will certainly try these options.