We have been doing whole-exome sequencing in our lab for years, and we already have an established bioinformatics pipeline.
We are now planning to rework the annotation step. Two main things need to be done:
- We are planning to annotate the VCF files using Ensembl VEP.
- Additionally, we have several external annotation files, e.g. gnomAD, LOVD, COSMIC, ClinVar, and some less well-known sources (some of these are already included in the VEP cache files).
So, in addition to VEP's own annotations, we want to annotate our VCF files with the annotations from these external files.
What would be the most efficient way to do this? I know that VEP can accept external annotation files if they are in a specific format. Is that the way to go, or is it more efficient/faster to use external tools (bedtools, VCFtools, etc.) to match our variants against the entries in the annotation files?
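To make the VEP option concrete, here is a rough sketch of the kind of call I have in mind, written as a small Python wrapper that only assembles the command line. The file names, short names and INFO fields are placeholders, and I am assuming the comma-separated positional form of `--custom` (file, short name, format, match type, coords flag, then fields); the exact syntax may differ between VEP releases, so it should be checked against the installed version.

```python
import shlex


def build_vep_command(input_vcf, output_vcf, custom_sources, forks=4):
    """Assemble a VEP command line with one --custom option per external file.

    custom_sources is a list of (path, short_name, info_fields) tuples.
    All paths and names used here are hypothetical placeholders.
    """
    cmd = [
        "vep",
        "--input_file", input_vcf,
        "--output_file", output_vcf,
        "--vcf",            # write VCF output
        "--cache",          # use a local cache
        "--offline",        # do not query the Ensembl database
        "--fork", str(forks),
    ]
    for path, short_name, info_fields in custom_sources:
        # Assumed positional layout: file,short_name,format,type,coords,fields...
        spec = ",".join([path, short_name, "vcf", "exact", "0", *info_fields])
        cmd += ["--custom", spec]
    return cmd


# Placeholder external annotation files (bgzipped and tabix-indexed VCFs).
sources = [
    ("clinvar.vcf.gz", "ClinVar", ["CLNSIG", "CLNREVSTAT"]),
    ("lab_annotations.vcf.gz", "LabAnno", ["SCORE"]),
]

cmd = build_vep_command("sample.vcf.gz", "sample.vep.vcf", sources)
print(shlex.join(cmd))
```

The wrapper only prints the command; running it would go through something like `subprocess.run(cmd, check=True)` once the paths point at real files.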
Also, are there any additional steps that would make the whole process more efficient? For example, would it help to split the external annotation files into per-chromosome files?
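By per-chromosome splitting I mean something like the following minimal sketch. It works on lines held in memory purely for illustration (a real annotation file would be streamed and written to one file per chromosome), and the example records are made up.

```python
from collections import defaultdict


def split_vcf_by_chromosome(lines):
    """Group VCF lines by chromosome, repeating the header in every group."""
    header = []
    records = defaultdict(list)
    for line in lines:
        if line.startswith("#"):
            header.append(line)              # meta lines and the column header
        elif line.strip():
            chrom = line.split("\t", 1)[0]   # first column is CHROM
            records[chrom].append(line)
    return {chrom: header + recs for chrom, recs in records.items()}


# Made-up records, only to show the shape of the result.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t12345\t.\tA\tG\t.\t.\tAF=0.01\n",
    "1\t22345\t.\tC\tT\t.\t.\tAF=0.20\n",
    "2\t555\t.\tG\tA\t.\t.\tAF=0.05\n",
]

per_chrom = split_vcf_by_chromosome(example)
for chrom, chunk in per_chrom.items():
    # Each chunk carries the two header lines plus that chromosome's records.
    print(chrom, len(chunk) - 2, "records")
```

I am not sure this gains anything if the annotation files are already bgzipped and tabix-indexed, since the index already gives per-region access, which is part of what I am asking.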
Any input based on your experience with this would be much appreciated.