8.4 years ago
bioinforesearchquestions
Hi all,
Is there a way to annotate existing VCF file with known disease-causing mutations?
Variant annotation pipelines (SnpEff/ANNOVAR) do exactly this - they annotate VCF files by looking up each variant in standard annotation files, such as dbSNP or ClinVar data.
We'd be in a better position to help if you told us where exactly you are facing difficulty.
Dear Ram,
Thanks for your answer! As you mentioned, the following command annotates a VCF: "java -jar SnpSift.jar annotate dbSnp132.vcf variants.vcf > variants_annotated.vcf". So do I need to download the VCFs for dbSNP, 1000 Genomes, ClinVar, etc. from their respective sources? Also, can I annotate with dbSNP and ClinVar together in the same command, or do I need two different commands?
Dear Ram, if possible, could you look into this query: "Filtering multisample VCF based on genotype using SnpSift filter"?
Yes, you will need to download those files - I am not sure if streaming them will work. Also, I recall GATK's page referring to these as resources, any number of which can be used in a single command, so I'd recommend reading GATK's manual on this.
https://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_gatk_tools_walkers_annotator_VariantAnnotator.php
Also, take a look at Denise's VEP pointers.
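On the "same command or two commands" question for SnpSift: the thread does not confirm whether a single SnpSift invocation accepts several databases, so a conservative option is to run the quoted command once per database and feed each output into the next run. Below is a minimal Python sketch that only builds those command lines. The helper name `snpsift_annotate_commands`, the file name `clinvar.vcf`, and the `annotated_stepN.vcf` intermediate names are hypothetical; the command form itself is the one quoted earlier in this thread.

```python
import shlex


def snpsift_annotate_commands(input_vcf, databases, jar="SnpSift.jar"):
    """Build one 'SnpSift annotate' command per database VCF.

    Each command reads the output of the previous one, so the final
    file carries annotations from every database in the list.
    """
    commands = []
    current = input_vcf
    for step, db in enumerate(databases, start=1):
        # Hypothetical naming scheme for the intermediate files.
        out = f"annotated_step{step}.vcf"
        commands.append(
            f"java -jar {shlex.quote(jar)} annotate "
            f"{shlex.quote(db)} {shlex.quote(current)} > {shlex.quote(out)}"
        )
        current = out
    return commands


# clinvar.vcf is an assumed local file name for a downloaded ClinVar VCF.
for cmd in snpsift_annotate_commands("variants.vcf", ["dbSnp132.vcf", "clinvar.vcf"]):
    print(cmd)
```

The printed lines can be pasted into a shell one after another; the last `annotated_stepN.vcf` is the fully annotated file.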
You can use the Variant Effect Predictor (aka VEP) to annotate your VCF. It will map your variants to known disease-causing mutations (if there are any), from COSMIC for example, or to genetic variants from dbSNP 146 (for release 84; Ensembl 85 will have dbSNP 147 for human according to the Ensembl blog post). It will also give you the clinical significance from ClinVar. There is a VEP script if you know Perl, but if you would rather use a language other than Perl, have a look at the variation endpoints in the Ensembl REST API.
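For the REST route, here is a rough Python sketch of how VCF lines might be sent to the Ensembl REST VEP endpoint (`POST /vep/:species/region`). The function names are hypothetical, and the response fields read in `clinical_significance` (`colocated_variants`, `id`, `clin_sig`) are my assumption about the JSON layout, so check them against the current Ensembl REST documentation before relying on them.

```python
import json
import urllib.request

ENSEMBL_REST = "https://rest.ensembl.org"


def vcf_line_to_vep_variant(line):
    """Convert one tab-separated VCF data line into the space-separated
    'CHROM POS ID REF ALT . . .' string used in the POST body."""
    chrom, pos, var_id, ref, alt = line.rstrip("\n").split("\t")[:5]
    return f"{chrom} {pos} {var_id} {ref} {alt} . . ."


def build_vep_request(vcf_lines, species="human"):
    """Build (but do not send) a POST request for the VEP region endpoint."""
    variants = [
        vcf_line_to_vep_variant(line)
        for line in vcf_lines
        if line.strip() and not line.startswith("#")  # skip header lines
    ]
    body = json.dumps({"variants": variants}).encode("utf-8")
    return urllib.request.Request(
        f"{ENSEMBL_REST}/vep/{species}/region",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def clinical_significance(vep_record):
    """Return (variant id, clin_sig) pairs for co-located known variants.

    Assumes each record may carry a 'colocated_variants' list whose
    entries optionally include a 'clin_sig' field.
    """
    return [
        (known.get("id"), known["clin_sig"])
        for known in vep_record.get("colocated_variants", [])
        if "clin_sig" in known
    ]


# Sending the request needs network access, so it is left commented out:
# with open("variants.vcf") as handle:
#     request = build_vep_request(handle)
# with urllib.request.urlopen(request) as response:
#     for record in json.load(response):
#         print(record.get("input"), clinical_significance(record))
```

The REST service limits how many variants one POST may carry, so a large VCF would need to be sent in batches; for whole files the VEP script is probably the more practical choice.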