Hi,
I followed the ANNOVAR tutorial with the default datasets (avsnp147, ExAC, and dbnsfp30a). The tutorial can be found here: https://annovar.openbioinformatics.org/en/latest/user-guide/startup/
The resulting VCF had the expected format and annotations, including CADD scores. I then repeated the run with the gnomad211_exome, avsnp150, and dbnsfp42c datasets instead of those above. This time the resulting VCF contains every expected annotation except the CADD scores. These datasets were downloaded following the ANNOVAR guidelines.
The header of the VCF doesn't even include the following lines:
##INFO=<ID=CADD_raw,Number=.,Type=Float,Description="CADD_raw annotation provided by ANNOVAR">
##INFO=<ID=CADD_phred,Number=.,Type=Float,Description="CADD_phred annotation provided by ANNOVAR">
Can someone tell me why this is happening? Do any of the datasets used in the second run not include CADD scores?
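For anyone who wants to reproduce the header check, here is a minimal sketch of how the INFO fields declared in the output VCF can be listed. The helper name `vcf_info_ids` is made up for illustration, and the output file name in the usage comment is an assumption based on the `-out myanno.Equal` prefix and the hg19 build.

```shell
# Hypothetical helper: print the ID of every ##INFO line in a VCF read from stdin.
vcf_info_ids() {
    # Keep only INFO declarations, then strip the prefix and everything after the ID.
    grep '^##INFO=<ID=' | sed -e 's/^##INFO=<ID=//' -e 's/,.*$//'
}

# Usage (assumed output file name, adjust to your own run):
#   vcf_info_ids < myanno.Equal.hg19_multianno.vcf | grep -i cadd
```

If the final `grep` prints nothing, no CADD field was declared in the header at all, which is what I see in the second run.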
Below is the command I used:
perl ./annovar/table_annovar.pl \
in.vcf \
humandb/ \
-buildver hg19 \
-out myanno.Equal \
-remove \
-protocol refGene,cytoBand,gnomad211_exome,avsnp150,dbnsfp42c \
-operation g,r,f,f,f \
-nastring . \
-vcfinput \
-polish
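One way to see which scores a downloaded filter database actually provides is to look at the column names in its first line. This is only a sketch: the helper name `table_columns` is made up, and it assumes the database is a tab-delimited text file whose first line is a header, stored under a name like `humandb/hg19_dbnsfp42c.txt`.

```shell
# Hypothetical helper: print the column names from the first line of a
# tab-delimited table read from stdin, one name per line.
table_columns() {
    head -n 1 | tr '\t' '\n'
}

# Usage (assumed file name, adjust to what is in your humandb/ directory):
#   table_columns < humandb/hg19_dbnsfp42c.txt | grep -i cadd
```

If nothing matches, that database has no CADD columns, so no CADD fields can end up in the annotated VCF.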
Thanks in advance.
Really great answer. I thought one of the databases didn't include the CADD scores, but I didn't know which one was the problem.
Thanks a lot.
No problem. Knowing that CADD scores can come from dbNSFP is also something you pick up from experience. Anyway, please accept my answer to provide closure to the question.