Difference between gene-based, region-based and filter-based annotation in ANNOVAR
6.7 years ago
vivekruhela ▴ 20

Hi everyone. I am working on an NGS pipeline and plan to use ANNOVAR to annotate variants. ANNOVAR offers three types of annotation: gene-based, region-based, and filter-based. I have read their definitions, but I am not sure which one to use. I also looked at an example in the ANNOVAR documentation that calls many databases and uses a different annotation scheme for different databases. Why?

Can anybody explain the difference between these schemes, their significance, and how to use them?

Thanks.

next-gen R sequence gene annotation • 2.9k views
6.7 years ago

Ignore the names for now: first decide which annotations you want, then work out whether each one is a region-, gene-, or filter-based annotation. You can then run several different types in a single command. Kai Wang's documentation on the ANNOVAR website is in fact pretty comprehensive compared to that of other programs.
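As a rough illustration of sorting annotations into the three types, here is a minimal shell sketch. The function names `operation_for` and `build_operations` are hypothetical helpers, not part of ANNOVAR, and the mapping only covers database names that appear in this thread. It follows the ANNOVAR documentation's distinction: gene-based annotation relates a variant to genes, region-based annotation checks overlap with genomic intervals, and filter-based annotation looks up the exact variant in a database.

```shell
# Hypothetical helper (not part of ANNOVAR): return the operation letter
# for a database name used in this thread.
operation_for() {
  case "$1" in
    refGene)
      echo g ;;   # gene-based: effect of the variant on genes/transcripts
    cytoBand|genomicSuperDups)
      echo r ;;   # region-based: variant overlaps a genomic interval
    gme|esp6500siv2_all|exac03|dbnsfp30a|avsnp147|snp138|cosmic70|clinvar_20161128)
      echo f ;;   # filter-based: exact variant match against database records
    *)
      echo "Unknown database: $1" >&2
      return 1 ;;
  esac
}

# Hypothetical helper: turn a comma-separated protocol list into the
# matching comma-separated operation list.
build_operations() {
  local protocols="$1" ops="" db op
  local IFS=','
  for db in $protocols; do
    op=$(operation_for "$db") || return 1
    ops="${ops:+$ops,}$op"
  done
  echo "$ops"
}

build_operations "refGene,cytoBand,exac03"
```

The printed string is what would be passed to -operation for that protocol list; any database not in the lookup makes the helper fail rather than guess.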

Here is code that I re-use a lot, including the commands that I use to download the databases. The type of annotation applied to each database is specified with the -protocol and -operation parameters (g, r, and f stand for gene-, region-, and filter-based).

#Build the annovar databases
#RefSeq genes
perl /Programs/annovar/annotate_variation.pl -buildver hg19 -downdb \
  -webfrom annovar refGene /Programs/annovar/humandb/


#Cytobanding info
perl /Programs/annovar/annotate_variation.pl -buildver hg19 \
  -downdb cytoBand /Programs/annovar/humandb/


#Segmental duplications
perl /Programs/annovar/annotate_variation.pl -buildver hg19 \
  -downdb genomicSuperDups /Programs/annovar/humandb/


#Allele frequencies
    #NHLBI-ESP variant frequencies
    perl /Programs/annovar/annotate_variation.pl -buildver hg19 \
      -downdb -webfrom annovar esp6500siv2_all /Programs/annovar/humandb/


    #1000 Genomes allele frequencies
    perl /Programs/annovar/annotate_variation.pl -buildver hg19 \
      -downdb -webfrom annovar 1000g2015aug /Programs/annovar/humandb/

    #Older 1000 Genomes release (disabled)
    #perl /Programs/annovar/annotate_variation.pl -buildver hg19 \
    #  -downdb -webfrom annovar 1000g2014oct /Programs/annovar/humandb/


    #ExAC allele frequencies
    perl /Programs/annovar/annotate_variation.pl -buildver hg19 \
      -downdb -webfrom annovar exac03 /Programs/annovar/humandb/

    #Greater Middle East (GME) Variome allele frequencies
    perl /Programs/annovar/annotate_variation.pl -buildver hg19 \
      -downdb -webfrom annovar gme /Programs/annovar/humandb/


#dbSNP
perl /Programs/annovar/annotate_variation.pl -buildver hg19 \
  -downdb -webfrom annovar snp138 /Programs/annovar/humandb/


#dbSNP with allelic splitting and left-normalisation
perl /Programs/annovar/annotate_variation.pl -buildver hg19 \
  -downdb -webfrom annovar avsnp147 /Programs/annovar/humandb/


#SIFT, PolyPhen, and other scores
#Older score set (disabled)
#perl /Programs/annovar/annotate_variation.pl -buildver hg19 \
#  -downdb -webfrom annovar ljb26_all /Programs/annovar/humandb/

perl /Programs/annovar/annotate_variation.pl -buildver hg19 \
  -downdb -webfrom annovar dbnsfp30a /Programs/annovar/humandb/


#COSMIC
perl /Programs/annovar/annotate_variation.pl -buildver hg19 \
  -downdb -webfrom annovar cosmic70 /Programs/annovar/humandb/


#ClinVar
perl /Programs/annovar/annotate_variation.pl -buildver hg19 \
  -downdb -webfrom annovar clinvar_20161128 /Programs/annovar/humandb/


#Annotate with RefSeq genes, cytoband, dbSNP147, et cetera
#   Usage: table_annovar.pl [arguments] <query-file> <database-location>
#   --protocol <string>, comma-delimited string specifying database protocol
#   --operation <string>, comma-delimited string specifying type of operation
#   --outfile <string>, output file name prefix
#   --buildver <string>, genome build version (default: hg18)
#   --remove, remove all temporary files
#   --otherinfo, print out otherinfo (information after fifth column in queryfile)
#   --onetranscript, print out only one transcript for exonic variants (default: all transcripts)
#   --nastring <string>, string to display when a score is not available (default: null)
#   --csvout, generate comma-delimited CSV file (default: tab-delimited txt file)
#The -protocol and -operation lists are matched by position:
#refGene -> g (gene-based), cytoBand -> r (region-based), the rest -> f (filter-based)
perl /Programs/annovar/table_annovar.pl MyVariants.ann /Programs/annovar/humandb/ \
  -buildver hg19 -remove -otherinfo \
  -protocol refGene,cytoBand,gme,esp6500siv2_all,exac03,dbnsfp30a,avsnp147,cosmic70,clinvar_20161128 \
  -operation g,r,f,f,f,f,f,f,f -nastring "NA" -csvout
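
Because each entry in -protocol is paired with the entry at the same position in -operation, it can help to check that the two lists have the same length before launching a long run. The following is only a sketch: `count_fields` and `check_protocol_operation` are hypothetical helper names, not ANNOVAR functions, and the check does not verify that each letter is the right type for its database.

```shell
# Hypothetical helper: count the comma-separated entries in a string.
count_fields() {
  printf '%s\n' "$1" | awk -F',' '{ print NF }'
}

# Hypothetical helper: fail if the protocol and operation lists
# do not contain the same number of entries.
check_protocol_operation() {
  local n_protocol n_operation
  n_protocol=$(count_fields "$1")
  n_operation=$(count_fields "$2")
  if [ "$n_protocol" -ne "$n_operation" ]; then
    echo "Mismatch: $n_protocol protocols but $n_operation operations" >&2
    return 1
  fi
}

protocol="refGene,cytoBand,gme,esp6500siv2_all,exac03,dbnsfp30a,avsnp147,cosmic70,clinvar_20161128"
operation="g,r,f,f,f,f,f,f,f"

if check_protocol_operation "$protocol" "$operation"; then
  echo "Lists match; table_annovar.pl can be run with these values"
fi
```

The two variables can then be passed as -protocol "$protocol" and -operation "$operation" in the table_annovar.pl command above.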

Thanks, sir. I'll read the documentation again, and thanks for the good example. Are the databases you invoked through ANNOVAR intended for WES data or for something else? Also, how can I obtain functional predictions for individual mutations outside of ANNOVAR? I could not find a separate repository for MetaSVM or MetaLR. I have the same question about determining significant somatic mutations.

Thanks again.
