How To Understand Some Annotated Terms Made By Annovar
10.8 years ago
zengtony743 ▴ 80

Hi, I have a question about annotation by ANNOVAR. It is a bit confusing for me when I need to identify variants that have the potential to change the function of a gene or protein.

ANNOVAR produces two tables: an exonic_variant_function table and a variant_function table. As I understand it, the first table lists all variants located in exonic coding regions, which could change gene function, while the second table lists all variants located in regions close to an exon or an exon/intron boundary.

In the second table (variant_function), some variants are labelled "exonic", which is explained as "here refers only to coding exonic portion" or "variant overlaps a coding exon". Why are these variants reported in the second table but not in the exonic_variant_function table? And how does "exonic" here differ from "silent mutation"?

Also, "splicing" is explained as a variant within 2 bp of an exon/intron boundary by default, and "exonic splicing" as a variant within an exon but close to an exon/intron boundary. Does that mean an "exonic splicing" variant is less likely to alter the transcript than a "splicing" variant? This is not very clear to me.

It is hard to decide whether these "exonic" or "exonic splicing" variants can be ignored without worrying about losing a real mutation.

annovar splicing • 5.8k views
10.8 years ago

The "variant_function" file annotates variants by the genomic feature they fall in, for example exonic, intronic, UTR, upstream/downstream or intergenic. This file contains all the variants. The "exonic_variant_function" file contains the exonic variants from the first file, annotated according to their effect, for example synonymous, nonsynonymous or frameshift. It contains all the exonic variants.
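To see which genomic features your own "variant_function" file actually covers, you could tally the region labels. This is only a rough sketch: it assumes the file is tab-separated with the region label in the first column and that combined labels are joined with ";" or ",". The rows and gene names below are made up for illustration, and `count_regions` is a hypothetical helper, not part of ANNOVAR.

```python
import io
import re
from collections import Counter

def count_regions(handle):
    """Tally region labels from column 1 of a variant_function-style file.

    Assumption: tab-separated lines, region label first, combined labels
    (e.g. exonic plus splicing) joined by ';' or ','.
    """
    counts = Counter()
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        for label in re.split(r"[;,]", fields[0]):
            counts[label] += 1
    return counts

# Made-up rows, for illustration only
sample = (
    "exonic\tGENE1\tchr1\t1000\t1000\tA\tG\n"
    "intronic\tGENE1\tchr1\t2000\t2000\tC\tT\n"
    "exonic;splicing\tGENE2\tchr2\t3000\t3000\tG\tA\n"
    "intergenic\tGENE2(dist=10),GENE3(dist=20)\tchr2\t9000\t9000\tT\tC\n"
)

counts = count_regions(io.StringIO(sample))
for label, n in sorted(counts.items()):
    print(label, n)
```

On a real file you would pass `open("your_file.variant_function")` instead of the in-memory sample. With exome data, most labels will likely be exonic, intronic, splicing or UTR rather than intergenic.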

You may be working with exome sequencing data and so may not be seeing variants from intergenic regions, which is probably why you wrote that the second table lists variants located close to an exon or exon/intron boundary. With whole-genome sequencing, the "variant_function" file will be much larger than the "exonic_variant_function" file.

I don't follow your question about how "exonic" differs from "silent mutation". In any case, all the variants in the "exonic_variant_function" file should also be present in the "variant_function" file.

According to the author, the "splicing" annotation is debatable. He explains it as follows: ""splicing" in ANNOVAR is defined as variant that is within 2-bp away from an exon/intron boundary by default, but the threshold can be changed by the --splicing_threshold argument. Before Feb 2013, if "exonic,splicing" is shown, it means that this is a variant within exon but close to exon/intron boundary; this behavior is due to historical reason, when a user requested that exonic variants near splicing sites be annotated with splicing as well. However, I continue to get user emails complaining about this behavior despite my best efforts to put explanation in the ANNOVAR website with details. Therefore, starting from Feb 2013, "splicing" only refers to the 2bp in the intron that is close to an exon, and if you want to have the same behavior as before, add -exonicsplicing argument." (From the ANNOVAR website.)
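To make the quoted definition concrete, here is a toy sketch for a single exon. It is not ANNOVAR's code: `classify` is a hypothetical function, coordinates are assumed 1-based and inclusive, and strand, UTRs, multiple transcripts and the first/last exon are all ignored. Anything that is neither exonic nor near the boundary is simply labelled "other".

```python
def classify(pos, exon_start, exon_end, threshold=2, exonic_splicing=False):
    """Toy model of the quoted 'splicing' definition for one exon.

    threshold mirrors the default 2 bp window; exonic_splicing=True
    mimics the older behaviour where exonic bases near the boundary
    are also tagged as splicing.
    """
    if exon_start <= pos <= exon_end:
        # Inside the exon: near the boundary if within `threshold` bases of either end
        near_boundary = (pos - exon_start < threshold) or (exon_end - pos < threshold)
        if exonic_splicing and near_boundary:
            return "exonic;splicing"
        return "exonic"
    # Outside the exon: distance to the nearest exon edge
    dist = exon_start - pos if pos < exon_start else pos - exon_end
    if dist <= threshold:
        return "splicing"
    return "other"

# Exon spanning positions 100..200 (made-up coordinates)
for pos in (97, 98, 99, 100, 101, 102, 200, 201, 203):
    print(pos, classify(pos, 100, 200), classify(pos, 100, 200, exonic_splicing=True))
```

In this toy model, the two exon-side bases next to the boundary are plain "exonic" by default and only become "exonic;splicing" when the older behaviour is switched on, while the two intron-side bases are "splicing" either way.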

I would recommend trying both -splicing_threshold and -exonicsplicing.


Ashuto, thank you for your explanation, that's really helpful :)
