Gene-based annotation: database or live computation?
7.8 years ago
sacha ★ 2.4k

There are several databases for annotating variants, such as dbNSFP, dbSNP, and COSMIC. I was looking for a gene-based annotation database that tells me the effect of a variant, for example: intron, exon, splice_site_donor, missense... But I didn't find any database like that. Those fields depend on the gene/transcript set used, such as refGene, UCSC genes, or ENCODE, and storing every possibility would produce a huge database.

So I assume annotators like UCSC, VEP, or SnpEff compute those fields during the annotation process. Something like:

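    def consequence(variant_pos, genes):
        """Naive linear scan. Each gene is a dict with start, end, and lists of (start, end) exons and introns."""
        for gene in genes:
            if gene["start"] <= variant_pos <= gene["end"]:
                if any(s <= variant_pos <= e for s, e in gene["exons"]):
                    return "exon"
                if any(s <= variant_pos <= e for s, e in gene["introns"]):
                    return "intron"
        return None  # no overlapping gene found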

So, what's the strategy for producing gene annotation with those fields: database or live computation?

annotation • 1.5k views

I rather doubt there's a "for gene in refgene" sort of loop. More likely, the variant region is extended by some reasonable flank and that region is then queried against an interval tree or similar structure. The results can then be iterated over. Otherwise things would get really slow.
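To make that kind of lookup concrete, here is a minimal Python sketch. It uses a fixed-width bin index as a simple stand-in for an interval tree, and it is not how any particular annotator is implemented. The class name `BinnedIndex`, the 100 kb bin width, the 2 kb flank, and the gene coordinates are all made up for illustration.

```python
from collections import defaultdict


class BinnedIndex:
    """Hypothetical bin index over closed intervals; a stand-in for an interval tree."""

    def __init__(self, bin_size=100_000):  # assumed bin width in bases
        self.bin_size = bin_size
        self.bins = defaultdict(list)

    def add(self, chrom, start, end, name):
        # Register the feature in every bin it touches.
        for b in range(start // self.bin_size, end // self.bin_size + 1):
            self.bins[(chrom, b)].append((start, end, name))

    def query(self, chrom, start, end):
        # Only features stored in the bins covered by the query are examined.
        hits = set()
        for b in range(start // self.bin_size, end // self.bin_size + 1):
            for s, e, name in self.bins.get((chrom, b), ()):
                if s <= end and start <= e:  # closed-interval overlap test
                    hits.add((s, e, name))
        return sorted(hits)


def overlapping_genes(index, chrom, pos, flank=2_000):
    """Flank the variant position (assumed flank size) and return overlapping gene names."""
    return [name for _, _, name in index.query(chrom, max(1, pos - flank), pos + flank)]


# Toy data: coordinates and gene names are invented.
index = BinnedIndex()
index.add("chr1", 1_000, 5_000, "GENE_A")
index.add("chr1", 250_000, 260_000, "GENE_B")

print(overlapping_genes(index, "chr1", 6_000))    # variant just downstream of GENE_A
print(overlapping_genes(index, "chr1", 100_000))  # variant far from both genes
```

The point is that each variant only touches the handful of features stored near it, instead of scanning every gene.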


Thanks for your reply. That was just an example. My question is whether it uses a database or computes the result on the fly.


At least for snpEff, the methods section mentions the following:

This can be performed once the user has downloaded or built the database. The program loads the binary database and builds a data structure called “interval forest” in order to perform an efficient interval search. Input files are parsed and each variant queries the data structures to find intersecting genomic annotations. All intersecting genomic regions are reported and whenever these regions include an exon, the coding effect of the variant is calculated.

That indicates to me that it's doing the actual annotation live.
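As a rough illustration of what calculating the coding effect live could look like for a single-base substitution inside a coding exon, here is a minimal Python sketch. It is not SnpEff's code: the function name `coding_effect`, the toy coding sequence, and the 0-based offset convention are assumptions, and it ignores strand, splicing, and indels. The codon table is the standard genetic code, and the returned labels follow Sequence Ontology term names.

```python
from itertools import product

# Standard genetic code, laid out in the usual TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def coding_effect(cds, cds_pos, alt):
    """Classify a substitution in a coding sequence.

    cds     -- coding sequence on the transcript strand (hypothetical input)
    cds_pos -- 0-based offset of the substituted base within cds (assumed convention)
    alt     -- alternate base
    """
    offset = cds_pos % 3
    codon_start = cds_pos - offset
    ref_codon = cds[codon_start:codon_start + 3]
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous_variant"
    if alt_aa == "*":
        return "stop_gained"
    if ref_aa == "*":
        return "stop_lost"
    return "missense_variant"


# Toy coding sequence: ATG GCT TGG TAA (Met Ala Trp Stop).
cds = "ATGGCTTGGTAA"
print(coding_effect(cds, 5, "C"))  # GCT -> GCC
print(coding_effect(cds, 3, "A"))  # GCT -> ACT
print(coding_effect(cds, 8, "A"))  # TGG -> TGA
```

Because the result depends on the transcript sequence and the specific alternate allele, it is cheap to compute per variant but expensive to precompute for every possible variant and transcript, which fits the live-computation reading above.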
