Database Of Essential Genes
12.6 years ago
pegahtv ▴ 140

I was searching for a list of essential genes in human. I found http://www.essentialgene.org/, which has 118 human genes in its database. However, I am sure that the number of essential genes in human is much more than 118. Does anyone know of another database of essential genes? Also, if you know of any for organisms other than human, I would be happy to hear about them. Thanks.

genes database • 12k views

It's worth pointing out that, in general, we can only infer which genes are essential for humans by comparative genomics with model organisms. For obvious ethical reasons, you can't knock out a gene in a human subject to see whether the effect is lethal :)


Hi Neilfws, is it possible to see this effect using simulation? I mean knocking out a gene in silico and seeing the effect.


I suppose you could make inferences, knowing something about human metabolism and looking at the metabolic networks in which genes are involved. I'd imagine, for instance, that a functional cytochrome oxidase complex is essential for humans. However, it would still be inference, not observed experimental "fact".
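To make the idea of an in silico knockout concrete, here is a minimal toy sketch. It is not taken from any of the tools or papers in this thread: the network, the gene names and the "target" metabolite are all made up for illustration, and real metabolic models are far larger and usually analysed with more sophisticated methods. Each gene is removed in turn, and a gene is called essential if the target can no longer be produced from the available nutrients.

```python
# Toy in silico knockout on a made-up metabolic network.
# Every gene, metabolite and reaction below is hypothetical.

# Each reaction: (catalysing gene, required substrates, products)
REACTIONS = [
    ("geneA", {"glucose"}, {"pyruvate"}),
    ("geneB", {"pyruvate"}, {"acetyl_coa"}),
    ("geneC", {"pyruvate"}, {"acetyl_coa"}),  # redundant with geneB
    ("geneD", {"acetyl_coa", "oxygen"}, {"atp"}),
]
NUTRIENTS = {"glucose", "oxygen"}  # metabolites supplied from outside
TARGET = "atp"                     # stand-in for "the cell survives"


def producible(reactions, nutrients):
    """Return every metabolite reachable from the nutrients.

    Repeatedly fires any reaction whose substrates are all available
    until no new metabolite appears.
    """
    available = set(nutrients)
    changed = True
    while changed:
        changed = False
        for _gene, substrates, products in reactions:
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return available


def essential_genes(reactions, nutrients, target):
    """Knock out each gene in turn; keep those whose loss blocks the target."""
    genes = sorted({gene for gene, _subs, _prods in reactions})
    essential = []
    for gene in genes:
        remaining = [r for r in reactions if r[0] != gene]
        if target not in producible(remaining, nutrients):
            essential.append(gene)
    return essential


print(essential_genes(REACTIONS, NUTRIENTS, TARGET))  # ['geneA', 'geneD']
```

Note that geneB and geneC are each reported as non-essential because the other can substitute, which is the redundancy problem in miniature. As said above, the result is still an inference from the model, not an observation.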


Some interesting modelling papers online! Just been having a quick search - http://bioinformatics.oxfordjournals.org/content/26/4/536.full and http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.0020072 for example.


Thanks a lot ... the paper helps ...


True, genes cannot be knocked out in humans, but 1000G data has identified many loss-of-function variants (see my response below) that can allow us to uncover the roles of these genes, once deep phenotyping has been done, and whether they are "essential" or not.


That's an interesting idea, but note that you can only detect non-essential genes that way, not the opposite.


I know. If enough of this analysis were undertaken, one could arrive at a list of (highly confident) essential genes: those, e.g., never seen in an LOF analysis.
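As a rough sketch of that bookkeeping (the gene names, sample IDs and input layout are hypothetical, not a real 1000G file format): collect every gene seen with loss of function at both copies in at least one individual, then subtract that set from the full gene list. What remains is only a list of candidates, since a gene may be missing from the LOF set simply because too few people have been sequenced.

```python
# Sketch: candidate essential genes = all genes minus genes observed
# with loss of function (LOF) at both copies in some individual.
# Gene names, sample IDs and the tuple layout are hypothetical.

def lof_tolerant_genes(observations):
    """Genes with LOF at both copies in at least one individual.

    observations: iterable of (individual_id, gene, both_copies_lost).
    """
    return {gene for _individual, gene, both_lost in observations if both_lost}


def candidate_essential_genes(all_genes, observations):
    """Genes never seen with LOF at both copies (candidates only)."""
    return sorted(set(all_genes) - lof_tolerant_genes(observations))


all_genes = ["GENE1", "GENE2", "GENE3"]
observations = [
    ("sample1", "GENE2", True),   # both copies lost, carrier is alive
    ("sample2", "GENE3", False),  # one copy lost: does not rule GENE3 out
]

print(candidate_essential_genes(all_genes, observations))  # ['GENE1', 'GENE3']
```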


Very interesting discussion. I really learned a lot. And @Steve Moss, thanks for the link to the paper.

12.6 years ago
Gjain 5.8k

Hi, here are some other references for human, worm and Arabidopsis.

  • CEG (Cluster of Essential Genes) is a database containing clusters of orthologous essential genes, developed by the CEFG Group at UESTC. The original data used to generate CEG are derived from the DEG database, which was published in NAR in 2004 and 2009. Unlike DEG, the CEG database stores essential genes as orthologous groups rather than as single genes. The current version contains 16 species:

    • Bacillus subtilis 168
    • Staphylococcus aureus N315
    • Vibrio cholerae N16961
    • Escherichia coli MG1655
    • Haemophilus influenzae Rd KW20
    • Mycoplasma genitalium G37
    • Streptococcus pneumoniae
    • Helicobacter pylori 26695
    • Mycobacterium tuberculosis H37Rv
    • Salmonella typhimurium LT2
    • Francisella novicida U112
    • Acinetobacter baylyi ADP1
    • Mycoplasma pulmonis UAB CTIP
    • Pseudomonas aeruginosa UCBPP-PA14
    • Salmonella enterica serovar Typhi
    • Staphylococcus aureus NCTC 8325
  • Understanding the biology of C. elegans relies on identification and analysis of essential genes, genes required for growth to a fertile adult. Approaches for identifying essential genes include several types of classical forward genetic screens, genome-wide RNA interference screens and systematic targeted gene knockout. Based on most estimates made from screening results thus far, from 15–30% of C. elegans genes appear to be essential. Genetic redundancy masks some essential functions and pleiotropy of many essential genes poses a challenge for a full understanding of their functions. Temperature sensitive mutations are valuable tools for studies of essential genes, but our ability to analyze essential genes would benefit from development of new tools for conditional inactivation or activation of specific genes.

  • Essential Genes in Arabidopsis Seed Development: This project deals with genes that exhibit a seed phenotype when disrupted by a loss-of-function mutation. The updated database (December, 2010) includes 481 genes and 888 mutants. More than 60% of these mutants have been analyzed in the Meinke laboratory at Oklahoma State University. Recent additions not included in the database are listed at the Supplemental Gene Dataset link on the Access Page.

12.6 years ago

Genes subject to LOF (loss of function) may allow you to infer genes that are not necessary to reach adulthood. The 1000 Genomes project has allowed LOF genes to be found. Unfortunately, there is little phenotypic information about the 1000G subjects. Perhaps what appears as a healthy adult with a given gene deleted or without function at both copies, indeed has poor vision or poor sperm quality - both minor phenotypes that would not seem out of the ordinary in a population of individuals (many people wear glasses or are childless, for example). This gets at the question of what is truly essential.

So, while I have not given you a source where you can find a ready-made list of genes, I think the LOF genes from 1000G provide some material for real thought on this topic.

ADD COMMENT
0
Entering edit mode

This is very interesting. I have to read more papers on it now. Thanks for the information.


Thanks for the information. Very interesting. I will read about it.

12.6 years ago

Have you checked out the CEGMA pipeline from Ian Korf's lab at UC Davis?

It is more of a tool for testing the "completeness" of genome sequencing projects, but does so by testing for the presence of core genes. There is a cool paper about the pipeline here, and a subsequent study it was used on here and here. They identify 458 core proteins across a wide range of taxa, of which 248 are the most highly conserved and can thus be found in even lower coverage (~2X) genomes.
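The final "completeness" figure comes down to simple counting, which can be sketched as below. This is only an illustration of the idea, not CEGMA's own code or output format: CEGMA does the hard part of actually finding the core genes in an assembly, and the identifiers here are placeholder labels rather than real core gene IDs.

```python
# Sketch: completeness as the percentage of a core gene set that was found.
# The identifiers are placeholder labels, not real core gene IDs.

def completeness(core_genes, found_genes):
    """Return (percent of core genes found, sorted list of missing ones)."""
    core = set(core_genes)
    if not core:
        raise ValueError("core gene set is empty")
    found = core & set(found_genes)  # ignore hits outside the core set
    percent = 100.0 * len(found) / len(core)
    return percent, sorted(core - found)


core_genes = ["core1", "core2", "core3", "core4"]
found_in_assembly = ["core1", "core3", "core4", "some_other_gene"]

percent, missing = completeness(core_genes, found_in_assembly)
print(percent, missing)  # 75.0 ['core2']
```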

There is also the COG (Clusters of Orthologous Groups of proteins) database at NCBI (also containing the Eukaryote specific clusters or KOGs - which were used as the basis for the CEGMA development, along with some COGs). Papers for the COG database are here:

Check out the Conserved Domain Database also at NCBI and the KOG browser at JGI.
