Is There A Database Of Known DNA-Binding Proteins?
7
4
12.7 years ago

Hi,

Is anyone aware of a database of DNA binding proteins?

D.

dna protein database • 7.1k views
4
12.7 years ago
Gareth Palidwor ★ 1.6k

DNA binding is identified by the Gene Ontology annotation GO:0003677. Many protein databases carry GO annotations that you can filter on.
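As a rough illustration of that kind of filtering, here is a minimal sketch that scans annotation lines in the tab-separated GAF layout used for Gene Ontology annotation files (object ID in column 2, qualifier in column 4, GO ID in column 5) and collects the entries annotated with GO:0003677. The function name and the sample records are invented for the example; a real run would read a downloaded annotation file instead. Note that an exact match like this misses proteins annotated only with more specific child terms, so treat it as a starting point.

```python
DNA_BINDING = "GO:0003677"  # the GO term named in the answer above


def dna_binding_ids(gaf_lines, go_id=DNA_BINDING):
    """Return the set of object IDs annotated with go_id.

    Assumes GAF-style tab-separated lines: column 2 is the object ID,
    column 4 the qualifier, column 5 the GO ID. Lines starting with '!'
    are header comments. Annotations qualified with NOT are skipped.
    """
    hits = set()
    for line in gaf_lines:
        if line.startswith("!") or not line.strip():
            continue  # header comment or blank line
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 5:
            continue  # malformed line, ignore
        if "NOT" in cols[3].split("|"):
            continue  # negated annotation
        if cols[4] == go_id:
            hits.add(cols[1])
    return hits


# Invented records for illustration only (not real accessions).
sample = [
    "!gaf-version: 2.2",
    "UniProtKB\tTESTA\tGENEA\tenables\tGO:0003677\tPMID:0\tIDA",
    "UniProtKB\tTESTB\tGENEB\tenables\tGO:0005515\tPMID:0\tIPI",
    "UniProtKB\tTESTC\tGENEC\tNOT|enables\tGO:0003677\tPMID:0\tIDA",
    "UniProtKB\tTESTD\tGENED\tenables\tGO:0003677\tPMID:0\tIEA",
]

print(sorted(dna_binding_ids(sample)))
```

With a real file you would pass the open file handle in place of `sample`, since the function only needs an iterable of lines.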

2
12.7 years ago
Chris ★ 1.6k

Swiss-Prot, or PDB if you need structures.

1
12.6 years ago

My source is the FANTOM consortium, which lists some 2000 human transcription factors (TFs). See Table S1, a list of human TFs, in Ravasi et al. (2010, Cell 140: 744-752). From the website: "FANTOM has developed and expanded over time to encompass the fields of transcriptome analysis. The object of the project is moving steadily up the layers in the system of life, progressing thus from an understanding of the ‘elements’ - the transcripts - to an understanding of the ‘system’ - the transcriptional regulatory network."

0
12.7 years ago
razor ▴ 190

Do you also need the possible sites they recognize? Also, do you want all available sequences or only those from one given species?

0

In the future, it is good practice to put questions for the original poster in the comments section under the question rather than in an answer, as you have done here.

0
12.6 years ago
Tyler Davis ▴ 20

ProteinLounge offers a feature called the Protein Interaction Database (http://www.proteinlounge.com/Database/Databases.aspx), which lists binding sites for some proteins.

0
12.6 years ago

TRANSFAC is pretty good, and not necessarily covered by UniProt:

http://www.gene-regulation.com/pub/databases.html

For yeast, the YEASTRACT database is good, capturing ChIP-seq data:

http://www.yeastract.com/

0
7.0 years ago
moldach ▴ 130

DNABP is a database/manuscript from late 2016 describing a machine learning method (Random Forest) that identifies DNA-binding proteins de novo using only sequence information: 1) the conservation of physicochemical properties of the amino acids, and 2) the binding propensity of DNA-binding residues.

They divided 14,262 proteins from UniProt into those they were confident were DNA-binding and those they were confident were non-DNA-binding, and used these as their training data set; you can download this information from supplement S1. The supplements also give the DNA-binding and non-binding UniProt accessions used as the test set for their model. Although the method achieved high accuracy (~83-90%), the web server accepts only a single sequence at a time, so it's not really suited to classifying a large number of de novo DNA-binding proteins.

If anyone knows of a better/more-comprehensive resource available today I'd be happy if they could share it.
