Hi, could somebody point me to a database that will allow me to identify proteins that contain a specific domain (say the PX domain)? I've looked at SMART and it seems to do the trick, but I can't see an easy way to download the results.
I'd use BioMart. For example, once you determine that the PX domain has an InterPro ID of IPR001683, it's pretty easy to set up the query. You can see how I set it up using this link: http://bit.ly/aM1Ypw
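For anyone who would rather script this than click through the web form, here is a rough sketch of the same kind of query built as BioMart XML in Python. Only the InterPro ID comes from the post above. The Ensembl endpoint, the `hsapiens_gene_ensembl` dataset, the `interpro` filter name and the two attribute names are my assumptions, so check them against the mart you actually use.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import quoteattr

# Assumed endpoint: the thread only gives a shortened link, not the mart URL.
BIOMART_URL = "https://www.ensembl.org/biomart/martservice"


def build_biomart_query(interpro_id,
                        dataset="hsapiens_gene_ensembl",
                        attributes=("ensembl_gene_id", "external_gene_name")):
    """Return BioMart query XML filtering a dataset by one InterPro ID.

    The dataset, filter and attribute names are assumptions for illustration.
    """
    attribute_xml = "".join(
        f"<Attribute name={quoteattr(name)}/>" for name in attributes
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<!DOCTYPE Query>"
        '<Query virtualSchemaName="default" formatter="TSV" '
        'header="0" uniqueRows="1">'
        f'<Dataset name={quoteattr(dataset)} interface="default">'
        f'<Filter name="interpro" value={quoteattr(interpro_id)}/>'
        f"{attribute_xml}"
        "</Dataset></Query>"
    )


def build_biomart_url(interpro_id):
    """Return a GET URL that carries the query XML in the 'query' parameter."""
    query_xml = build_biomart_query(interpro_id)
    return BIOMART_URL + "?" + urlencode({"query": query_xml})


if __name__ == "__main__":
    # Print the URL; fetching it should give tab-separated rows if the
    # assumed names match the mart.
    print(build_biomart_url("IPR001683"))
```

The sketch only builds the request, so it can be inspected before anything is sent to the server.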
Hi,
It seems that Swiss-Prot can be of help here. I did a quick search using 'cytochrome c' and followed the link in the 'domain' section down to EMBL-EBI, and here is what I get:
The page provides me with a list of other proteins containing this domain.
Does this help you? What else could you need?
Cheers
All of the suggestions so far are good. My suggestion is to use UniProt.
Simply type "PX" in the query field at the top of the page. On the results page, you'll see a link that begins: "Restrict term "px" to domain (235)...". Click this and you'll see how to formulate this query in the search form - "domain:px".
The "Download" link, top right of page, offers download in multiple formats: sequences, ID list, XML, delimited, spreadsheet and so on.
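If you want to script the download rather than use the link, something like the following might work. It is only a sketch: the `rest.uniprot.org` endpoint and the `format` and `size` parameters are my assumptions about the current UniProt REST service, and the query field may need a different name there than the "domain:px" syntax shown on the website.

```python
from urllib.parse import urlencode

# Assumed endpoint for programmatic searches; not given in the thread.
UNIPROT_SEARCH_URL = "https://rest.uniprot.org/uniprotkb/search"


def build_uniprot_search_url(query, fmt="tsv", size=500):
    """Return a search URL for the given query string.

    'fmt' and 'size' are assumed parameter values; adjust to what the
    service documents.
    """
    params = {"query": query, "format": fmt, "size": size}
    return UNIPROT_SEARCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # The query text is the one from the search form described above.
    print(build_uniprot_search_url("domain:px"))
```

Keeping the query as a plain argument makes it easy to swap in whatever field name the service expects.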
At NCBI, searching the Conserved Domain Database (cdd) for PX with E-utilities returns 118 records:
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=cdd&term=PX
From one of those IDs (e.g. 154983) you can retrieve all the related proteins with ELink:
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=cdd&db=protein&id=154983
Here, the first protein is http://www.ncbi.nlm.nih.gov/protein/241260146
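The two calls above can be chained in a few lines of Python. This is a minimal sketch using only the standard library: the helper names are mine, I use https rather than http, and the parsers assume the usual ESearch/ELink XML layout (`IdList/Id` and `LinkSet/LinkSetDb/Link/Id`).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term):
    """URL for an ESearch query, e.g. db='cdd', term='PX'."""
    return f"{EUTILS}/esearch.fcgi?" + urlencode({"db": db, "term": term})


def elink_url(dbfrom, db, uid):
    """URL for an ELink query from one record to a linked database."""
    params = {"dbfrom": dbfrom, "db": db, "id": uid}
    return f"{EUTILS}/elink.fcgi?" + urlencode(params)


def parse_esearch_ids(xml_data):
    """Return the record IDs listed in an ESearch XML response."""
    root = ET.fromstring(xml_data)
    return [node.text for node in root.findall("./IdList/Id")]


def parse_elink_ids(xml_data):
    """Return the linked IDs (not the source ID) from an ELink XML response."""
    root = ET.fromstring(xml_data)
    return [node.text for node in root.findall("./LinkSet/LinkSetDb/Link/Id")]


def fetch(url):
    """Download a URL and return the raw response bytes."""
    with urlopen(url) as response:
        return response.read()


if __name__ == "__main__":
    # Search cdd for PX, then follow the first hit to the protein database.
    cdd_ids = parse_esearch_ids(fetch(esearch_url("cdd", "PX")))
    if cdd_ids:
        proteins = parse_elink_ids(fetch(elink_url("cdd", "protein", cdd_ids[0])))
        print(cdd_ids[0], len(proteins), proteins[:5])
```

Note that ESearch returns only the first batch of IDs by default, so the ID list may be shorter than the reported record count.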
Thanks for the pointers, everybody. UniProt seems quite handy, and it's good to see that I'll be able to do this via Bioconductor and BioMart.