Dears,
I am searching for a dataset that contains all human proteins together with the domains of each protein (i.e., for each protein, I want to know the set of domains it contains). Can someone help me with that?
Thanks in advance
Click the "Columns" button in the top row.
Find the "family/domains" section and expand it. Select the features you need. You can also choose additional family/domain databases (second section).
Click "Save" to apply your selection.
Click "Download" after selecting all proteins (Swiss-Prot, human, reviewed: ~20,404 entries as of Mar 2019; add TrEMBL if you also want unreviewed ones). Choose a format you like; tab-separated will likely work best, or plain text.
Here are examples of what you should see, depending on which columns you select.
Entry	Entry name	Status	Protein names	Gene names	Organism	Length	Domain [FT]
Q00604	NDP_HUMAN	reviewed	Norrin (Norrie disease protein) (X-linked exudative vitreoretinopathy 2 protein)	NDP EVR2	Homo sapiens (Human)	133	DOMAIN 39 132 CTCK. {ECO:0000255|PROSITE-ProRule:PRU00039}.
Second example
Entry	Entry name	Status	Protein names	Gene names	Organism	Length	Domain [FT]	Domain [CC]
Q9HB19	PKHA2_HUMAN	reviewed	Pleckstrin homology domain-containing family A member 2 (PH domain-containing family A member 2) (Tandem PH domain-containing protein 2) (TAPP-2)	PLEKHA2 TAPP2	Homo sapiens (Human)	425	DOMAIN 7 113 PH 1. {ECO:0000255|PROSITE-ProRule:PRU00145}.; DOMAIN 198 298 PH 2. {ECO:0000255|PROSITE-ProRule:PRU00145}.
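If you want the domains as structured data rather than one long string, the "Domain [FT]" cell can be split up with a few lines of Python. This is only a rough sketch: it assumes the cell looks like the two examples above (entries separated by ";", each starting with DOMAIN, start, end, then the name up to the first "." or "{"). The function name parse_domains is made up for illustration, and other export formats may need a different pattern.

```python
import re

# Assumed layout of one entry (taken from the examples above):
#   DOMAIN <start> <end> <name>. {evidence}.
ENTRY_RE = re.compile(r"\s*DOMAIN\s+(\d+)\s+(\d+)\s+([^.{]+)")


def parse_domains(cell):
    """Return a list of (name, start, end) tuples from a 'Domain [FT]' cell."""
    domains = []
    # Entries are assumed to be separated by ';' as in the second example.
    for item in cell.split(";"):
        match = ENTRY_RE.match(item)
        if match:
            start, end, name = match.groups()
            domains.append((name.strip(), int(start), int(end)))
    return domains


# Cell copied from the PKHA2_HUMAN example above.
cell = (
    "DOMAIN 7 113 PH 1. {ECO:0000255|PROSITE-ProRule:PRU00145}.; "
    "DOMAIN 198 298 PH 2. {ECO:0000255|PROSITE-ProRule:PRU00145}."
)
for name, start, end in parse_domains(cell):
    print(name, start, end)
```

An empty cell (a protein with no annotated domain) simply gives an empty list.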
uniprot https://www.uniprot.org/
see also How To Retrieve Human Proteins Sequence Containing A Given Domain
Try these tools as well: https://pfam.xfam.org/search#tabview=tab0 and http://elm.eu.org/index.html
Thank you for your answer. I am not sure I explained my need clearly, but I found another resource that has exactly what I need: Pfam.
I will need to run this repeatedly, since I will have a list of protein IDs (e.g. UniProt accessions) and, for each protein, I need the corresponding domains inside it (as in the SQL query in the image).
But the question is: do I need to download the whole database (or even a subset of it) to be able to run this query?
Image at: https://ibb.co/jJYCHgP Image source: https://pfam.xfam.org/help#tabview=tab12
Please put your current question to the Pfam help desk at https://pfam.xfam.org/help#tabview=tab17
I will do that. Thank you very much.
No, do not do that. Wait and see whether people can help here, as genomax now did. Help desks are intended for technical debugging, not for guiding users who need tutorials that can be found elsewhere on the web or in bioinformatics communities.
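For the list-of-accessions case asked about above, one possible route that does not involve the Pfam SQL dump is to reuse the tab-separated UniProt download from the answer and look each accession up locally. The sketch below assumes the file has the "Entry" and "Domain [FT]" columns shown in the examples; the function names and the accession P00000 are hypothetical, and the sample text stands in for the real downloaded file.

```python
import csv
import io


def load_domain_table(handle):
    """Map each accession ('Entry' column) to its 'Domain [FT]' annotation."""
    # QUOTE_NONE: treat quote characters in protein names as ordinary text.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return {row["Entry"]: row["Domain [FT]"] for row in reader}


def lookup(table, accessions):
    """Return the annotation for each requested accession (None if absent)."""
    return {acc: table.get(acc) for acc in accessions}


# Stand-in for the downloaded file, built from the two example rows in the
# answer (columns reduced for brevity). With a real download you would use
# open("your_download.tab") instead of io.StringIO.
SAMPLE_TSV = (
    "Entry\tEntry name\tDomain [FT]\n"
    "Q00604\tNDP_HUMAN\tDOMAIN 39 132 CTCK. {ECO:0000255|PROSITE-ProRule:PRU00039}.\n"
    "Q9HB19\tPKHA2_HUMAN\tDOMAIN 7 113 PH 1. {ECO:0000255|PROSITE-ProRule:PRU00145}.; "
    "DOMAIN 198 298 PH 2. {ECO:0000255|PROSITE-ProRule:PRU00145}.\n"
)

table = load_domain_table(io.StringIO(SAMPLE_TSV))
# P00000 is a made-up accession, used to show the "not found" case.
for acc, annotation in lookup(table, ["Q00604", "P00000"]).items():
    print(acc, annotation)
```

Loading the table once and then querying the dictionary keeps repeated lookups cheap, which matters if the list of accessions is long.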