Retrieving a list of protein names in a search result
9.2 years ago
Willem • 0

I'm new here, so if this is a duplicate, feel free to redirect me.

Imagine that you have searched for all proteins of a specific virus and want a list of every protein name that appears in the entries for that virus. With that list you can see which search terms are needed to sort the collection of sequences by protein, since the same type of protein sometimes has different names in UniProt.

I would like to show some code that I tried, but I don't even know where to start. I'm fairly new to this and so far only know a bit of Python.

Thanks!

Uniprotkb • 2.7k views

Could you let us know what your starting data is? A taxid? A virus name? DNA sequences? UniProtKB IDs or ACs?


Well, the only thing I have is the virus taxonomy ID.

9.2 years ago

If you have a tax_id, you can use a URL of this type (here for tax_id 79692, Human respiratory syncytial virus B):

http://www.uniprot.org/uniprot/?query=taxonomy:79692&format=tab&columns=id,entry%20name,protein%20names

If you have an organism name rather than a tax_id, search http://www.uniprot.org/taxonomy to find your virus and the corresponding tax_id.
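
Since the question mentions Python, here is a minimal sketch of how the URL approach could be scripted. The endpoint and the parameter names (`query`, `format`, `columns`) are copied from the URL above and may have changed on the UniProt side since then. The function names, and the assumption that the protein names sit in the third tab-separated column after a header row, are mine for illustration.

```python
import csv
import io
import urllib.parse
import urllib.request
from collections import Counter

# Assumption: endpoint taken from the URL in this answer; it may have moved.
BASE_URL = "http://www.uniprot.org/uniprot/"


def build_url(tax_id):
    """Build the tab-separated query URL for one taxonomy ID."""
    params = {
        "query": f"taxonomy:{tax_id}",
        "format": "tab",
        "columns": "id,entry name,protein names",
    }
    # Keep ':' and ',' unescaped and encode spaces as %20, as in the URL above.
    query = urllib.parse.urlencode(params, safe=":,", quote_via=urllib.parse.quote)
    return BASE_URL + "?" + query


def fetch_text(url):
    """Download the response body as text (needs network access)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def count_protein_names(tsv_text, name_column=2):
    """Count how often each protein name occurs in the tab-separated text.

    Assumes the first line is a header and that the protein names are in
    column index `name_column` (the third column for the query above).
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    next(reader, None)  # skip the header row, if there is one
    counts = Counter()
    for row in reader:
        if len(row) > name_column and row[name_column].strip():
            counts[row[name_column].strip()] += 1
    return counts


def list_protein_names(tax_id):
    """Fetch all entries for a taxonomy ID and count the distinct protein names."""
    return count_protein_names(fetch_text(build_url(tax_id)))
```

Calling `list_protein_names(79692)` and looping over `.most_common()` on the result would print each distinct protein name with the number of entries that use it, which should make it easier to spot the same protein appearing under different names.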


Thank you, I'll try this method.

9.2 years ago
5heikki 11k

If you have a protein GI, you can use Entrez Direct as follows (perhaps you are interested in only some of the "elements" listed below):

epost -db protein -id 2887457 | elink -target gene | efetch -format docsum | xtract -element Name -element Description -element OtherAliases -element OtherDesignations -element NomenclatureSymbol -element NomenclatureName
DHH    desert hedgehog    GDXYM, HHG-3, SRXY7    desert hedgehog homolog|mutant desert hedgehog    DHH    desert hedgehog

I don't think I have the protein GI, or rather I don't know how to retrieve a list of the GIs of all the proteins I'm filtering. Are there any links with more information on how to use GIs and Entrez together with Python?
