Maybe you could try taxonomy:"Helicobacter pylori", which should return the proteins for all H. pylori strains.
When was the H. pylori 26695 genome sequenced? If it is too recent, the proteins may not be in the UniProt database yet.
There you can find sequences in a variety of formats. What you need is probably NC_000915.faa, which is a FASTA file with all the translation products (proteins).
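To check what such a .faa file contains, a minimal sketch along these lines could count the protein records. It assumes a plain FASTA layout (a header line starting with ">" followed by sequence lines); the function name `read_fasta` and the two example records are made up for illustration and are not taken from NC_000915.faa.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {header: sequence} dictionary."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading ">"
            records[header] = ""
        elif header is not None:
            records[header] += line  # sequences may span several lines
    return records


# Hypothetical toy input; with the real file you would pass
# open("NC_000915.faa") instead of this list of lines.
example = """>protein_A hypothetical example
MKKLA
VLG
>protein_B hypothetical example
MSTQ
""".splitlines()

proteins = read_fasta(example)
print(len(proteins), "proteins")
```

For real work, an established parser such as Biopython's `SeqIO` is probably the safer choice; the hand-written version is only meant to show the structure of the file.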
What do you think? Is the difference significant? I already had the proteins from UniProt in my local database. Should I stick with them, or should I download the data from NCBI RefSeq?
updated 5.2 years ago by Ram • written 13.8 years ago by Jerven
Which database to use is largely a subjective choice. It is difficult to know up front if the proteome provided by UniProt is better or worse than that provided by RefSeq. The main advantage that I see of using RefSeq is that it is based on a specific fully sequenced genome, for which reason I can be sure that it is a complete proteome. UniProt - not being a genome database - might in some cases give you a very partial proteome. But I guess you will have to judge on a case-by-case basis.
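One way to judge a particular case is to compare the two proteomes by sequence rather than by identifier, since UniProt and RefSeq use different accessions. The sketch below assumes each proteome has already been loaded into a dictionary of identifier to sequence; the function name `compare_proteomes` and all identifiers and sequences are invented for illustration.

```python
from typing import Dict, Set


def normalise(seq: str) -> str:
    """Upper-case a sequence and drop a trailing stop symbol, if present."""
    return seq.upper().rstrip("*")


def compare_proteomes(a: Dict[str, str], b: Dict[str, str]) -> Dict[str, int]:
    """Count distinct sequences shared by, or unique to, two proteomes."""
    seqs_a: Set[str] = {normalise(s) for s in a.values()}
    seqs_b: Set[str] = {normalise(s) for s in b.values()}
    return {
        "shared": len(seqs_a & seqs_b),
        "only_a": len(seqs_a - seqs_b),
        "only_b": len(seqs_b - seqs_a),
    }


# Hypothetical toy data standing in for the UniProt and RefSeq sets.
uniprot = {"P1": "MKKLAVLG", "P2": "MSTQ", "P3": "MAAA"}
refseq = {"NP_1": "MKKLAVLG", "NP_2": "MSTQ*", "NP_4": "MGGG", "NP_5": "MCCC"}

summary = compare_proteomes(uniprot, refseq)
print(summary)
```

Exact matching is strict: a single differing residue or an alternative start codon makes two entries count as different, so a large "only" count is a reason to look closer, not proof that one proteome is incomplete.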
The query helicobacter AND pylori AND strain:26695 gives a better result. However, I'm still not quite sure whether the procedure is correct.
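If you want to run such a query from a script instead of the website, something like the following could build the request URL. This is only a sketch: the endpoint `https://rest.uniprot.org/uniprotkb/stream` and the `query` and `format` parameters are my assumption about the current UniProt REST interface, and the field syntax accepted there may differ from the query string quoted in this thread, so check the UniProt help pages before relying on it. The helper name `build_query_url` is made up.

```python
from urllib.parse import urlencode

# Assumed endpoint of the UniProt REST interface (verify before use).
BASE_URL = "https://rest.uniprot.org/uniprotkb/stream"


def build_query_url(query: str, fmt: str = "fasta") -> str:
    """Return a URL that requests all entries matching `query` in format `fmt`."""
    return BASE_URL + "?" + urlencode({"query": query, "format": fmt})


# Query string copied from the thread; it is not validated here.
url = build_query_url("helicobacter AND pylori AND strain:26695")
print(url)
```

The script only constructs the URL and does not contact the server; you could open the printed URL in a browser or fetch it with `urllib.request.urlopen` once you have confirmed that the query returns what you expect.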