Question

How To Get A Full Proteome Of Helicobacter Pylori 26695.

2

14.2 years ago

Marcinmagnus ▴ 80

Do you know of any place other than UniProt to get this kind of data? I would like a single file with all the sequences, without much hassle.

I used the query http://www.uniprot.org/uniprot/?query=organism%3A%22Helicobacter+pylori+26695%22&sort=score and got only 2 proteins :|

protein uniprot • 4.6k views

updated 14.2 years ago by Marina Manrique ★ 1.3k • written 14.2 years ago by Marcinmagnus ▴ 80

0

helicobacter AND pylori AND strain:26695 gives a better result. However, I'm still not quite sure whether the procedure is correct.
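As a rough illustration of how such a query ends up in a search URL, here is a minimal sketch that joins the terms with AND and URL-encodes them. The base URL is the one from the question; the helper name `build_query_url` and its `extra_params` argument are made up for this example and are not part of any UniProt client.

```python
from urllib.parse import urlencode

# Base URL taken from the query in the question (the UniProt search page).
UNIPROT_SEARCH = "http://www.uniprot.org/uniprot/"


def build_query_url(terms, extra_params=None):
    """Join search terms with AND and return a URL-encoded search URL.

    Hypothetical helper for illustration only.
    """
    params = {"query": " AND ".join(terms)}
    if extra_params:
        params.update(extra_params)
    return UNIPROT_SEARCH + "?" + urlencode(params)


url = build_query_url(["helicobacter", "pylori", "strain:26695"])
print(url)
# http://www.uniprot.org/uniprot/?query=helicobacter+AND+pylori+AND+strain%3A26695
```

Any further parameters the site accepts (such as the `sort=score` seen in the question) can be passed through `extra_params`.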

updated 5.6 years ago by Ram 45k • written 14.2 years ago by Marcinmagnus ▴ 80

0

Maybe you could try taxonomy:"Helicobacter pylori", so that you get the proteins for all H. pylori strains. When was the H. pylori 26695 genome sequenced? If it is too recent, the proteins may not be in the UniProt database yet.

14.2 years ago by Marina Manrique ★ 1.3k

Ram · Answer 1 · 2011-01-27

3

14.2 years ago

Lars Juhl Jensen 11k

I would get such data from NCBI RefSeq. The directory for Helicobacter pylori 26695 is:

ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Helicobacter_pylori_26695_uid57787/

There you can find sequences in a variety of formats. What you need is probably NC_000915.faa, which is a FASTA file with all the translation products (proteins).
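Once the .faa file is downloaded, a quick sanity check is to count the records in it. The following is a minimal sketch of a plain-Python FASTA reader; the function name `read_fasta` and the two toy records are made up for illustration, and the real file would be read by passing an open file handle instead of the example lines.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Tiny made-up example; for the real file, pass open("NC_000915.faa") instead.
example = """>protein_1 hypothetical example
MKLV
AAG
>protein_2 hypothetical example
MTT
""".splitlines()

records = list(read_fasta(example))
print(len(records), "proteins")
```

Libraries such as Biopython offer a more robust parser; the hand-written version is only meant to show what the file contains.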

0

It's great! It is exactly what I wanted.

I got about the same number of proteins when I used http://www.uniprot.org/taxonomy/?query=strain%3A26695&sort=score

What do you think? Is the difference significant? I already had the UniProt proteins in my local database. Should I stick with them, or should I download the data from NCBI RefSeq?

updated 5.6 years ago by Ram 45k • written 14.2 years ago by Marcinmagnus ▴ 80

0

http://www.uniprot.org/uniprot/?query=organism:210+keyword:181 would give the Helicobacter pylori complete proteome as defined by UniProt. However, I am not sure which strain that would be. I will ask around.

updated 5.6 years ago by Ram 45k • written 14.2 years ago by Jerven ▴ 660

0

Which database to use is largely a subjective choice. It is difficult to know up front whether the proteome provided by UniProt is better or worse than the one provided by RefSeq. The main advantage I see in using RefSeq is that it is based on a specific, fully sequenced genome, so I can be sure that it is a complete proteome. UniProt, not being a genome database, might in some cases give you a very partial proteome. But I guess you will have to judge on a case-by-case basis.
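One way to judge a particular case is to load both sets and see how many sequences they share. Below is a minimal sketch that compares two proteomes by exact sequence identity; the function `compare_proteomes`, the identifiers, and the sequences are all invented for illustration, and exact matching is a simplifying assumption (it will miss proteins that differ by, say, an alternative start codon).

```python
def compare_proteomes(refseq, uniprot):
    """Compare two {identifier: sequence} dicts by exact sequence identity."""
    refseq_seqs = set(refseq.values())
    uniprot_seqs = set(uniprot.values())
    return {
        "shared": len(refseq_seqs & uniprot_seqs),
        "refseq_only": len(refseq_seqs - uniprot_seqs),
        "uniprot_only": len(uniprot_seqs - refseq_seqs),
    }


# Made-up toy data standing in for the two downloaded sets.
refseq = {"NP_x1": "MKLVAAG", "NP_x2": "MTTQ", "NP_x3": "MSSA"}
uniprot = {"P_x1": "MKLVAAG", "P_x2": "MTTQ", "P_x4": "MPLR"}

summary = compare_proteomes(refseq, uniprot)
print(summary)
# {'shared': 2, 'refseq_only': 1, 'uniprot_only': 1}
```

If the "only" counts are small relative to the shared count, either source is probably fine for most purposes.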

14.2 years ago by Lars Juhl Jensen 11k

Ram · Answer 2 · 2011-01-27

1

14.2 years ago

Science_Robot ★ 1.1k

Do you have to use UniProt?

I searched for your strain in NCBI's taxonomy database and got this page.

On the right there are several links for Nucleotide, Protein, Genomes, etc... Click "Protein" and arrive here

If you want to download the sequences, click 'send to'.

updated 5.6 years ago by Ram 45k • written 14.2 years ago by Science_Robot ★ 1.1k

score 1 · Answer 3 · 2011-01-28

1

14.2 years ago

Alexandra Louis ▴ 10

What about Integr8 at the EBI? http://www.ebi.ac.uk/integr8/FtpSearch.do?orgProteomeId=23 The whole proteome set is available there.

Ram · Answer 4 · 2011-02-01

0

14.2 years ago

Marina Manrique ★ 1.3k

I'd do this using the advanced search:

Search by organism "Helicobacter pylori 26695".
Then search by keyword = Complete proteome.

Here you can find more info about what the "Complete proteome" keyword means

updated 5.4 years ago by Ram 45k • written 14.2 years ago by Marina Manrique ★ 1.3k