Do you know where to download protein sequences?
5.8 years ago
ygxing1 • 0

Hi, everyone! Do you know where to download protein sequences? So far, I have downloaded the NCBI NR database and UniProt. Apart from these well-known databases, are there any others? Thank you very much! Best wishes!

sequence • 1.2k views

There are a lot of posts here about that. Like this one:

Difference between NCBI non-redundant and refseq database

Go to the upper left-hand corner, press "LATEST", and enter your question in the middle of the page.

Everything depends on your goal, your species, etc.

Human and bacterial proteins are the most studied ones.

How to download database of Human protein sequences with sub cellular locations?

Looking For A Database Of All Proteins Expressed/Predicted In Completely Sequenced Genomes

And there may be a lot of details:

Do you need some unique protein database? Are you interested in some particular organelles, like mitochondria? Or some protein domain or motif?

Extracting Sub-cellular location from Uniprot into tabular format

Download all bacterial proteins from the same family

how to get protein motif sequence from pfam database?

Do you need a curated protein database, or doesn't it matter?

Do you need some enzymes?

A: extract EC number from entrez esearch query

Do you need to do it computationally or manually?

Etc.


Thanks a lot for your reply! I am collecting as much protein sequence data as possible. I searched as you suggested above but did not find any valuable ones.


What exactly didn't you find?


For example, a metagenome protein sequence database.

5.8 years ago
GenoMax 147k

I am not aware of a metagenome protein sequence database, since the information you get from this type of sequencing is often incomplete. Recent advances are being made with genomes assembled from metagenomes, but that information is likely not fully validated.

There are papers (for example https://www.nature.com/articles/sdata2017203, https://www.nature.com/articles/s41587-018-0008-8 and https://www.ncbi.nlm.nih.gov/pubmed/30320765) which describe metagenomic genome assemblies. You would likely need to download these genomes yourself (e.g. https://www.ncbi.nlm.nih.gov/bioproject/PRJNA482748) and make databases from them. It would not be a trivial exercise. I am not sure if these sequences make it into the gene/protein sections of GenBank at some point.
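To give a rough idea of the "make databases from them" step, here is a minimal sketch that merges protein FASTA records from several sources into one set with exact duplicates removed. It assumes proteins have already been predicted from each assembly with a gene caller of your choice, which is the hard part and is not shown. The function names (`read_fasta`, `merge_nonredundant`, `write_fasta`) and the source labels are made up for illustration, and exact-match deduplication is only one possible choice, not how NR or any published catalogue is actually built.

```python
import hashlib
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def merge_nonredundant(
    sources: Dict[str, Iterable[str]]
) -> Dict[str, Tuple[str, str]]:
    """Merge FASTA sources, keeping the first record seen per exact sequence.

    Returns a dict mapping sequence digest -> (source-tagged header, sequence).
    """
    merged: Dict[str, Tuple[str, str]] = {}
    for source_name, lines in sources.items():
        for header, seq in read_fasta(lines):
            # Normalise case and drop a trailing stop symbol, if present.
            seq = seq.upper().rstrip("*")
            if not seq:
                continue
            digest = hashlib.sha256(seq.encode("ascii")).hexdigest()
            # setdefault keeps the earliest record for each digest.
            merged.setdefault(digest, (f"{source_name}|{header}", seq))
    return merged


def write_fasta(records: Iterable[Tuple[str, str]], width: int = 60) -> str:
    """Format (header, sequence) pairs as FASTA text with wrapped lines."""
    out = []
    for header, seq in records:
        out.append(">" + header)
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"
```

In practice you would pass open file handles instead of lists, e.g. `merge_nonredundant({"assembly_1": open("assembly_1.faa")})` (hypothetical file name), and write the result of `write_fasta(merged.values())` to disk before indexing it with your search tool. For data at this scale, clustering at a similarity threshold is usually a better fit than exact matching.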
