5.2 years ago
Quak
I have two species, "Bifidobacterium longum" and "Bifidobacterium breve", and I would like to find out computationally which peptides, bioactive molecules, and proteins these strains secrete. Is there any tool or pipeline for this, or can you point me in a direction where I can extract this information?
...and which kind of data do you have?
I just have the genome sequence from NCBI
If you simply go to NCBI (https://www.ncbi.nlm.nih.gov) and search for
"bifidobacterial proteins from NCBI", you get a list of them.
https://www.ncbi.nlm.nih.gov/protein/?term=bifidobacterial+proteins+from+NCBI
Some time ago, when we had *.gbk files, it was easy to
find the protein archive there. I have an old version
of NCBI somewhere; if the strains are old enough, their genomes
probably have not changed much. If you need that old version of NCBI, I found it for you.
See my answer to the post for more information about old NCBI-version. I hope the link is still valid.
The link is valid. The *.faa files there contain all the proteins of a particular strain.
ftp://ftp.ncbi.nlm.nih.gov/genomes/archive/old_refseq/Bacteria/Bifidobacterium_breve_ACS_071_V_Sch8b_uid158863/
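Once a *.faa file is downloaded, listing its proteins takes only a few lines. Here is a minimal sketch, assuming standard FASTA layout (a ">" header line followed by sequence lines); the function name `read_fasta` and the example records are made up for illustration and are not taken from the NCBI files.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {header: sequence} dictionary."""
    records: Dict[str, str] = {}
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts: store the previous one first.
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    # Store the last record in the input.
    if header is not None:
        records[header] = "".join(chunks)
    return records


# Hypothetical records, only to show the expected input shape.
example = """>prot_1 hypothetical protein [Bifidobacterium breve]
MKKLLA
VTAG
>prot_2 ABC transporter permease [Bifidobacterium breve]
MSTNQ
"""

proteins = read_fasta(example.splitlines())
for header, seq in proteins.items():
    # Print the protein identifier and its length in residues.
    print(header.split()[0], len(seq))
```

For a real file, pass an open file handle instead of the example string, e.g. `read_fasta(open("strain.faa"))`, where `strain.faa` stands for whichever file you downloaded.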
A: where can I get environmental bacteria genome in fasta format (as many as possib
interesting ! looks like a lot of scraping needed ...
You won't get these answers from any single resource.
You can easily identify what proteins are available from the genome annotation, but secreted molecules are often largely unknown. The only resource I know of myself is antiSMASH: https://antismash.secondarymetabolites.org/#!/start
From this you can infer what types of molecules it might secrete.
What would you use for bacterial genome annotation? Is there an existing pipeline?
Prokka (https://github.com/tseemann/prokka) or PGAP (https://github.com/ncbi/pgap)
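As a rough illustration of the Prokka route, the sketch below builds a basic Prokka command line from Python. It assumes Prokka is installed and on your PATH; the file and directory names (`b_breve.fna`, `b_breve_annotation`) and the helper `build_prokka_command` are hypothetical. Only the `--outdir`, `--prefix`, and `--kingdom` options are used; check `prokka --help` for what your version supports.

```python
import shutil
import subprocess
from typing import List


def build_prokka_command(fasta: str, outdir: str, prefix: str) -> List[str]:
    """Assemble a basic Prokka call for a bacterial genome FASTA file."""
    return [
        "prokka",
        "--outdir", outdir,       # directory Prokka writes its results to
        "--prefix", prefix,       # base name for the output files
        "--kingdom", "Bacteria",  # annotation mode
        fasta,                    # input contigs / genome sequence
    ]


# Hypothetical input and output names; replace with your own.
cmd = build_prokka_command("b_breve.fna", "b_breve_annotation", "b_breve")
print(" ".join(cmd))

# Set RUN to True to actually launch Prokka (needs the tool and the input file).
RUN = False
if RUN and shutil.which("prokka") is not None:
    subprocess.run(cmd, check=True)
```

Among the files Prokka writes to the output directory is a protein FASTA (*.faa) file, which is the same kind of file discussed above.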