Hi all,
We are trying to build a protein database covering multiple organisms, e.g. E. coli, T. ferrooxidans, B. subtilis, etc. We want to use it for matching our Orbitrap output, restricted to the species we found through Illumina sequencing, which span roughly 400 or more genera. Can you suggest a smart way of doing this? For example, could I provide the names of the organisms and retrieve a single FASTA file?
Thank you very much!
You can use @5heikii's script here.
Using cat to concatenate the individual per-genome protein FASTA files into one giant file afterwards should be a simple task.
Note: See new answer/comment below.
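If you would rather do the concatenation step in Python than with cat, here is a minimal sketch. It assumes the downloaded protein files end in .faa or .faa.gz and sit in one folder; the function name, the patterns, and the folder name in the usage note are my own placeholders, not something the linked script defines.

```python
import gzip
from pathlib import Path


def concatenate_fasta(input_dir, output_path, patterns=("*.faa", "*.faa.gz")):
    """Merge every matching FASTA file in input_dir into output_path.

    Gzipped inputs are decompressed on the fly. Returns the number of
    files merged. The default patterns are an assumption about file names.
    """
    input_dir = Path(input_dir)
    output_path = Path(output_path)

    # Collect inputs first, skipping the output file in case it lives
    # in the same folder and matches one of the patterns.
    files = sorted(
        p
        for pattern in patterns
        for p in input_dir.glob(pattern)
        if p.resolve() != output_path.resolve()
    )

    with open(output_path, "wb") as out:
        for path in files:
            opener = gzip.open if path.suffix == ".gz" else open
            last = b"\n"
            with opener(path, "rb") as src:
                # Copy in 1 MiB chunks so large proteomes are not
                # loaded into memory at once.
                for chunk in iter(lambda: src.read(1 << 20), b""):
                    out.write(chunk)
                    last = chunk[-1:]
            # Make sure the next file's header starts on its own line.
            if last != b"\n":
                out.write(b"\n")

    return len(files)
```

Calling concatenate_fasta("proteomes", "combined.fasta") would then write one file you can point your search engine at. Note that this does not remove duplicate sequences or rename headers, so identical proteins from related strains will appear more than once.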
Running this code didn't generate any FASTA file, although both the list of species (species.txt) and assembly_summary.txt are in the same folder. Am I missing something?
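One thing worth ruling out is that the names in species.txt simply do not match anything in assembly_summary.txt, for example abbreviated names such as "E. coli" instead of "Escherichia coli", or misspellings. Below is a rough check, not a fix for the script itself. It assumes assembly_summary.txt is the tab-separated NCBI summary whose header line names an organism_name column, and that species.txt has one name per line; the function name is made up for illustration, and I am only guessing that the script matches on organism names.

```python
import csv


def find_unmatched_species(species_file, summary_file):
    """Return names from species_file that occur in no organism_name value."""
    with open(species_file, encoding="utf-8") as fh:
        # strip() also removes stray carriage returns from Windows files
        wanted = [line.strip() for line in fh if line.strip()]

    organisms = set()
    name_col = None
    with open(summary_file, encoding="utf-8", newline="") as fh:
        reader = csv.reader(fh, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if not row:
                continue
            if row[0].startswith("#"):
                # Comment lines; the one listing column names is the header.
                if "organism_name" in row:
                    name_col = row.index("organism_name")
                continue
            if name_col is not None and len(row) > name_col:
                organisms.add(row[name_col].lower())

    if name_col is None:
        raise ValueError("no organism_name column found in the header")

    # Case-insensitive substring match, so "Escherichia coli" also
    # matches "Escherichia coli str. K-12 substr. MG1655".
    return [
        name for name in wanted
        if not any(name.lower() in org for org in organisms)
    ]
```

If find_unmatched_species("species.txt", "assembly_summary.txt") returns most of your list, the problem is likely in the names rather than in the download step. On a full summary file with 400+ names this may take a little while to run.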