I have the amino acid sequence of a gene of interest. I want to identify its homologs in different species and assemble them in a single FASTA file, which I then want to feed into a sequence similarity network (SSN) tool to see how the homologs from different species cluster.
My problem is: how do I get this list of homologs into one FASTA file? I tried two approaches. The first was to find the protein in the KEGG database and click on 'Orthologs'. This lists all the homologs in KEGG (about 2500), but I don't see a way to download the amino acid sequences in one go, other than clicking each homolog individually and copying its amino acid sequence.
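For the KEGG route, one possible way around clicking each entry is the KEGG REST API. The sketch below is an assumption-laden illustration, not a tested pipeline: it assumes the protein is assigned to a KEGG Orthology (KO) entry, and the gene list linked to a KO may not be identical to what the 'Orthologs' button shows. The function names and the output path are made up for illustration. The `get` operation accepts only a limited number of entries per request (10, as far as I know), so the gene IDs are fetched in batches.

```python
import time
from urllib.request import urlopen

KEGG_REST = "https://rest.kegg.jp"
BATCH_SIZE = 10  # assumed per-request limit of the KEGG "get" operation


def parse_link_output(text):
    """Return gene IDs from the tab-separated output of /link/genes/<KO>.

    Each line is expected to look like 'ko:K00001<TAB>hsa:124'.
    """
    genes = []
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) == 2 and parts[1].strip():
            genes.append(parts[1].strip())
    return genes


def batches(items, size=BATCH_SIZE):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def aaseq_urls(gene_ids):
    """Build one /get/<ids>/aaseq URL per batch of gene IDs."""
    return [f"{KEGG_REST}/get/{'+'.join(b)}/aaseq" for b in batches(gene_ids)]


def download_ko_fasta(ko_id, out_path, pause=0.5):
    """Write the amino acid sequences of all genes linked to a KO to one FASTA file."""
    with urlopen(f"{KEGG_REST}/link/genes/{ko_id}") as resp:
        gene_ids = parse_link_output(resp.read().decode())
    with open(out_path, "w") as out:
        for url in aaseq_urls(gene_ids):
            with urlopen(url) as resp:
                out.write(resp.read().decode())
            time.sleep(pause)  # be polite to the server between requests
    return len(gene_ids)
```

Calling `download_ko_fasta("K00001", "homologs.fasta")` (with the KO identifier of the actual protein in place of the placeholder `K00001`) would then produce a single FASTA file; for about 2500 genes that is roughly 250 requests.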
The second option is to run a BLASTP search using RefSeq as the search database. This gives me only a limited number of hits (restricted to 100), and furthermore, I still can't figure out how to download the FASTA sequences of the hits in one go.
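For the BLASTP route, the 100-hit cap is, as far as I know, only the default value of the "Max target sequences" setting on the NCBI BLAST web form, and it can be raised there. Once the accessions of the hits are saved to a text file, the sequences can be fetched in bulk with the NCBI E-utilities `efetch` endpoint. The sketch below assumes a plain text file with one protein accession per line; the file names, the chunk size, and the function names are illustrative choices, not anything prescribed by NCBI.

```python
import time
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def read_accessions(lines):
    """Return unique accessions in order, skipping blank and '#' comment lines."""
    seen, result = set(), []
    for line in lines:
        acc = line.strip()
        if acc and not acc.startswith("#") and acc not in seen:
            seen.add(acc)
            result.append(acc)
    return result


def efetch_url(accessions):
    """Build an efetch URL that returns the given proteins as FASTA text."""
    params = {
        "db": "protein",
        "id": ",".join(accessions),
        "rettype": "fasta",
        "retmode": "text",
    }
    return f"{EFETCH}?{urlencode(params)}"


def download_fasta(accessions, out_path, chunk=200, pause=0.4):
    """Fetch accessions in chunks and append all records to one FASTA file."""
    with open(out_path, "w") as out:
        for i in range(0, len(accessions), chunk):
            with urlopen(efetch_url(accessions[i:i + chunk])) as resp:
                out.write(resp.read().decode())
            time.sleep(pause)  # stay under NCBI's request rate limit without an API key
```

A possible usage, assuming the hit accessions were saved as `hits.txt`: `download_fasta(read_accessions(open("hits.txt")), "blast_hits.fasta")`.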
Can someone please offer some help? Thanks in advance!
What kind of species are we talking about here? If it's plants (or "related"), you should have a look here: PLAZA; the whole purpose of this resource is exactly homology, gene families, etc.