I downloaded hundreds of gene orthologs from Ensembl in FASTA format. The headers look like this one:
>EMLSAP00000005133
or this one:
>ENN75927
I would like my headers to have the following format:
>Scientific_name_geneName_EnsemblID
From what I've read, the prefix of an Ensembl ID should let me find the scientific name, and I could use biomaRt to find the gene name. However, I'm having trouble running biomaRt from the command line and automatically rewriting the hundreds of headers across the different files.
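To make the renaming step concrete, here is a rough Python sketch of what I have in mind. It assumes I already had a lookup table from each ID to its species and gene name, which is exactly the part I don't know how to build. The function name `rename_headers`, the `lookup` dictionary, and the values `Genus species` and `geneX` are placeholders I made up for illustration, not real annotations.

```python
def rename_headers(lines, lookup):
    """Yield FASTA lines, rewriting headers as >Scientific_name_geneName_ID.

    `lookup` maps a sequence ID to a (scientific_name, gene_name) tuple.
    Headers whose ID is not in `lookup` are passed through unchanged.
    """
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            # The ID is the first whitespace-delimited token of the header.
            seq_id = fields[0] if fields else ""
            if seq_id in lookup:
                species, gene = lookup[seq_id]
                species = species.replace(" ", "_")
                yield f">{species}_{gene}_{seq_id}\n"
                continue
        # Sequence lines and unknown headers are kept as they are.
        yield line


# Placeholder lookup table: the species and gene values are NOT real annotations.
lookup = {"EMLSAP00000005133": ("Genus species", "geneX")}

# Toy input with made-up sequence lines.
fasta = [">EMLSAP00000005133\n", "MKT\n", ">ENN75927\n", "MAA\n"]
renamed = list(rename_headers(fasta, lookup))
print("".join(renamed), end="")
```

For real files I would pass an open file handle instead of the list and write the result to a new file, one input file at a time.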
Could someone help me? Is there a way to get both the scientific name and the gene name in a single step, or is the approach described above the right one?