Hi!
(I'm completely new to Unix, and after being amazed over the past couple of days by what tools such as sed, awk and grep can do, I'm now slowly trying to do something useful for my work.)
I'm trying to retrieve multiple protein FASTA records from GenBank using a list of protein accessions (such as "XP_015438716.1") and save them all into one file. The accessions are in a file, one per line, with several tens to two hundred accessions per file. I would like to do this not via the web (e.g. http://www.ncbi.nlm.nih.gov/sites/batchentrez) but from the command line (using bash commands or Unix utilities), because I'd like to build this step into a pipeline I'm trying to construct.
I played with E-utils, particularly efetch, which works fine for downloading a single protein FASTA, e.g.:
efetch -db protein -format=fasta -id XP_015438716.1 > testEFETCH.fa
but I did not manage to use a file as input for efetch (I'm wondering whether that is possible). I would appreciate any hints or help!
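One way this could be approached, as a minimal sketch rather than a definitive answer: loop over the accession file and run the same efetch call once per line, sending all output to a single file. The function name fetch_fastas, the FETCH_DELAY variable, and the file names accessions.txt and all_proteins.fa are placeholders made up for this illustration. The sketch assumes efetch is installed and on the PATH.

```shell
# Hypothetical helper: print the protein FASTA for every accession
# listed (one per line) in the file given as the first argument.
fetch_fastas() {
    # The "|| [ -n ... ]" part keeps a last line that has no trailing newline.
    while IFS= read -r acc || [ -n "$acc" ]; do
        # Strip carriage returns in case the list was saved on Windows.
        acc=$(printf '%s' "$acc" | tr -d '\r')
        # Skip blank lines.
        [ -z "$acc" ] && continue
        # Same call as in the question, with the accession substituted in.
        # stdin is redirected from /dev/null so that efetch cannot
        # swallow the rest of the accession list feeding the loop.
        efetch -db protein -format fasta -id "$acc" < /dev/null
        # Short pause between requests to be polite to the NCBI servers
        # (FETCH_DELAY is an assumed, optional override in seconds).
        sleep "${FETCH_DELAY:-0.5}"
    done < "$1"
}

# Usage (needs network access):
#   fetch_fastas accessions.txt > all_proteins.fa
```

One request per accession is slow but simple, which should be acceptable for lists of a few tens to two hundred accessions. If a request fails, that record is simply missing from the output, so it is worth comparing the number of ">" header lines in the result with the number of accessions in the list.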
Thanks; this helped me a lot!