I have a user who downloaded the protein FASTA for RefSeq 78. They now want to use a tool that needs both the CDS and protein FASTAs, but I can't find a simple way to download the matching CDS FASTA. A complicated way would be to take all the protein accessions from the FASTA, map them to mRNA accessions somehow, and then pass those accessions to efetch (which fortunately allows fetching old revisions of the CDS), like:
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=<insert_100,000_human_transcripts_here>&rettype=fasta_cds_na&retmode=text
I could do that in batches of 10,000 (via POST rather than GET), but this seems hacky. Is there a better way (one that isn't essentially the same thing done with BioMart)?
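For reference, the batched approach I have in mind would look roughly like the sketch below. It assumes the versioned mRNA accessions have already been obtained (that mapping step is not shown), and the helper names `batches`, `build_post_body`, and `fetch_cds` as well as the half-second pause between requests are my own placeholders, not anything NCBI prescribes. Only the efetch parameters are taken from the URL above.

```python
import time
import urllib.parse
import urllib.request

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def batches(items, size):
    """Yield consecutive slices of `items`, each with at most `size` entries."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_post_body(accessions):
    """Encode one batch of accessions as a form-encoded efetch POST body."""
    params = {
        "db": "nuccore",
        "id": ",".join(accessions),
        "rettype": "fasta_cds_na",
        "retmode": "text",
    }
    return urllib.parse.urlencode(params).encode("ascii")


def fetch_cds(accessions, out_path, batch_size=10000, pause=0.5):
    """Fetch CDS FASTA for all accessions, one POST per batch (hypothetical helper)."""
    with open(out_path, "w") as out:
        for batch in batches(accessions, batch_size):
            # Supplying `data` makes urllib send a POST instead of a GET.
            request = urllib.request.Request(EFETCH_URL, data=build_post_body(batch))
            with urllib.request.urlopen(request) as response:
                out.write(response.read().decode("utf-8"))
            # Assumed pause to stay polite to the server between batches.
            time.sleep(pause)
```

Calling `fetch_cds(mrna_accessions, "cds.fasta")` with the full list of versioned accessions would then write the CDS records for every batch into a single file.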
RefSeq 78 for human?
In this case yes, but a solution that works for any RefSeq species would be better. Thanks.