Hello!
I feel awkward asking, but could you help me again? Go to http://www.ncbi.nlm.nih.gov/protein and, in the Search box, enter (bee) AND "Apis mellifera"[porgn:__txid7460]. Then choose 'RefSeq ( 10618 )' on the right of the screen (in the Filter menu).
That gives 10,618 records. Click one of them. For example, at http://www.ncbi.nlm.nih.gov/protein/NP_001011614.1 we can see the amino acid sequence of the protein. Now click the CDS link.
OK. Now we can see the nucleotide sequence that codes for the protein: http://www.ncbi.nlm.nih.gov/nuccore/58585171?from=135&to=638&report=gbwithparts
Could you tell me, is it true that this sequence always starts with the triplet 'atg' and ends with a termination (stop) codon?
Another question. There are 10,618 records on this site. I need all of them in one or several files. Do you know whether BioPerl (Biopython) or Entrez Utilities can help to get them? Which script? Maybe another program? I need the nucleotide sequence for all proteins, for each organism.
Thanks.
Sorry for my bad English.
Gleb, I would suggest you read the Transcription and Translation (genetics) articles on Wikipedia and the NCBI E-utilities manual (all easily accessible via Google).
I understand. Thank you. I just thought it was a simple little problem: how to download CDS... Probably not.
It is reasonably simple once you have worked with the databases a couple of times. However, telling you every step of how to do it (or writing a script for you that does it) might do more harm than good when you need a slightly different dataset the next time. You are of course welcome to ask questions here when you are stuck.
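As a rough starting point rather than a finished script, here is a minimal Python sketch of how the E-utilities requests could be put together, plus a small check for the 'atg'/stop-codon question. It only builds the URLs; you would still have to send them, read WebEnv and query_key from the esearch result, and save the efetch output. Several things here are assumptions to verify against the E-utilities manual: that rettype=fasta_cds_na is accepted with db=protein, that refseq[filter] matches the RefSeq filter on the web page, and the batch size of 500. The function names are made up for illustration. On the codon question: under the standard genetic code a complete CDS normally starts with atg and ends with taa, tag or tga, but not always, since some records are partial or use an alternative start codon, which is why the check reports rather than assumes.

```python
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

# Query from the question; the RefSeq filter term is an assumption.
QUERY = '(bee) AND "Apis mellifera"[porgn:__txid7460] AND refseq[filter]'

STOP_CODONS = {"taa", "tag", "tga"}  # standard genetic code only


def esearch_url(term, db="protein"):
    """URL that runs the search and keeps the hits on the NCBI history server."""
    params = {"db": db, "term": term, "usehistory": "y", "retmax": 0}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def efetch_cds_url(webenv, query_key, retstart=0, retmax=500, db="protein"):
    """URL that fetches one batch of coding sequences as nucleotide FASTA.

    Assumption: rettype=fasta_cds_na is accepted for this database.
    """
    params = {
        "db": db,
        "query_key": query_key,
        "WebEnv": webenv,
        "rettype": "fasta_cds_na",
        "retmode": "text",
        "retstart": retstart,
        "retmax": retmax,
    }
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"


def batch_starts(total, batch_size=500):
    """Offsets (retstart values) needed to page through `total` records."""
    return list(range(0, total, batch_size))


def check_cds(seq):
    """Report whether a CDS looks complete under the standard genetic code."""
    seq = seq.lower()
    return {
        "length_multiple_of_3": len(seq) % 3 == 0,
        "starts_with_atg": seq.startswith("atg"),
        "ends_with_stop": seq[-3:] in STOP_CODONS,
    }
```

With 10,618 records and batches of 500, batch_starts(10618) gives 22 offsets, so 22 efetch requests would cover the whole result set. Running check_cds over the downloaded sequences would then show how many of them really begin with atg and end with a stop codon.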