Hi there,
I want to get the positions of all the translation start sites for the human genes. Note that these are different from the transcription start sites (TSSs): translation of a protein begins at the start codon, ATG.
For the TSSs, the GENCODE GFF file and the refGene file from UCSC both have the information.
My question is: can I take each gene's sequence from the TSS to the TES (transcription end site), look for the first ATG, and assume it is the translation start site?
Alternative start codons are rare in eukaryotic genomes, but they may still exist.
What are your suggestions?
Thanks,
Ming
Thanks, Mark! By the way, I found your blog very helpful :)
Actually, the GENCODE GTF file has a start_codon feature. Even nicer for me :)
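
For anyone landing here later, pulling those start_codon records out of the GTF could look roughly like the sketch below. It is only a minimal sketch with some assumptions: the file is a standard 9-column, tab-separated GTF with attributes written as key "value" pairs, coordinates are 1-based and inclusive, and the function name start_codon_positions is made up for illustration. On the minus strand the first base of the codon is taken to be the end coordinate, and a start codon split across exons is handled by taking the outermost coordinate per transcript.

```python
import re
from collections import defaultdict

# Matches GTF attributes of the form: key "value"
_ATTR = re.compile(r'(\S+) "([^"]*)"')


def start_codon_positions(gtf_lines):
    """Return {transcript_id: (chrom, strand, position, gene_name)}.

    `position` is the 1-based genomic coordinate of the first base of the
    annotated start codon (hypothetical helper, not part of any library).
    """
    hits = defaultdict(list)
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "start_codon":
            continue  # keep only start_codon features
        chrom, strand = fields[0], fields[6]
        start, end = int(fields[3]), int(fields[4])
        attrs = dict(_ATTR.findall(fields[8]))
        tid = attrs.get("transcript_id")
        if tid is None:
            continue
        hits[tid].append((chrom, strand, start, end, attrs.get("gene_name", "")))

    result = {}
    for tid, recs in hits.items():
        chrom, strand, gene_name = recs[0][0], recs[0][1], recs[0][4]
        if strand == "+":
            # A codon split across exons appears as several records;
            # the smallest start is the first translated base.
            pos = min(r[2] for r in recs)
        else:
            # On the minus strand the codon reads right to left,
            # so the largest end is the first translated base.
            pos = max(r[3] for r in recs)
        result[tid] = (chrom, strand, pos, gene_name)
    return result
```

To run it on a real annotation, pass an open file handle, for example one from gzip.open(path, "rt") if the GTF is gzipped, where path is whatever GENCODE release you downloaded. The result is keyed by transcript, so a gene with several annotated transcripts can report more than one translation start site.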