Question

How to obtain Coding sequence from Genomic sequence?

11.3 years ago

MAPK ★ 2.1k

Hi Guys,

I have some assembled genomic sequences, and I know the exact frames I need to translate to obtain the coding sequences and putative protein sequences. However, I do not know exactly where to cut the translated frames, i.e. where the coding regions end and where the splice sites at the intron-exon boundaries lie. Is there a way to get the coding region from the given frames? Please share your knowledge.

Thank you!

coding genome • 2.8k views

updated 3.7 years ago by Ram 45k • written 11.3 years ago by MAPK ★ 2.1k

Please show what you have; if you know the exact frame, then that's where to cut!
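To make "cutting at the frame" concrete, here is a minimal sketch in plain Python that translates a sequence in a chosen forward frame and reports the stop-free stretches with their nucleotide coordinates. The function names (`translate_frame`, `open_stretches`) and the example sequence are made up for illustration, and the standard genetic code is assumed. Note that on genomic DNA a stop-free stretch is only an upper bound on an exon, not an exon boundary.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate_frame(seq, frame):
    """Translate seq in forward frame 0, 1 or 2; unknown codons become 'X'."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def open_stretches(seq, frame, min_aa=1):
    """Yield (start, end, peptide) for each stop-free stretch in the frame.

    Coordinates are 0-based nucleotide positions, end exclusive.
    """
    protein = translate_frame(seq, frame)
    pos = 0  # index into the translated protein string
    for piece in protein.split("*"):
        if len(piece) >= min_aa:
            start = frame + 3 * pos
            yield start, start + 3 * len(piece), piece
        pos += len(piece) + 1  # skip the piece and the stop that ended it


if __name__ == "__main__":
    example = "ATGGCCTAAGGTTGA"  # hypothetical toy sequence
    for start, end, peptide in open_stretches(example, frame=0):
        print(start, end, peptide)
```

For reverse-strand frames you would reverse-complement the scaffold first and apply the same functions.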

ADD REPLY • link 11.3 years ago by karl.stamm 4.1k

Thank you for your reply, Karl. When I say "assembled", I mean that the genomic sequences are still unannotated and there are multiple scaffolds. Suppose I have three scaffolds for a particular protein-coding gene and I have to translate each scaffold in two different frames. After merging all the translated frames, I will have an unreasonably long peptide sequence/coding region, because translation continues beyond the 'GT' and 'AG' boundaries and therefore includes the non-coding regions as well. In this case, I think I need transcriptomic sequences to infer the coding regions unambiguously. Please clarify if otherwise. Thanks again!
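The ambiguity described above can be illustrated with a small sketch: it lists every 'GT' (candidate donor) and 'AG' (candidate acceptor) dinucleotide in a sequence and pairs them into candidate introns. The function names, the toy sequence, and the `min_len` cutoff are assumptions made up for this example; real introns are much longer and real splice sites carry more signal than two bases.

```python
def candidate_splice_sites(seq):
    """Return 0-based positions of all 'GT' and 'AG' dinucleotides."""
    seq = seq.upper()
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    acceptors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    return donors, acceptors


def candidate_introns(seq, min_len=4):
    """Pair each GT with each downstream AG.

    Returns (start, end) tuples, 0-based and end exclusive, so that
    seq[start:end] begins with 'GT' and ends with 'AG'.
    min_len is an illustrative cutoff, not a biological constant.
    """
    donors, acceptors = candidate_splice_sites(seq)
    return [
        (d, a + 2)
        for d in donors
        for a in acceptors
        if a + 2 - d >= min_len
    ]


if __name__ == "__main__":
    toy = "ATGGTAAGTTTCAGGCC"  # hypothetical 17-nt sequence
    for start, end in candidate_introns(toy):
        print(start, end, toy[start:end])
```

Even this 17-nt toy sequence yields three GT...AG candidates, which is why dinucleotides alone rarely pin down exon boundaries and why transcript or homologous protein evidence is usually needed.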

updated 3.9 years ago by Ram 45k • written 11.3 years ago by MAPK ★ 2.1k

Ram · Answer 1 · 2014-10-02

2

11.0 years ago

Manvendra Singh ★ 2.2k

There is an easy way to do it.

It's the Coding Potential Calculator (CPC).

It takes FASTA sequences as input.

Once you submit your FASTA file, you get a CP score for each sequence. A score > 1 indicates a potential protein-coding transcript, and a score < -1 indicates a potential non-coding RNA.

CPC can reliably discriminate coding from non-coding transcripts with ~98% accuracy.
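As a rough sketch of how the thresholds above could be applied once you have the scores, the snippet below labels each sequence from its CP score. The function name `classify_cp_score`, the "uncertain" label for scores between the cutoffs, and the example IDs and scores are all invented for illustration; they are not CPC's own output format.

```python
def classify_cp_score(score, cutoff=1.0):
    """Label a CP score using the > 1 / < -1 thresholds quoted above."""
    if score > cutoff:
        return "coding"
    if score < -cutoff:
        return "noncoding"
    return "uncertain"  # assumption: anything in between is left undecided


if __name__ == "__main__":
    # Hypothetical sequence IDs and scores, for illustration only.
    scores = {"scaffold_1": 2.3, "scaffold_2": -1.5, "scaffold_3": 0.2}
    for seq_id, score in scores.items():
        print(seq_id, score, classify_cp_score(score))
```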

updated 3.7 years ago by Ram 45k • written 11.0 years ago by Manvendra Singh ★ 2.2k