Question

How to translate CDS to an amino acid sequence

2.5 years ago

Max • 0

How do I translate a CDS to an amino acid sequence? This site shows the CDS features and their translations, but gives no explanation of how to get from one to the other.

translation CDS sequencing • 4.1k views

ADD COMMENT • link updated 22 months ago by vvasta • 0 • written 2.5 years ago by Max • 0

The link you posted already provides the translated amino acid sequence for each CDS. If you had only the DNA letters and wanted to translate them, you could paste them into a tool such as https://web.expasy.org/translate/

ADD REPLY • link 2.5 years ago by cmdcolin ★ 4.0k

I tried several of the online DNA translation tools but have not found one that outputs a DNA/protein alignment with nucleotide numbering, with the DNA and protein sequences aligned line by line.

Thanks

Valeria

ADD REPLY • link 22 months ago by vvasta • 0
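
A layout like the one asked for in this reply (nucleotide numbers, with the protein aligned under the DNA line by line) can be produced with a short script. Here is a minimal sketch in Python, assuming the standard genetic code and a sequence that starts in frame. The names `format_translation` and `width` are invented for this example and do not come from any of the tools mentioned in the thread.

```python
from itertools import product

BASES = "TCAG"
# Amino acids of the standard genetic code (NCBI table 1) in the order
# TTT, TTC, TTA, TTG, TCT, ... GGG; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def format_translation(dna, width=60):
    """Return DNA and protein as paired lines, each pair prefixed by the
    1-based position of the first nucleotide on the DNA line."""
    if width % 3 != 0:
        raise ValueError("width must be a multiple of 3")
    dna = dna.upper()
    dna = dna[:len(dna) - len(dna) % 3]  # drop a trailing partial codon
    lines = []
    for start in range(0, len(dna), width):
        chunk = dna[start:start + width]
        codons = [chunk[i:i + 3] for i in range(0, len(chunk), 3)]
        # Pad each one-letter amino acid to three columns so that it sits
        # under the first base of its codon; unknown codons become X.
        protein = "".join(CODON_TABLE.get(c, "X").ljust(3) for c in codons)
        lines.append(f"{start + 1:>6} {chunk}")
        lines.append(f"{'':>6} {protein.rstrip()}")
    return "\n".join(lines)

print(format_translation("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG", width=30))
```

Each amino acid letter is printed under the first base of its codon. Changing `width` (any multiple of 3) controls how many nucleotides appear per line.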

score 0 · Answer 1 · 2022-05-30

Each CDS has a link with the tag "/protein_id", which leads to the corresponding GenBank protein entry.

If what you want is to get all of the proteins into a single FASTA file, go to the Related Information box on the right and click on Protein. This will bring up a list of all proteins in the viral genome. Select all proteins, and then open "Send to" at the upper right. Choose File, Format FASTA, and Create File. This will save your file to disk.

Note that some of these CDS features include gaps (multiple N's) representing regions of this strain for which sequence is not available. These are represented as X's in the proteins.

This entire process can be automated with a few clicks using the BIRCH system. The Sequence Dataset tutorial includes automated extraction of CDS features, followed by translation of DNA to protein.
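
To illustrate the note about gaps: when translating such a CDS yourself, codons that contain N's need a rule of their own. The sketch below is one possible rule, written in Python under the standard code. A codon with N is translated only if every possible base gives the same amino acid, and becomes X otherwise. Whether GenBank resolves such codons in exactly this way is not stated in this thread, and `translate_codon` and `translate_with_gaps` are hypothetical names.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI table 1), codon order TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate_codon(codon):
    """Translate one codon, expanding each N to all four bases."""
    choices = [BASES if base == "N" else base for base in codon]
    possible = {CODON_TABLE.get("".join(p), "X") for p in product(*choices)}
    # Exactly one possible amino acid means the N does not matter (e.g. CTN is L).
    return possible.pop() if len(possible) == 1 else "X"

def translate_with_gaps(dna):
    """Translate a DNA string in frame 1, writing X for unresolvable codons."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(translate_codon(dna[i:i + 3]) for i in range(0, usable, 3))

print(translate_with_gaps("ATGCTNNNNGGATAA"))  # prints MLXG*
```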

score 0 · Answer 2 · 2022-05-30

2.5 years ago

Jeremy ▴ 930

Click on FASTA in your link to get the FASTA nucleotide sequence. You can then copy and paste that sequence into SnapGene Viewer, then click on the "Show translations" button on the left. You can choose which reading frames to view, but SnapGene will color the predicted reading frame in orange.

Alternatively, you can use translate() from the Bioconductor Biostrings package in R. You'll first need to convert your nt sequence to a DNAStringSet. To change the reading frame, trim 1 or 2 nt from the start of the sequence before translating. You can use something like the following:

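library(Biostrings)  # Bioconductor package providing DNAStringSet() and translate()
seq = DNAStringSet('[your sequence]')  # replace the placeholder with your nucleotide sequence
aa = translate(seq, if.fuzzy.codon = 'solve')  # 'solve': translate ambiguous codons when unambiguous, else X
aa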

SnapGene Viewer (free)

Biostrings

ADD COMMENT • link 2.5 years ago by Jeremy ▴ 930
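
The idea of trimming 1 or 2 nt to change the reading frame can also be sketched in plain Python. This is an illustration only, using the standard code. `translate` and `three_frames` are hypothetical helper names defined here, not Biostrings or SnapGene functions.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI table 1), codon order TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna):
    """Translate from the first base, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))

def three_frames(dna):
    """Translate the three forward reading frames, keyed 1, 2 and 3."""
    dna = dna.upper()
    # Dropping 0, 1 or 2 leading nucleotides shifts the reading frame.
    return {offset + 1: translate(dna[offset:]) for offset in range(3)}

for frame, protein in three_frames("ATGGCCTAA").items():
    print(frame, protein)
```

For a CDS taken directly from a GenBank record, frame 1 is normally the one you want. The other two frames are mainly useful when the start of the sequence is uncertain.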

But what if I wanna translate them by myself? Is there a universal translation table or something similar?

ADD REPLY • link 2.5 years ago by Max • 0

There are multiple translation tables, or "genetic codes". Here is a good link showing many of them. Almost all life uses the "standard code", but there are interesting exceptions: https://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi

ADD REPLY • link 2.5 years ago by cmdcolin ★ 4.0k
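
For translating by hand, a translation table is just a lookup from each of the 64 codons to an amino acid or a stop. The sketch below writes out the standard code (NCBI table 1) in Python and, as one example of an exception, the vertebrate mitochondrial code (NCBI table 2), which differs in four codons. It is an illustration rather than a reference implementation, and the names `STANDARD_TABLE`, `VERT_MITO_TABLE` and `translate` are made up for this example.

```python
from itertools import product

BASES = "TCAG"
# Amino acids of the standard genetic code (NCBI table 1), listed in the
# order TTT, TTC, TTA, TTG, TCT, ... GGG; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Vertebrate mitochondrial code (NCBI table 2): identical except for four codons.
VERT_MITO_TABLE = {**STANDARD_TABLE, "AGA": "*", "AGG": "*", "ATA": "M", "TGA": "W"}

def translate(dna, table=STANDARD_TABLE):
    """Translate a DNA (or RNA) string codon by codon from the first base."""
    dna = dna.upper().replace("U", "T")  # accept RNA input as well
    usable = len(dna) - len(dna) % 3     # ignore a trailing partial codon
    # Codons that are not in the table (e.g. containing N) are written as X.
    return "".join(table.get(dna[i:i + 3], "X") for i in range(0, usable, 3))

print(translate("ATGGCCTGATAA"))                   # prints MA**
print(translate("ATGGCCTGATAA", VERT_MITO_TABLE))  # prints MAW*
```

The same sequence gives a different protein under the two tables because TGA is a stop in the standard code but tryptophan in the vertebrate mitochondrial code, which is why the choice of table matters.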

I guess the standard table should work for all cases really well?

ADD REPLY • link 2.5 years ago by Max • 0

Here is a Wikipedia article with codon tables. Any biochemistry textbook will also include a codon table. Remember that the CDS starts with methionine (M) and ends with a stop codon.

DNA and RNA Codon Tables

ADD REPLY • link 2.5 years ago by Jeremy ▴ 930
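
The reminder about the start and stop of a CDS can be turned into a quick sanity check before translating. A minimal sketch in Python, assuming the standard code's stop codons and an ATG start. A partial CDS or one with a non-ATG start would be flagged even though it may be legitimate. `check_cds` is a hypothetical helper name.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code

def check_cds(dna):
    """Return a list of problems; an empty list means the CDS looks complete."""
    dna = dna.upper()
    problems = []
    if len(dna) % 3 != 0:
        problems.append("length is not a multiple of 3")
    if not dna.startswith("ATG"):
        problems.append("does not start with ATG")
    if dna[-3:] not in STOP_CODONS:
        problems.append("does not end with a stop codon")
    # Look for stop codons in frame 1 anywhere before the final codon.
    internal = [i + 1 for i in range(0, len(dna) - 3, 3) if dna[i:i + 3] in STOP_CODONS]
    if internal:
        problems.append(f"in-frame stop codon before the end at position(s) {internal}")
    return problems

print(check_cds("ATGGCCTAA"))     # prints []
print(check_cds("ATGTAAGCCTAA"))
```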

The link you sent is invalid, but I think you wanted to point to this site? How do I know which table I need?

ADD REPLY • link 2.5 years ago by Max • 0

Yes, that's the right site. I believe you're interested in COVID-19 in humans, so you can use the "Standard DNA codon table".

ADD REPLY • link 2.5 years ago by Jeremy ▴ 930