Question

Kozak sequence strength calculation

3

6.2 years ago

Alexandros.Frydas ▴ 30

Hello everyone,

I want to assess the probability that a translation initiation site is actually used for translation (I am interested in the human genome). Based on the literature, I know that one of the most important requirements is a good Kozak sequence. Is anyone aware of a bioinformatics tool that can calculate Kozak consensus strength for a given sequence?

Thanks in advance!

Alex

sequencing Kozak translation ribosome binding • 5.7k views

ADD COMMENT • link updated 3.2 years ago by Alec • 0 • written 6.2 years ago by Alexandros.Frydas ▴ 30

4

I won't post this as an answer since it doesn't answer the question as posed, but is perhaps some food for thought:

To the best of my knowledge, recent studies suggest that pretty much the whole chromosome is transcribed to some degree at any given time, but the levels are obviously modulated. If Shine-Dalgarno sequences in bacteria (closer to my area of expertise) are any indication, there is a relationship between sequence 'identity' and transcriptional/translational activity. However, it's very complicated, as the regulatory sequence itself is not the be-all and end-all.

There may well be existing literature which has benchmarked the transcriptional activity of different sequences, but the problem is that the data will be essentially incomparable between different experiments due to batch effects.

A quick Google search for Kozak sequence effects on transcription turns up articles such as: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5563945/

and

https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0108475

I'm certainly not aware of any tools that do this already. In principle I can see that it would work as a predictable phenomenon: something like a standard curve of sequence distance from the canonical sequence (perhaps) versus transcriptional activity, but it would need to be based on a carefully curated reference set of data. I'm sort of thinking along the lines of a transcription initiation equivalent of the "N-end rule", or a similar kind of benchmarking to the study which showed translational response to different start codons (spoiler alert, ATG doesn't mean sh*t!)
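To make the "sequence distance from the canonical sequence" idea concrete, here is a minimal sketch that counts mismatches between a 10-nt window and a Kozak consensus. The consensus string `GCCRCCATGG` (positions -6 to +4, with R meaning A or G) and the function name `kozak_mismatches` are assumptions chosen for illustration. This is not an existing tool, and an unweighted mismatch count says nothing about actual activity without the curated reference data mentioned above.

```python
# IUPAC codes needed for the consensus used here (R = purine, A or G).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG"}

# Assumed consensus covering positions -6..-1, the ATG, and +4.
CONSENSUS = "GCCRCCATGG"


def kozak_mismatches(context: str) -> int:
    """Count positions where a 10-nt context differs from CONSENSUS."""
    context = context.upper().replace("U", "T")  # accept RNA input too
    if len(context) != len(CONSENSUS):
        raise ValueError(f"expected {len(CONSENSUS)} nt, got {len(context)}")
    return sum(base not in IUPAC[code] for base, code in zip(context, CONSENSUS))


print(kozak_mismatches("GCCACCATGG"))  # 0: matches the consensus exactly
print(kozak_mismatches("TTTTTTATGT"))  # 7: only the ATG itself matches
```

Every position counts equally here. A more realistic distance would weight positions -3 and +4 more heavily, but the weights would have to come from measured data.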

The only other thing that springs to mind is the Softberry site, which has several tools for promoter sequence analysis (though this usually only means predicting their locations in sequences), but maybe there's something there you can make work:

http://www.softberry.com/berry.phtml?topic=products&no_menu=on

ADD REPLY • link 6.2 years ago by Joe 21k

0

Hello Mr. Healey,

The two articles you referred me to are indeed some good extra knowledge and I have already gone through both.

Thanks a lot for sharing your thoughts on this issue. I will also have a look at the site you mentioned, because I was not aware of it (I am a newbie when it comes to bioinformatics tools in general).

Thanks a lot !

ADD REPLY • link 6.2 years ago by Alexandros.Frydas ▴ 30

0

Hello,

A little late, but this website may be what you are looking for:

https://www.tispredictor.com/tis

It calculates Kozak sequence strength (referenced as Kozak Similarity Score) for each predicted initiation codon in a given sequence.

Here is the associated paper:

https://www.biorxiv.org/content/10.1101/2021.08.17.456657v1

I would recommend reading “Kozak similarity score algorithm” and “KSS as a reference for likelihood of translation initiation” in the Results section to understand the scoring metric.

The paper is still a preprint, so I would also keep that in mind.
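For a rough sense of what scoring every candidate initiation codon in a sequence involves, the sketch below scans for ATGs and labels each with the classic rule of thumb: a purine at -3 plus a G at +4 is "strong", one of the two is "adequate", neither is "weak". This is explicitly not the Kozak Similarity Score from the preprint, and the function name `classify_start_sites` is hypothetical.

```python
def classify_start_sites(seq: str):
    """Return (index, context, label) for each ATG with enough flanking sequence.

    index is the 0-based position of the A in ATG; context spans -3..+4.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA input too
    labels = ("weak", "adequate", "strong")
    results = []
    start = seq.find("ATG")
    while start != -1:
        # Positions -3 and +4 must both exist (the A of ATG is +1).
        if start >= 3 and start + 3 < len(seq):
            purine_at_minus3 = seq[start - 3] in "AG"
            g_at_plus4 = seq[start + 3] == "G"
            hits = int(purine_at_minus3) + int(g_at_plus4)
            results.append((start, seq[start - 3:start + 4], labels[hits]))
        start = seq.find("ATG", start + 1)
    return results


for site in classify_start_sites("GCCACCATGGCTTTTTTTATGTAA"):
    print(site)
# (6, 'ACCATGG', 'strong')
# (18, 'TTTATGT', 'weak')
```

ATGs closer than 3 nt to the 5' end, or with no base at +4, are skipped rather than guessed at.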

ADD REPLY • link 3.2 years ago by Alec • 0