Question

Can someone help explain this part of a lecture on gene-finding HMMs?


17 months ago

Mark ▴ 30

I'm trying to learn computational biology, and in this lecture the professor describes an example of how a gene-finding prediction model works. He then explains why a model should be designed to predict along the plus and minus strands simultaneously, rather than designing a single-strand model and running it on each strand separately. He says false positives can arise because, within a gene region, codon frequencies are non-random on both strands: the codons on the coding strand cast a "shadow" on the opposite strand, which can fool a predictor that has no access to the other strand's information into reporting genes that are not there.

I don't understand this at all. If the opposite strand is non-coding, why would the codons on the gene-containing strand "cast a shadow" on it that could fool a prediction algorithm? Shouldn't the base triplets on the other strand be nonsense, with no gene-like patterns that could fool a gene-prediction HMM?

This is the link to the part of the lecture I'm talking about: https://youtu.be/uBZAWM612_E.

I don't understand this "codon shadow" idea, or the idea of codons being linked at the third position. Can someone please help clarify this for me?
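One way to probe the "shadow" claim is a minimal Python sketch. The toy sequence and the helper names `reverse_complement` and `codons` are assumptions made up for illustration; nothing here comes from the lecture. It only checks a basic fact: every in-frame triplet on the opposite strand is the reverse complement of a codon on the coding strand, so any codon-usage bias on one strand reappears on the other.

```python
from collections import Counter

# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def codons(seq: str, frame: int = 0) -> list:
    """Split seq into complete triplets starting at offset `frame`."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


# Toy ORF, invented for illustration; GCT and GAA are deliberately overused.
gene = "ATGGCTGCTGAAGCTCTGGCTGAATAA"

plus = codons(gene)
# The length is a multiple of 3, so frame 0 on the minus strand
# covers exactly the same triplets as frame 0 on the plus strand.
minus = codons(reverse_complement(gene))

print("plus :", Counter(plus).most_common())
print("minus:", Counter(minus).most_common())

# Walk the minus-strand triplets back to front so they line up
# with the plus-strand codons they are paired with.
for p, m in zip(plus, reversed(minus)):
    # Each minus-strand triplet is the reverse complement of a plus codon.
    assert m == reverse_complement(p)
    # The third base of the plus codon pairs with the first base
    # of the matching minus-strand triplet.
    assert m[0] == reverse_complement(p[2])
```

In this toy case the triplet counts on the minus strand are a relabelled copy of the biased counts on the plus strand, so the opposite strand is not random sequence. That may be the effect a single-strand predictor runs into, although whether it is enough to produce a false positive depends on the model.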

HMM gene-prediction • 605 views

ADD COMMENT • link updated 17 months ago by Ram 45k • written 17 months ago by Mark ▴ 30


Take a moment to validate or comment on all the questions you asked before:

ADD REPLY • link 17 months ago by Pierre Lindenbaum 165k