Can someone help explain this part of a lecture on gene-finding HMMs?
13 months ago
Mark ▴ 20

I'm trying to learn computational biology, and in this lecture the professor describes an example of how a gene-finding prediction model works. He then explains why a model should be designed to predict along the plus and minus strands simultaneously, rather than designing a single-strand model and running it on each strand separately. He says false positives can arise because, within a gene region, codon frequencies are non-random on both strands: the codons on the coding strand cast a "shadow" on the opposite strand, and this can fool a predictor that has no access to the other strand's information into reporting genes that aren't there.

I don't understand this at all. If the opposite strand is non-coding, why would the codons on the gene-containing strand "cast a shadow" on it that could fool a prediction algorithm? Shouldn't the base triplets on the other strand be nonsense that contains no gene-like patterns, and so would not fool a gene-prediction HMM?

This is the link to the part of the lecture I'm talking about: https://youtu.be/uBZAWM612_E.

I don't understand this "codon shadow" idea, or the idea of codons being linked at the 3rd position. Can someone please help clarify this for me?
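
For anyone who wants to poke at the question numerically, here is a minimal Python sketch. It is not taken from the lecture. It builds a toy "gene" from a handful of codons with made-up usage weights (`TOY_CODONS`, `TOY_WEIGHTS`, and `make_toy_gene` are all hypothetical names and values), reverse-complements it, and counts the triplets that appear on the opposite strand.

```python
import random
from collections import Counter

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Hypothetical codon set and usage weights, chosen only for illustration.
TOY_CODONS = ["GCC", "CTG", "GAG", "AAG", "ACC"]
TOY_WEIGHTS = [5, 4, 3, 2, 1]


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def codons(seq, frame=0):
    """Split seq into complete triplets starting at the given frame offset."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def make_toy_gene(n_codons, seed=0):
    """Build a toy coding sequence by sampling codons with biased weights."""
    rng = random.Random(seed)
    return "".join(rng.choices(TOY_CODONS, weights=TOY_WEIGHTS, k=n_codons))


if __name__ == "__main__":
    gene = make_toy_gene(300, seed=1)
    plus = Counter(codons(gene))
    minus = Counter(codons(reverse_complement(gene)))
    print("plus strand :", plus.most_common())
    print("minus strand:", minus.most_common())
```

In this toy setup the sequence length is a multiple of three, so every plus-strand codon lines up with exactly one minus-strand triplet, namely its reverse complement. The opposite strand therefore uses only five distinct triplets, with the same skewed counts as the coding strand, rather than looking like random sequence. The third base of each codon also ends up, complemented, as the first base of its shadow triplet. Whether this is exactly the effect the professor means is an assumption on my part.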

HMM gene-prediction • 518 views