Could you please give your opinion on the following issue: my final thesis involves finding protein homology among human, golden hamster and mouse genes, and this raises a question: is there any way to formulate the research topic as a mathematical problem (perhaps not a fully rigorous one)?
I've decided that my work could be titled "Prediction of conserved regions in the protein sequences of human, golden hamster and mouse" (I've omitted some unimportant details). There is also the issue that the golden hamster genome is not fully sequenced, so my work will have more theoretical than practical significance.
My specialization is applied mathematics and informatics, which is why the topic of my final work has to meet this requirement.
If you are planning on developing new algorithms for detecting homology, that is certainly a mathematical problem. There is already a large body of literature on homology finding and on various techniques for comparative genomics, synteny comparisons, phylogenetics, differentiating orthologs from paralogs, genome annotation by orthology, etc. All of those are very deep areas of research, ranging from the more mathematical or computer science end of the spectrum to simply using existing tools and algorithms to answer specific biological questions. Of course, given how closely related your three organisms are, and how much depth and detail exist for both the human and mouse genomes (including lots of data on conservation of genomic regions between the two species), the study of the Chinese hamster genome is a problem that is very tractable with existing algorithms and software, and it leans more towards the biology end of the bioinformatics spectrum than the math end. That said, if you want to develop new algorithms for this work, the human and mouse comparison is a great way to benchmark your algorithm against existing ones, and you can then use the Chinese hamster genome to make new predictions with your algorithm and software.
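To give a sense of what the "mathematical" end looks like, pairwise global alignment can be stated as an optimization problem: find the alignment of two sequences that maximizes a score. Below is a minimal sketch of the classic Needleman-Wunsch dynamic programming recurrence. The function name and the scoring values (match, mismatch, linear gap penalty) are illustrative assumptions, not the parameters of any particular tool; real protein aligners use substitution matrices and affine gap penalties.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of sequences a and b.

    Scoring values are illustrative assumptions: a fixed match/mismatch
    score and a linear gap penalty.
    """
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    # Row 0: aligning the empty prefix of a against b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against the empty prefix of b costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Align a[i-1] with a gap in b.
            up = prev[j] + gap
            # Align b[j-1] with a gap in a.
            left = curr[j - 1] + gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(needleman_wunsch_score("ACGT", "AGT"))
```

A new homology detection method would typically change the scoring model or the search strategy, and could then be compared against this kind of baseline.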
Thank you for the explanation. I've been thinking about the fact that there are many predicted proteins for the golden hamster. What if I estimate the probability that predicted golden hamster protein sequences have conserved regions identical to regions in human and mouse protein sequences? In that case I would even get an exact value.
So, what do you think about "Estimation of the probability of the existence of similar conserved regions in the protein sequences of human, golden hamster and mouse"? Is that a bad idea or not?
To be honest? Yes, it sounds like a bad idea. It is rather straightforward to determine conserved and homologous regions between orthologous proteins, especially when the organisms are very closely related. We expect more than 90% of the protein-coding regions to be nearly identical between humans and Chinese hamsters. I forget what the exact percent identity is between humans and mice, for instance, but even at the genomic level the rough estimate is well over 90%.
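For reference, percent identity is simple to compute once two orthologous proteins have been aligned. Here is a minimal sketch, assuming the two sequences are already aligned to the same length with "-" as the gap character. The function name and the choice to ignore gapped columns are assumptions for illustration; other tools define the denominator differently.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the ungapped columns of a pairwise alignment.

    Assumes both inputs are already aligned (equal length). Columns where
    either sequence has a gap are excluded from the calculation.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences have a residue.
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    # Made-up toy sequences, not real proteins.
    print(percent_identity("MKT-AY", "MKSLAY"))
```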
OK, you've made things clear for me. Thank you very much, Dan.