5.3 years ago
Please help me with my project: I have to write an R program to predict genes in eukaryotes using homology, and I haven't found much information.
Could you recommend a book or explain step by step how to do this task? I would appreciate it very much.
Greetings from Peru!
Why R?
Unless there’s already a Bioconductor package for gene prediction, R is not the ideal language to do a task like this in.
Do you have to implement a tool from scratch or can you use existing tools?
I need to implement a tool from scratch, basically as an R package. I have only 12 hours to do it. I have searched a lot, but what I found does not cover this topic in depth. I only know that BLAST could do the comparison, since the project involves homology, but I do not know how to start. I have several problems: first, a homology-based approach needs a reliable database, and I only have a simple PC.
If possible, could you give me an idea of the process in pseudocode, or recommend some books?
Sorry for my bad English, I'm a native Spanish speaker :(
This sounds like a course assignment. If so, the problem must have been simplified; otherwise, since the topic is not simple, I doubt anyone could write a scientifically sound program from scratch in 12 hours. Have you searched for articles on the topic? One of the first hits is this paper whose introduction points to various other methods and tools. Some possible approaches to the problem:
In any case, there'll be quite some pre- and post-processing to do.
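To give an idea of what such pre-processing might look like, here is a minimal sketch that translates a genomic sequence in all six reading frames and collects the stop-free stretches as candidate coding segments. It is written in Python only for illustration (the same logic ports to R), it assumes the standard genetic code, and the function names and the `min_len` cutoff are hypothetical choices, not taken from any existing tool.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate DNA codon by codon; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_orfs(dna, min_len=3):
    """List (strand, frame, peptide) for every stop-free stretch of at least
    min_len residues in the six reading frames (min_len is an arbitrary,
    hypothetical cutoff)."""
    dna = dna.upper()
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            protein = translate(seq[frame:])
            # Stop codons ('*') delimit candidate coding segments.
            for segment in protein.split("*"):
                if len(segment) >= min_len:
                    orfs.append((strand, frame, segment))
    return orfs


for hit in six_frame_orfs("ATGGCCAAATAG"):
    print(hit)
```

Stop-to-stop segments are used here rather than ATG-to-stop ORFs because internal eukaryotic exons need not begin with a start codon. A real predictor would still have to deal with introns and splice sites, which this sketch ignores.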
You can also get some inspiration from the Ensembl coding gene annotation pipeline. One of the steps is the use of GeneWise to directly align protein HMMs to genomic DNA.
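As a much simpler stand-in for that idea, the sketch below scores a known protein against candidate peptides (for example, translated reading frames) with a plain Smith-Waterman local alignment. This is not what GeneWise does: GeneWise uses HMMs and models introns and frameshifts, none of which appear here. The function names, the example sequences, and the scoring values (match +2, mismatch -1, gap -2) are arbitrary assumptions for illustration; a real tool would use a substitution matrix such as BLOSUM62.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score of strings a and b,
    using a linear gap penalty and keeping only two rows in memory."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: scores never drop below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def best_hit(query, candidates):
    """Return (score, candidate) for the best-scoring candidate peptide."""
    return max((local_alignment_score(query, c), c) for c in candidates)


# Hypothetical example: one known protein against three candidate peptides.
known_protein = "MAKLV"
candidates = ["GGMAKLVGG", "PPPPP", "MAKQV"]
print(best_hit(known_protein, candidates))
```

Candidates scoring above some threshold would then be kept as putative homologous coding regions; choosing that threshold is part of the post-processing.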
There are different ways to go about this task and already many tools for it. For a general intro to eukaryotic gene prediction, see the review 'A beginner's guide to eukaryotic genome annotation'. To find homology-based methods, google for homology-based gene finding.