finding missing genes in annotated draft genomes
10.0 years ago
robjohn70000 ▴ 160

Hi,

What is the best bioinformatics approach to find missing genes and their CDS coordinates in an annotated draft genome?

Thanks

sequencing Assembly • 2.7k views
10.0 years ago
seidel 11k

The answer depends on how the "known" (i.e. non-missing) genes were found, so your question needs more detail: how was the current annotation generated, and what is your definition of "missing"? That said, two approaches come to mind. The first is to find RNA-Seq data and either assemble it (e.g. with Trinity) or take it through the Tuxedo suite (TopHat/Cufflinks). That can reveal expressed genes that are absent from your current set. The second is simply to use an alternative gene predictor, something different from the one that gave you your current set.


Thanks Seidel. The annotation was done using RAST. Sequences of the supposedly "missing" genes from a reference genome were BLASTed against the contigs of the annotated draft genome. Although good matches between the reference gene sequences and the draft genome contigs were found, it's unclear how to link the match positions to the RAST CDS coordinates.
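
One way to link the match positions to the CDS coordinates is a plain interval-overlap check per contig. Below is a minimal sketch, not a definitive implementation. It assumes the BLAST results were saved in the standard 12-column tabular format (`-outfmt 6`) with the reference genes as queries and the draft contigs as subjects, that the RAST annotation was exported as GFF3, and that contig IDs are identical in both files. The function names and the `min_overlap` threshold are made up for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Interval(NamedTuple):
    contig: str
    start: int   # 1-based, inclusive, always start <= end
    end: int
    strand: str
    name: str


def parse_blast_tab(lines):
    """Read BLAST tabular (outfmt 6) lines; subject coordinates become intervals."""
    hits = []
    for line in lines:
        row = line.rstrip("\n").split("\t")
        if len(row) < 12 or row[0].startswith("#"):
            continue
        sstart, send = int(row[8]), int(row[9])
        # BLAST reports minus-strand hits with sstart > send, so normalise.
        strand = "+" if sstart <= send else "-"
        hits.append(Interval(row[1], min(sstart, send), max(sstart, send), strand, row[0]))
    return hits


def parse_gff_cds(lines):
    """Read GFF3 lines and keep only CDS features, grouped by contig."""
    cds_by_contig = defaultdict(list)
    for line in lines:
        row = line.rstrip("\n").split("\t")
        if len(row) < 9 or row[0].startswith("#") or row[2] != "CDS":
            continue
        cds_by_contig[row[0]].append(
            Interval(row[0], int(row[3]), int(row[4]), row[6], row[8])
        )
    return cds_by_contig


def link_hits_to_cds(hits, cds_by_contig, min_overlap=0.5):
    """Split hits into those covered by an annotated CDS and those that are not.

    min_overlap is the fraction of the hit that must lie inside a CDS
    (an arbitrary illustrative default, tune it for your data).
    """
    linked, unannotated = [], []
    for hit in hits:
        hit_len = hit.end - hit.start + 1
        matches = []
        for cds in cds_by_contig.get(hit.contig, []):
            overlap = min(hit.end, cds.end) - max(hit.start, cds.start) + 1
            if overlap > 0 and overlap / hit_len >= min_overlap:
                matches.append(cds)
        if matches:
            linked.append((hit, matches))
        else:
            unannotated.append(hit)
    return linked, unannotated


if __name__ == "__main__":
    # Hypothetical file names; replace with your own BLAST output and RAST export.
    with open("ref_genes_vs_contigs.tsv") as blast, open("rast_annotation.gff") as gff:
        linked, unannotated = link_hits_to_cds(parse_blast_tab(blast), parse_gff_cds(gff))
    for hit, cds_list in linked:
        print(hit.name, "->", ", ".join(c.name for c in cds_list))
    for hit in unannotated:
        print(hit.name, "has no overlapping CDS:", hit.contig, hit.start, hit.end, hit.strand)
```

Hits that end up in `unannotated` are the candidates for genes that RAST did not call, and their contig, start, end and strand give approximate coordinates for a new CDS. Hits that are linked tell you which existing RAST feature already covers the reference gene. A linear scan over the CDS list is fine for a bacterial genome; for larger annotations an interval tree or a tool like bedtools would be a better fit.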

8.7 years ago

Hi, this question was raised more than a year ago. However, if anyone is still interested in this topic, here is another answer: homology-based gene prediction might be useful in your case. Several options are listed in this thread: Repairing Old Genomes With Homology Based Gene Prediction
