7.5 years ago
Karyo
Hi, I am looking for a tool or method to find a full ORF/gene, starting at a start codon (ATG) and ending at any stop codon, in a genome when only a partial nucleotide sequence from that genome is given. I can locate the partial sequence on the genome with BLASTN, and I expect the regions upstream and downstream of the match to contain the start and stop codons. Are there any tools available for this? I can only find gene annotation tools.
Similar questions get asked here often, but you need to be more specific: do you want to find any ORF, or do you want to extract the sequence of a homologue? If the latter, exonerate is the most flexible tool for extracting the best path from a given protein or transcript sequence.
Thank you for the comment. Actually, I also obtained the partial sequence with exonerate, using the protein2genome option and a peptide query that begins with M and ends with * (the stop symbol). But the result did not include the ATG or the stop codon. So I want to extend the matched genome sequence upstream and downstream to see whether a start or stop codon is present.
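The extension described above can be sketched in a few lines of Python. This is only a minimal illustration, not an established tool: the function name `extend_to_orf` and its arguments are hypothetical, and it assumes the match is on the forward strand, is already in frame (its length is a multiple of 3), and that the gene has no introns. For a reverse-strand match you would reverse-complement the genome and convert the coordinates first.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def extend_to_orf(genome, start, end):
    """Extend an in-frame forward-strand match genome[start:end] to a full ORF.

    Coordinates are 0-based, half-open. Returns (orf_start, orf_end), with the
    stop codon included, or None if no in-frame ATG or stop codon is found.
    """
    genome = genome.upper()
    if (end - start) % 3 != 0:
        raise ValueError("match length must be a multiple of 3 (in frame)")

    # Downstream: step codon by codon from the match start to the first
    # in-frame stop codon.
    orf_end = None
    for i in range(start, len(genome) - 2, 3):
        if genome[i:i + 3] in STOP_CODONS:
            orf_end = i + 3
            break
    if orf_end is None:
        return None

    # Upstream: step backwards codon by codon until an in-frame stop codon
    # (or the sequence start) is reached, remembering the most upstream ATG.
    orf_start = None
    i = start
    while i >= 0:
        codon = genome[i:i + 3]
        if codon in STOP_CODONS:
            break
        if codon == "ATG":
            orf_start = i
        i -= 3
    if orf_start is None:
        return None
    return orf_start, orf_end

# Toy example: the match "CCCGGG" sits at positions 14..20.
genome = "CCTAAGGGATGAAACCCGGGTTTTAGCC"
coords = extend_to_orf(genome, 14, 20)
if coords:
    print(coords, genome[coords[0]:coords[1]])
```

Keeping the most upstream ATG gives the longest possible ORF; whether that is the real start codon is a separate question that sequence alone does not answer.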
Try playing with exonerate's global optimization parameters. If that does not give you a complete path, it is unlikely that the complete CDS can be reconstructed. I am not sure, however, how it deals with cases where the sequence is fragmented over several contigs.
Take a look at this link: https://a-little-book-of-r-for-bioinformatics.readthedocs.io/en/latest/src/chapter7.html