The problem I am facing is that I am getting warnings like these:
Warning: Sequence 280 "DNEG10010009 " does not begin with a recognised start codon.
Warning: Sequence 280 "DNEG10010009 " has 16 internal stop codon(s)
The input I provided is a set of nucleotide sequences, one for each gene of Bacillus subtilis 168.
My question is: how do I solve this error? All I really want to do is calculate values such as the codon adaptation index, the codon bias index, the number of optimal codons and so on for each gene of Bacillus subtilis 168, and I am using the software CodonW for that. I would also like to know whether the sequences I am providing as input are correct and, if not, what I should provide instead.
My other question is whether I should input open reading frames for codon analysis. If a particular gene sequence has multiple ORFs, which one should be chosen?
Here are some sequences for which I am getting an error. Please tell me how to solve it. I think I need to convert them to ORFs before giving them as input, but I am not sure. I only get these errors for the non-essential genes of Bacillus subtilis 168.
>DNEG10010082
TTGATAGGGCAGAAAGCTTGGGTGAACATTGGCAAGACCGAATTCATCTTGCTTCTTGTC
GTTGGAATTTTAACCATCATCAATGTACTAACAGCAGACGGAGAAAAGCGTACATTTCAT
TCTCCTAAGAAAAAGAATATCAATCATTTAACCCTTTATGATTGCGTATCTCCGGAAGTT
CAGAACAGTATAAACGAAACAGGGCGTGTGACAAACTTCTTTTGA
>DNEG10010083
ATGAATCAAAATCAGTTGATATCGGTAGAGGATATCGTATTTCGATATCGGAAGGACGCA
GAAAGACGAGCACTAGACGGCGTCTCCCTGCAGGTGTATGAGGGTGAATGGCTTGCAATC
GTAGGTCATAACGGTTCAGGGAAATCAACACTGGCCCGGGCATTGAATGGTTTAATTCTT
CCTGAATCAGGCGACATTGAGGTTGCCGGGATTCAATTGACAGAGGAATCTGTTTGGGAA
GTGCGTAAGAAGATAGGTATGGTCTTTCAAAATCCGGATAACCAATTTGTCGGAACGACT
GTTCGCGATGATGTGGCTTTTGGTTTAGAAAACAATGGTGTACCGCGGGAAGAAATGATT
GAGAGAGTAGACTGGGCAGTAAAACAGGTGAATATGCAAGATTTTCTCGATCAAGAGCCG
CACCATCTCTCCGGAGGCCAAAAGCAGAGAGTTGCGATTGCGGGGGTTATTGCCGCACGT
CCTGATATTATTATCTTAGATGAAGCAACATCCATGCTTGATCCGATCGGGCGAGAAGAA
GTGCTTGAAACGGTAAGACATTTAAAAGAGCAGGGCATGGCGACTGTCATATCCATTACA
CATGACCTGAATGAGGCAGCAAAAGCAGACAGGATCATTGTCATGAATGGCGGTAAAAAA
TATGCTGAAGGGCCGCCTGAAGAGATTTTTAAATTGAATAAAGAACTTGTTCGAATTGGG
CTTGATTTACCCTTCTCATTCCAGCTTAGCCAGCTTTTAAGAGAAAATGGACTGGCTTTG
GAAGAAAACCATTTGACTCAGGAAGGGCTGGTGAAAGAGCTGTGGACATTACAATCAAAG
ATGTAG
Which codons does CodonW consider valid? I would expect the program to handle start codons other than ATG, which would suggest that your sequences are malformed in some way.
Some things to check, off the top of my head:
- Are you uploading/running with the right sequence format? Does it expect FASTA headers, for example? It may be tripping up on the ">" character.
- Check that your sequences are what you think they are (BLAST a couple, perhaps, to ensure you have the full sequence covered).
- As Istvan suggested, perhaps you're out of frame, or missing parts of the sequence.
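The checks above can be sketched in a few lines of Python before handing anything to CodonW. This is only a rough illustration, not CodonW's own logic: the function names `read_fasta` and `check_cds` are made up for this example, and the start codon set (ATG, GTG, TTG) is an assumption based on the start codons most commonly seen in bacteria; which codons CodonW itself accepts is not confirmed here.

```python
# Assumed start codons (common bacterial ones); CodonW's own list may differ.
START_CODONS = {"ATG", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def read_fasta(text):
    """Parse FASTA text into {record_id: sequence} (hypothetical helper)."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            # Use the first word of the header as the record id.
            name = header[0] if header else ""
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def check_cds(seq):
    """Report basic coding-sequence sanity checks for one sequence."""
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    return {
        "length_multiple_of_3": len(seq) % 3 == 0,
        "starts_with_start_codon": bool(codons) and codons[0] in START_CODONS,
        "ends_with_stop_codon": bool(codons) and codons[-1] in STOP_CODONS,
        # Stops anywhere except the final codon count as internal.
        "internal_stops": sum(c in STOP_CODONS for c in codons[:-1]),
    }


if __name__ == "__main__":
    example = ">demo_record some description\nATGAAACCC\nTAA\n"
    for record_id, sequence in read_fasta(example).items():
        print(record_id, check_cds(sequence))
```

Running something like this over the whole input file and listing only the records that fail a check should show whether the problem is limited to a few records or affects the file as a whole.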
You should upload some examples of the sequences you're trying to use before we can help you properly, though; we can't say anything for certain without seeing what you're working with.
You need to show us the sequence DNEG10010009, which is the one the program is complaining about in the error you copied.
Those 2 sequences, at least, are fine as far as I can tell. DNEG10010083 starts with an ATG, which is the canonical start codon, and DNEG10010082 starts with TTG which is also a legitimate start codon. So unless CodonW isn't able to understand start codons other than ATG (which I would find pretty unlikely if the software is even remotely decent), then it must be a problem with (at least) the DNEG10010009 record.
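One way to narrow down what is wrong with a flagged record such as DNEG10010009 is to count stop codons in all six reading frames. This is a minimal sketch under stated assumptions: `reverse_complement` and `stops_per_frame` are names invented for this example, only the standard stop codons TAA, TAG and TGA are considered, and the input is assumed to contain only A, C, G and T. If the frame the sequence is supplied in has many internal stops while another frame or the reverse strand has none, the record is probably out of frame or stored on the opposite strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def stops_per_frame(seq):
    """Count internal stop codons in each of the six reading frames.

    Keys are (strand, offset) pairs, e.g. ("+", 0) is the sequence as given.
    The last codon of each frame is ignored because a terminal stop is expected.
    """
    seq = seq.upper()
    result = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            codons = [s[i:i + 3] for i in range(offset, len(s) - 2, 3)]
            result[(strand, offset)] = sum(c in STOP_CODONS for c in codons[:-1])
    return result


if __name__ == "__main__":
    # Toy sequence for illustration only, not a real B. subtilis gene.
    for frame, count in sorted(stops_per_frame("ATGAAACCCTAA").items()):
        print(frame, count)
```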