Hello,
I am trying to generate a nucleotide motif that codes only for chosen amino acids. For example, histidine is coded by CAT and CAC. Arginine is coded by CGT, CGC, CGA, CGG, AGA and AGG. The pattern covering all of these codons is:
1st position in codon: C or A
2nd position in codon: A or G
3rd position in codon: A, T, C or G
That rule covers the chosen amino acids (H and R), but also amino acids that I don't want (for example, AAA is lysine, AAT is asparagine...). So I need to define a pattern that matches only my chosen amino acids. In the case above it can be [C][A or G][T]: that pattern codes only for histidine and arginine and for no other amino acids. I am trying to work out an algorithm that will do this for any set of amino acids I choose (more than two), and if no such pattern exists, it should find patterns for fewer amino acids (for example, if a pattern for five amino acids does not exist, it will find patterns for four amino acids from the query). This final optimization problem is probably the hardest part. Any suggestions? Thanks a lot, and sorry for my poor English.
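One way to make the search described above concrete is plain brute force. The sketch below is only an illustration, assuming the standard genetic code and treating a pattern as one set of allowed bases per codon position. The names CODON_TABLE, best_patterns and format_pattern are hypothetical, made up for this example. There are only 15 non-empty base sets per position, so 15^3 = 3375 patterns in total, and each one can be expanded and checked directly. Keeping the patterns that cover the most chosen amino acids also handles the fallback to fewer amino acids.

```python
from itertools import combinations, product

BASES = "TCAG"
# Standard genetic code; codons run TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO)
}

# All 15 non-empty sets of bases that one codon position may allow.
SUBSETS = [frozenset(c) for r in range(1, 5) for c in combinations(BASES, r)]


def best_patterns(wanted):
    """Return (pattern, encoded) pairs covering the most wanted amino acids.

    A pattern is kept only if every codon it expands to codes for an
    amino acid in `wanted` (one-letter codes), so stop codons and
    unwanted amino acids are excluded automatically.
    """
    wanted = set(wanted)
    best, best_cover = [], 0
    for p1, p2, p3 in product(SUBSETS, repeat=3):
        encoded = {CODON_TABLE[a + b + c] for a in p1 for b in p2 for c in p3}
        if not encoded <= wanted:
            continue  # pattern also produces something outside the query
        if len(encoded) > best_cover:
            best, best_cover = [], len(encoded)
        if len(encoded) == best_cover:
            best.append(((p1, p2, p3), encoded))
    return best


def format_pattern(pattern):
    """Render a pattern such as [C][AG][CT]."""
    return "".join("[" + "".join(sorted(pos)) + "]" for pos in pattern)


# Usage: histidine (H) and arginine (R), as in the question.
for pattern, encoded in best_patterns("HR"):
    print(format_pattern(pattern), "".join(sorted(encoded)))
```

If the query cannot be covered by a single pattern, the result contains patterns for the largest subsets that can be covered. This sketch does not rank patterns of equal coverage or combine several patterns, which a real solution might want to do.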
Hello tretyacv!
It appears that your post has been cross-posted to another site: http://stackoverflow.com/questions/27603128
This is typically not recommended as it runs the risk of annoying people in both communities.
Yes, that's true, I'm sorry. I thought that Stack Overflow is more about computer science and Biostars is a bioinformatics site, so the communities do not overlap.
You are correct in assuming that Biostars specializes in bioinformatics, but people here also have a presence on Stack Overflow because geekiness has no boundaries :)