I've been trying to run the Epitope Conservancy Analysis tool of IEDB, but when I submit the request it returns:
*Format of sequence not recognized*
My sequence is like this:
>KY449055.1_Foot_and_mouth_disease_virus_type_O
TTSTGESADPVTTTVEXYGGETQVQRRQHTDVSFILDRFVKVTPKDQIXVLDLMQTPAHT
LVGALLRTATYYFADLEVAVKHEGXLTWVPXGAPEAALDXTTXPTAYHKAPLTRLALPYT
APHRVLATVYXGXCKYGEGAVTXVRGDLQVLAQKAARTLPTSFXYGAIKATRVTELLYRM
KRAETYCPRPLLAIHPEQARHKQKIVAPVKQLL
What can I do to fix it?
Are you sure of what you are doing?
OMG! Thank you for this note. I almost ruined my analyses... I forgot that N stands for asparagine! Anyway, since replacing the X made it work, the problem must still be the character X. I'm going to try replacing it with something else!
Thanks again!!
If I check the accession number you included in the original post, it leads to a nucleotide sequence record for a partial CDS of a gene. The translation included in the record does NOT contain any X's. In fact, there are N's where you have X's in your sequence.
So you were doing the right thing, but without checking the original sequence. The mystery is how you ended up with the incorrect sequence.
It's because I have a set of about 2000 sequences. Some of them contain unknown amino acids. When I edited the sequences, replacing X with N, the server accepted all of them. So that leads me to believe the problem is the X.
We have established that without a doubt. X is supposed to represent "Unknown" per the IUPAC code. The question is how you ended up with X's in this particular sequence when there should have been N's according to the original NCBI record. Perhaps your other sequences have the same issue?
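With about 2000 sequences, checking by eye is not practical. A minimal sketch of how the records containing X (or any other non-standard residue) could be listed, using only the Python standard library. The function names `parse_fasta`, `nonstandard_positions` and `report` are made up for this illustration, and the script knows nothing about which characters IEDB actually accepts; it only flags anything outside the 20 standard amino acid letters.

```python
# Illustrative helper names; not part of IEDB or any library.
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def nonstandard_positions(seq):
    """Return (1-based position, residue) for each non-standard residue."""
    return [(i, aa) for i, aa in enumerate(seq.upper(), start=1)
            if aa not in STANDARD_AA]


def report(text):
    """Map each header to its non-standard residues; clean records are omitted."""
    result = {}
    for header, seq in parse_fasta(text):
        hits = nonstandard_positions(seq)
        if hits:
            result[header] = hits
    return result
```

Calling `report(open("my_sequences.fasta").read())` (the file name is a placeholder) would return only the records that need a closer look, with the positions to compare against the original NCBI translation.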
I checked the sequence I wrote as an example and there is no X in it. Some mistake may have happened while I was wrongly replacing the letters.
I looked at another sequence that has an X in its translation, and the codon responsible for the X is GCS. According to the IUPAC code, S means 'G or C'. Both GCG and GCC code for alanine, but the translation tool doesn't take that into account and writes X. In another sequence, there's a codon SST, which also leads to X.
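The GCS case can be resolved by expanding the ambiguity code and checking whether every possible codon gives the same amino acid. Here is a rough sketch of that idea, assuming the standard genetic code; the names `translate_codon` and `translate` are invented for this example and are not the tool used in the thread. Note that SST expands to GGT, GCT, CGT and CCT (glycine, alanine, arginine, proline), so X really is the correct call for that codon.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate_codon(codon):
    """Translate one codon; return 'X' only if the ambiguity changes the residue."""
    codon = codon.upper()
    if len(codon) != 3 or any(base not in IUPAC for base in codon):
        return "X"
    # Expand every ambiguous base and collect the residues of all possible codons.
    residues = {CODON_TABLE["".join(bases)]
                for bases in product(*(IUPAC[base] for base in codon))}
    return residues.pop() if len(residues) == 1 else "X"


def translate(dna):
    """Translate a sequence in frame 1, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(translate_codon(dna[i:i + 3]) for i in range(0, usable, 3))
```

With this approach `translate_codon("GCS")` gives alanine, while `translate_codon("SST")` stays X because the four possible codons disagree.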
I see that you are from Brazil, so I assume a difference in your keyboard layout must be causing this. I see X's in the sequence posted in the original question.
I mean I checked the original file. The sequence posted in the original question is wrong, with improper X's in it. I didn't edit the original post, so that the replies still make sense.