Exonerate ** FATAL ERROR **: Unrecognised symbol 'J' (ascii:74)
10.5 years ago
jomaco ▴ 200

Hi,

I am receiving the following error when running exonerate-2.2.0:

** FATAL ERROR **: Unrecognised symbol 'J' (ascii:74)

Here are a couple of examples of where this occurs:

** FATAL ERROR **: Unrecognised symbol 'J' (ascii:74) file:[../../blastdb/evidence_aln_seqs/output2/evidence.seqmasked.cat.1-nopipe.split.4.fa] seq:[gi614497259gbAHX42233.1] pos:[267]

>gi614497259gbAHX42233.1
FNIFIRGLFENMKVEEAISVWQVLQEDGCVADSTTYGVLIHGFCKNGYLNKALWVLTDAE
DKGNDLDVFAYSSIINGLCNARRVDEGFDTLDRMTKHGCKPNAHVCNTLLNGLIRASKID
DAIRFFSEMAAKDCLXTVVTYNTLIDGLCKVERFDEAYLLVKEMLEKGWKPDIITYSLLM
RGLCQDKKIDMALNLWCQVVEKGLKPDVIMHNIIIHGLCSAGKVGDALQLYLRMSQCDCV
PNLVTLNTLMEGFYRARDCLNASVIWARJLRSGLKPDJISYNIVLKGLCSCIDY

** FATAL ERROR **: Unrecognised symbol 'J' (ascii:74) file:[../../blastdb/evidence_aln_seqs/output2/evidence.seqmasked.cat.1-nopipe.split.3.fa] seq:[gi614497313gbAHX42260.1] pos:[302]

>gi614497313gbAHX42260.1
IKGGFEIWELMGRDGVRNIFSFNILIRGLFESRKVEEAISVWQVLQENGCAADSTTYGVL
IHGFCKNGYLNKALLVLTEAENKRNDLDVFAYSSIINGLCNARRVDEAFDMIDRMAKHGC
KPNAHVCNTLLNGLIRASKIEDAIRFFTDMASKNCLPTVVTYNTLINGLCKAERFDVAYL
LVKEMLEKEWKPDIITYSLLMRGLCQGKKIDMALNLWCQVVEKGLKPDVIMHNIIIHGLC
SAGKVEDALQLYLKMSQCDCVPNLVTLNTLMDGFYKARDCLNASVIWARILRSGLKPDII
SYNJVLKGLCSCYRIS

Has anyone else had this problem? I don't understand why 'J' is an unrecognised character.

Hope you can help.

exonerate fatal-error character ascii-74 • 4.5k views
10.5 years ago

J is a placeholder "used in cases where chemical or crystallographic analysis of a peptide or protein cannot conclusively determine the identity of a residue." It does not represent a single proteinogenic amino acid but an ambiguity between isoleucine (I) and leucine (L), and exonerate does not have a translation table for it. Replace it with either leucine or isoleucine.
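
The substitution suggested above could be scripted roughly as follows. This is only a minimal sketch, assuming a plain FASTA file in which header lines start with ">"; the function name `replace_j` and the default replacement "L" are hypothetical choices for illustration, not part of exonerate or of the original workflow.

```python
def replace_j(fasta_text, replacement="L"):
    """Return FASTA text with every J/j residue replaced.

    Header lines (starting with '>') are left untouched, so a 'J'
    in a sequence identifier is never altered.
    """
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Header line: keep exactly as it is.
            out.append(line)
        else:
            # Sequence line: swap the ambiguity code, preserving case.
            line = line.replace("J", replacement.upper())
            line = line.replace("j", replacement.lower())
            out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    example = ">seq_with_J\nVIWARJLRSGLKPDJISYN\n"
    print(replace_j(example), end="")
```

Whether L or I is the better guess cannot be decided from the sequence alone, so the replacement is left as a parameter.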

Thanks, very informative. Perhaps it would be best to remove these sequences, rather than guess (I'm not sure which I would choose)? They form part of a large evidence set so are not necessarily required.

I would not remove it, unless you require 100% agreement from exonerate (which is not recommended, BTW) or a mismatched codon could ruin your analysis.

Sorry, to clarify: I mean removing the whole sequence (i.e. any sequence containing "J") from my dataset. These sequences are part of a set of over 500,000 protein sequences from asterid species related to my species of interest, to be used as evidence alignments, so it should not affect the analysis too much. Would this be OK (as opposed to removing just the character itself)? Thanks again.
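
For anyone wanting to try the whole-sequence removal described here, a rough sketch is given below. It assumes plain multi-line FASTA input; the helper names `parse_fasta` and `drop_records_with_j` are hypothetical and made up for this illustration.

```python
def parse_fasta(text):
    """Split FASTA text into a list of (header, [sequence lines])."""
    records = []
    for line in text.splitlines():
        if line.startswith(">"):
            records.append((line, []))
        elif records and line.strip():
            # Non-empty sequence line belonging to the latest header.
            records[-1][1].append(line.strip())
    return records


def drop_records_with_j(text):
    """Return FASTA text without any record whose sequence contains J."""
    kept = []
    for header, seq_lines in parse_fasta(text):
        # Only the sequence is checked; a 'J' in the header is ignored.
        if "J" in "".join(seq_lines).upper():
            continue
        kept.append(header)
        kept.extend(seq_lines)
    return "\n".join(kept) + ("\n" if kept else "")


if __name__ == "__main__":
    example = ">has_J\nSYNJVLKG\n>clean\nSYNIVLKG\n"
    print(drop_records_with_j(example), end="")
```

The check joins all lines of a record before testing, so a J on any line of a wrapped sequence drops the whole record.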
