Hi,
I am trying to produce .phylip output using ClustalO and ClustalX. I have a test set of 7 viral genomes (30kb) and I have run them through both ClustalO and ClustalX with the default options.
The ClustalX software seems to produce a phylip file which works with other software such as FastTree.
However, ClustalO produces a slightly different output. The output from ClustalO does not work in FastTree and I get the following error:
No sequence in phylip line TCTTGTAGATCTGTTCTCTAAACGAACTTTAAAATCTGTGTGGCTGTCAC
As you can see from the two files, there are some minor differences which could be causing the problem.
ClustalO:
MT084071.1-------------------------------------------------- TCTTGTAGATCTGTTCTCTAAACGAACTTTAAAATCTGTGTGGCTGTCAC
ClustalX:
MT084071.1 ---------- ---------- ---------- ---------- ---------- TCTTGTAGAT CTGTTCTCTA AACGAACTTT AAAATCTGTG TGGCTGTCAC
Is this a bug, or is there a way to get ClustalO to produce 'correctly' formatted .phylip files as ClustalX does?
Many Thanks
Can you double check your post? I'm not convinced your formatting is correct and representative of the actual files.
All versions of Clustal should produce compatible Phylip files AFAIK.
Hi, I have checked the input and the output. I know it seems silly to think the output would be different, but it is.
The output snippets I uploaded are from the final line of the seq IDs. As you can see, the consistent differences are that ClustalO has no gap between the ID and the '---', and that ClustalX has spacing within the sequence lines.
ClustalO
ClustalX
I know it seems crazy but the outputs are different.
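For anyone wanting to check their own file for the first of those differences, here is a minimal Python sketch. It assumes an interleaved file whose first line holds the taxon and site counts and whose names occupy the first 10 columns; the function name and the `short_id` label are made up for illustration, and the sequence fragments are shortened from the snippets above.

```python
def names_without_separator(phylip_text, name_width=10):
    """Return the names in the first block that run straight into the sequence.

    Assumes an interleaved PHYLIP file: a header line with the taxon and
    site counts, followed by one line per taxon.
    """
    lines = [ln for ln in phylip_text.splitlines() if ln.strip()]
    n_taxa = int(lines[0].split()[0])
    first_block = lines[1:1 + n_taxa]
    # A line with no whitespace at all cannot be split into name + sequence.
    return [ln[:name_width] for ln in first_block if len(ln.split(None, 1)) < 2]


# Hypothetical two-sequence example; "short_id" is an invented label.
example = (
    " 2 20\n"
    "MT084071.1----------TCTTGTAGAT\n"
    "short_id  ---------- TCTTGTAGAT\n"
)
print(names_without_separator(example))
```

An empty list means every name in the first block is followed by at least one whitespace character.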
I've edited your post to fix the images, please double check I got them the right way around.
Based on those, ClustalO is not outputting a valid Phylip file. The spacing in the ClustalX version is correct, though I can't say that I've ever experienced an issue with ClustalO, and indeed it's the newer and recommended tool.
I can partially recreate this: when I run clustalo, it produces the 'unbroken' sequences, but it does respect the space between the ID and the sequence start (though this may be because my test IDs are shorter than yours).
Can you share what version of Clustal this pertains to for each?
Yes you fixed it. Thank you.
I used the newest version listed for Linux (1.2.4) at http://www.clustal.org/omega/ and I also used the version from apt install clustalo, which is likewise listed as 1.2.4.
I think the ID spacing is indeed down to length; I can shorten my IDs to fix that. It does seem to be the spaces within the sequence that are needed.
Forgot to add: ClustalX is the version from apt install, which is 2.1.
Thanks again.
I am sure it is a problem with how ClustalO formats its Phylip output. ClustalW produces the same correct output as ClustalX.
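As a possible workaround until that is resolved, the ClustalO file can be rewritten into the layout shown for ClustalX. The Python sketch below is an illustration only: it assumes an interleaved input with names in the first 10 columns, the function name and defaults are invented here, the `seqB` label in the usage example is made up, and indenting the continuation rows is a guess at the ClustalX layout rather than something shown in this thread. It has not been tested against FastTree.

```python
def reformat_phylip(phylip_text, name_width=10, group=10, per_line=50):
    """Rewrite interleaved PHYLIP text with spaced names and grouped columns."""
    lines = [ln for ln in phylip_text.splitlines() if ln.strip()]
    n_taxa, n_sites = (int(x) for x in lines[0].split()[:2])

    names, seqs = [], []
    for i, line in enumerate(lines[1:]):
        if i < n_taxa:
            # First block: fixed-width name, then sequence (spaces removed).
            names.append(line[:name_width].strip())
            seqs.append("".join(line[name_width:].split()))
        else:
            # Later blocks: sequence only, in the same taxon order.
            seqs[i % n_taxa] += "".join(line.split())

    if len(names) != n_taxa or any(len(s) != n_sites for s in seqs):
        raise ValueError("alignment does not match the header counts")

    out = [f" {n_taxa} {n_sites}"]
    for start in range(0, n_sites, per_line):
        for name, seq in zip(names, seqs):
            chunk = seq[start:start + per_line]
            grouped = " ".join(
                chunk[j:j + group] for j in range(0, len(chunk), group)
            )
            # Names appear only in the first block; later rows are indented.
            label = name.ljust(name_width) if start == 0 else " " * name_width
            out.append(f"{label} {grouped}")
        out.append("")  # blank line between blocks
    return "\n".join(out).rstrip("\n") + "\n"


# Hypothetical usage with a tiny two-sequence alignment of 60 columns.
clustalo_like = (
    " 2 60\n"
    "MT084071.1" + "-" * 50 + "\n"
    "seqB      " + "A" * 50 + "\n"
    "\n"
    "TCTTGTAGAT\n"
    "CCCCCCCCCC\n"
)
print(reformat_phylip(clustalo_like))
```

The function raises a `ValueError` if the reassembled sequences do not match the counts in the header, which should catch inputs that are not laid out as assumed.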