I would like to convert a Clustal-formatted alignment into a NEXUS-formatted alignment in Python. I would then like to use the NEXUS file to build a phylogenetic tree with MrBayes (http://mrbayes.sourceforge.net/). How can I do that?
You can do this with Biopython (assuming you have it downloaded and installed):
$ python
>>> from Bio import AlignIO
>>> AlignIO.convert("file.clustal", "clustal", "file.nexus", "nexus")
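For many alignments, the same call can be looped over a folder. This is only a minimal sketch, assuming the files end in .clustal and contain DNA. The names `nexus_path_for` and `convert_folder` and the folder name are made up for illustration. Recent Biopython releases require the `molecule_type` argument when writing NEXUS, while older releases do not accept it, so drop it if your version complains.

```python
from pathlib import Path


def nexus_path_for(clustal_path):
    """Return the output path: same name, with a .nexus extension."""
    return Path(clustal_path).with_suffix(".nexus")


def convert_folder(folder, pattern="*.clustal", molecule_type="DNA"):
    """Convert every Clustal alignment in `folder` to NEXUS; return the outputs."""
    # Imported here so the path helper above works without Biopython installed.
    from Bio import AlignIO

    written = []
    for clustal_file in sorted(Path(folder).glob(pattern)):
        out_file = nexus_path_for(clustal_file)
        # Assumption: a recent Biopython, where molecule_type ("DNA", "RNA"
        # or "protein") is needed to write NEXUS. Remove it on old releases.
        AlignIO.convert(
            str(clustal_file), "clustal", str(out_file), "nexus",
            molecule_type=molecule_type,
        )
        written.append(out_file)
    return written
```

Something like `convert_folder("alignments")` would then write one .nexus file next to each .clustal file.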
Is it possible to remove the quotes around the sequence names (Seq_name) in the output NEXUS file?
I would like to use AlignIO from Biopython for this.
Example:
#NEXUS
begin data;
dimensions ntax=3 nchar=50;
format datatype=dna missing=? gap=-;
matrix
'Seq_name' AGGGA
'Seq_name' AGCGG
'Seq_name' ACTGG
MrBayes gives me an error because of the quotes.
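One possible workaround is to post-process the NEXUS file and strip the quotes after AlignIO has written it. The sketch below is not an AlignIO feature, just plain text handling, and the function names are hypothetical. It assumes each quoted label sits at the start of a matrix line, as in the example above. Because the quotes are usually there since a name contains characters other than letters, digits and underscores, it also replaces such characters with "_" (an assumption about what MrBayes will accept). Labels containing an escaped quote are not handled.

```python
import re

# A line that starts with a single-quoted label, e.g.  'Seq_name' AGGGA
QUOTED_LABEL = re.compile(r"^([ \t]*)'([^']*)'", flags=re.MULTILINE)


def _unquote(match):
    indent, label = match.group(1), match.group(2)
    # Assumption: anything other than letters, digits and "_" becomes "_".
    safe_label = re.sub(r"[^A-Za-z0-9_]", "_", label)
    return indent + safe_label


def strip_label_quotes(nexus_text):
    """Return the NEXUS text with line-initial quoted labels unquoted."""
    return QUOTED_LABEL.sub(_unquote, nexus_text)


def strip_quotes_in_file(path):
    """Rewrite the NEXUS file at `path` in place without label quotes."""
    with open(path) as handle:
        text = handle.read()
    with open(path, "w") as handle:
        handle.write(strip_label_quotes(text))
```

Calling `strip_quotes_in_file("file.nexus")` right after the conversion keeps everything in one Python script. Check that the cleaned names are still unique before running MrBayes.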
Does it have to be Python? There are lots of conversion utilities already available, both online (e.g. here) and command-line (e.g. here).
I would like to do this conversion for many files, so I wanted to automate it with a script. As I am using Python for the multiple sequence alignments, it would be nice if I could use Python for the format conversion as well. The NEXUS file created by NCLconverter, which you also recommended, is not usable by MrBayes 3.1.
trimal does an excellent job at file conversion. It is a command-line tool, so it can be incorporated into your script. It is designed for trimming alignments, but if you omit the trimming options it works purely as a converter.
@akoik063 What would be the command for just the conversion? I am checking it out.
trimal -in $file -out $file <options>
And part of options is:
So, from clustal to nexus: trimal -in $file -out $file -nexus
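Since the rest of the pipeline is in Python, the trimal call can be wrapped with `subprocess`. This is a rough sketch that assumes trimal is on your PATH and uses only the `-in`, `-out` and `-nexus` options quoted above. The function names are hypothetical, and the output is written to a separate .nexus file rather than over the input.

```python
import subprocess
from pathlib import Path


def trimal_command(in_file, out_file):
    """Build the trimal call: no trimming options, NEXUS output."""
    return ["trimal", "-in", str(in_file), "-out", str(out_file), "-nexus"]


def convert_with_trimal(in_file):
    """Convert one alignment and return the path of the NEXUS file."""
    out_file = Path(in_file).with_suffix(".nexus")
    # check=True raises CalledProcessError if trimal exits with an error.
    subprocess.run(trimal_command(in_file, out_file), check=True)
    return out_file
```

Looping `convert_with_trimal` over your alignment files would then handle the whole batch.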
Trimal works! Thanks!
Should be easy enough to incorporate an existing command-line tool into your script; seqret from the EMBOSS suite may be another option. I didn't recommend NCLconverter, it was just an example.
Do you recommend seqret?
I recommend the EMBOSS suite in general, as a great set of small tools, easily incorporated into scripts and pipelines in the "UNIX tradition". I know seqret is a good converter; as to whether it does exactly what you want, you'll have to try it and see!