Convert Clustal To Nexus Format In Python
10.7 years ago

I would like to convert a Clustal-formatted alignment into a NEXUS-formatted alignment in Python. I would then like to use the NEXUS file to build a phylogenetic tree with MrBayes (http://mrbayes.sourceforge.net/). How can I do that?

alignment python tree • 8.7k views

Does it have to be Python? There are lots of conversion utilities already available, both online (e.g. here) and command-line (e.g. here).


I would like to do this conversion for many files, so I want to automate it with a script. As I am already using Python for the multiple sequence alignments, it would be nice to use Python for the format conversion as well. The NEXUS file created by NCLconverter, which you also mentioned, is not usable by MrBayes 3.1.


trimAl does an excellent job at file conversions; it is a command-line tool and can be incorporated into your script. It is designed for trimming alignments, but if you omit the trimming options it works simply as a format converter.


@akoik063 What would be the command for just the conversion? I am checking it out.


trimal -in $infile -out $outfile <options>

The output-format options include:

    -clustal                 Output file in CLUSTAL format
    -fasta                   Output file in FASTA format
    -nbrf                    Output file in NBRF/PIR format
    -nexus                   Output file in NEXUS format
    -mega                    Output file in MEGA format
    -phylip3.2               Output file in PHYLIP3.2 format
    -phylip                  Output file in PHYLIP/PHYLIP4 format

So, from Clustal to NEXUS: trimal -in $infile -out $outfile -nexus
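Since the goal is to convert many files from a Python script, the trimal call above can be wrapped with subprocess. This is only a rough sketch: the function names, the "*.aln" glob pattern and the ".nexus" output suffix are my own assumptions, and it expects the trimal executable to be on your PATH.

```python
import subprocess
from pathlib import Path


def build_trimal_command(clustal_path, nexus_path):
    """Build the argument list for a conversion-only trimal run (no trimming options)."""
    return ["trimal", "-in", str(clustal_path), "-out", str(nexus_path), "-nexus"]


def convert_directory(src_dir, pattern="*.aln"):
    """Convert every alignment matching `pattern` in `src_dir`; return the files written.

    The "*.aln" pattern and the ".nexus" suffix are illustrative choices,
    adjust them to match your own file naming.
    """
    written = []
    for clustal_path in sorted(Path(src_dir).glob(pattern)):
        nexus_path = clustal_path.with_suffix(".nexus")
        # check=True raises CalledProcessError if trimal exits with an error.
        subprocess.run(build_trimal_command(clustal_path, nexus_path), check=True)
        written.append(nexus_path)
    return written
```

Calling convert_directory("alignments") would then write one NEXUS file next to each input alignment in that (hypothetical) directory.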


Trimal works! Thanks!


It should be easy enough to incorporate an existing command-line tool into your script; seqret from the EMBOSS suite may be another option. I didn't recommend NCLconverter; it was just an example.


Do you recommend seqret?


I recommend the EMBOSS suite in general, as a great set of small tools, easily incorporated into scripts and pipelines in the "UNIX tradition". I know seqret is a good converter; as to whether it does exactly what you want, you'll have to try it and see!

10.7 years ago
gammyknee ▴ 210

With Biopython you can do this in two lines (assuming you have it downloaded and installed):

$ python
>>> from Bio import AlignIO
>>> AlignIO.convert("file.clustal", "clustal", "file.nexus", "nexus")

It doesn't work: ValueError: Need a DNA, RNA or Protein alphabet


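Try adding from Bio import Alphabet, then pass alphabet=Alphabet.generic_dna as an extra argument to AlignIO.convert. The NEXUS writer needs to know the sequence type, and the Clustal parser cannot infer it on its own.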
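A note for anyone on a newer Biopython: Bio.Alphabet was removed in release 1.78, and AlignIO.convert takes a molecule_type argument ("DNA", "RNA" or "protein") in place of alphabet. Below is a minimal sketch of the same fix for that API. The helper names and the ".nexus" suffix are my own, and it assumes Biopython 1.78 or later; on older versions use the alphabet argument as described above.

```python
from pathlib import Path


def nexus_path_for(clustal_path):
    """Return the output path: same folder and stem, with a '.nexus' suffix."""
    return Path(clustal_path).with_suffix(".nexus")


def convert_clustal_to_nexus(clustal_path, molecule_type="DNA"):
    """Convert one Clustal alignment to NEXUS and return the path written.

    Assumes Biopython >= 1.78, where `molecule_type` replaced `alphabet`.
    """
    # Imported here so nexus_path_for() stays usable without Biopython installed.
    from Bio import AlignIO

    nexus_path = nexus_path_for(clustal_path)
    AlignIO.convert(
        str(clustal_path), "clustal",
        str(nexus_path), "nexus",
        molecule_type=molecule_type,
    )
    return nexus_path
```

To handle many files, loop over something like sorted(Path("alignments").glob("*.clustal")) and call convert_clustal_to_nexus on each path; pass molecule_type="protein" for amino-acid alignments.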

9.0 years ago

Is it possible to remove the quotes around the sequence names (Seq_name) in the output NEXUS file?
I would like to keep using AlignIO from Biopython for the conversion.


Example:

#NEXUS
begin data;
   dimensions ntax=3 nchar=50;
   format datatype=dna missing=? gap=-;
matrix
'Seq_name'           AGGGA
'Seq_name'           AGCGG
'Seq_name'           ACTGG

MrBayes gives me an error because of the quotes.
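One possible workaround is to post-process the NEXUS text after AlignIO has written it and drop the quotes from the taxon labels in the matrix block. This is a hedged sketch using only the standard library: the function name is my own, it assumes one label per line as in the example above, and it only unquotes labels made of letters, digits and underscores, since names containing spaces or other punctuation need their quotes to remain valid NEXUS.

```python
import re

# A quoted label at the start of a line, made only of letters, digits and
# underscores, followed by whitespace (the lookahead leaves spacing untouched).
_QUOTED_LABEL = re.compile(r"^(\s*)'([A-Za-z0-9_]+)'(?=\s)")


def strip_taxon_quotes(nexus_text):
    """Remove single quotes around simple taxon labels inside the matrix block."""
    out = []
    in_matrix = False
    for line in nexus_text.splitlines():
        stripped = line.strip().lower()
        if in_matrix:
            line = _QUOTED_LABEL.sub(r"\1\2", line)
            if stripped.endswith(";"):
                in_matrix = False  # the matrix is terminated by a semicolon
        elif stripped == "matrix":
            in_matrix = True
        out.append(line)
    return "\n".join(out) + "\n"
```

In practice you would read the file written by AlignIO.convert, pass its contents through strip_taxon_quotes, and write the result back before giving it to MrBayes.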
