"out" parameter for _Fasttree module in Bio.Phylo.Applications from Biopython 1.68 doesn't work
7.3 years ago

Hi,

When I try to run the FastTree wrapper from Bio.Phylo.Applications, it fails, giving me this message:

ApplicationError: Non-zero return code 1 from '/home/fetz/genome/phylosift_v1.0.1/bin/FastTree -nt -gtr -out 245_hypothetical_protein.tre 245_hypothetical_protein.codon', message 'Unknown or incorrect use of option -out'

My Biopython version is 1.68:

import Bio
Bio.__version__
'1.68'

My code, run in ipython3, is:

from Bio.Phylo.Applications import _Fasttree
fasttree_exe = r"/home/fetz/genome/phylosift_v1.0.1/bin/FastTree"
codonfile = "245_hypothetical_protein.codon"
outfile = codonfile.replace(".codon",".tre")
cmd = _Fasttree.FastTreeCommandline(fasttree_exe, nt = True, gtr = True, input = codonfile, out = outfile)
cmd()

The input file, codonfile, is a verified codon alignment produced with the Bio.codonalign module and written to a file. As far as I can tell from the example given in the API, my command is constructed correctly. Can anyone suggest what I am doing wrong? Thank you in advance.

biopython fasttree • 2.4k views

Taken from FastTree docs:

By default FastTree expects protein alignments, use -nt for nucleotides

and

-gtr -- generalized time-reversible model (nucleotide alignments only)

It seems that both options nt and gtr are for nucleotide alignments and your file is a codon alignment.

Try cmd = _Fasttree.FastTreeCommandline(fasttree_exe, input = codonfile, out = outfile) to see if it works.

[EDIT] This is based only on the assumption that your codon alignment is not a nucleotide alignment file.


Hi Rodrigo. Actually, my codon alignment is in nucleotides, not amino acids. In a codon alignment, you use a protein alignment to constrain the cognate nucleotide alignment. So I'm sure that the -gtr and -nt options are fine. Here is the head of the codon alignment file:

>ID:AMR59844.1 <unknown description>
ATGTTTCACCGTCCTGGGTTTTCAGCTTTAAACACCGATGTCTGTTGGGCTGAGTACGAG
CGGGTGAAGGAGTTCTTACCCGTAAATCCAAAACACATCAACGTGGGTACTATCGGGCGT
GTTGCGTTCGATGATACTCCGCTGAGTACGTGTATTAAAGCTGCGCTTGGTACGCTGCCG
GCGTCTATAGTGTTTGAG---------GAATTAAAA
>ID:PhiSPFM1_216 <unknown description>
ATGTTTCACCGTCCTGGGTTTTCAGCTTTAAACACCGATGTCTGTTGGGCTGAGTACGAG
CGGGTGAAGGAGTTCTTACCCGTAAATCCTAAACACATCAACGTGGGTACTATCGGGCGT
GTTGCGTTCGATGATACTCCGCTGAGTACGTGTATTAAAGCAGCGCTTGGTACGCTGCCT
GAGCCTCACCACCACGACTGGGAGGCAACTTTACCC

I think it might be a bug in the Biopython wrapper, since the FastTree (version 2.1.3 SSE3) options don't include an "out" option.


OK, I see, thanks for the info. FastTree actually does have an -out option:

FastTree -out tree protein_alignment

Here, tree is your output tree file and protein_alignment is your .fasta (in your case, a .codon file). Have you tried running FastTree -nt -gtr -out 245_hypothetical_protein.tree 245_hypothetical_protein.codon?


Hmmm. It definitely doesn't have an '-out' option in my version of FastTree (2.1.3 SSE3). For example, if I try:

FastTree -nt -gtr test.codon

This works fine and writes the tree to stdout:

FastTree Version 2.1.3 SSE3
Alignment: test.codon
Nucleotide distances: Jukes-Cantor Joins: balanced Support: SH-like 1000
Search: Normal +NNI +SPR (2 rounds range 10) +ML-NNI opt-each=1
TopHits: 1.00*sqrtN close=default refresh=0.80
ML Model: Generalized Time-Reversible, CAT approximation with 20 rate categories
Initial topology in 0.00 seconds
Refining topology: 4 rounds ME-NNIs, 2 rounds ME-SPRs, 2 rounds ML-NNIs
Total branch-length 0.109 after 0.00 sec
ML-NNI round 1: LogLk = -390.459 NNIs 0 max delta 0.00 Time 0.00
Turning off heuristics for final round of ML NNIs
GTR Frequencies: 0.2225 0.2319 0.2693 0.2763
GTR rates(ac ag at cg ct gt) 11.5787 1.2121 8.1377 3.9478 3.5543 1.0000
Switched to using 20 rate categories (CAT approximation)
Rate categories were divided by 0.649 so that average rate = 1.0
CAT-based log-likelihoods may not be comparable across runs
Use -gamma for approximate but comparable Gamma(20) log-likelihoods
ML-NNI round 2: LogLk = -377.465 NNIs 0 max delta 0.00 Time 0.01
Turning off heuristics for final round of ML NNIs (converged)
Optimize all lengths: LogLk = -377.465 Time 0.01
Total time: 0.02 seconds Unique: 2/23 Bad splits: 0/0
((0:0.0,22:0.0):0.05643,(1:0.0,2:0.0,3:0.0,4:0.0,5:0.0,6:0.0,7:0.0,8:0.0,9:0.0,10:0.0,11:0.0,12:0.0,13:0.0,14:0.0,15:0.0,16:0.0,17:0.0,18:0.0,19:0.0,20:0.0,21:0.0):0.05643);

However, if I add the '-out' option to the same command:

FastTree -nt -gtr test.codon -out test.out

I get no tree and a lecture on how to use FastTree:

  FastTree protein_alignment > tree
  FastTree -nt nucleotide_alignment > tree
  FastTree -nt -gtr < nucleotide_alignment > tree
FastTree accepts alignments in fasta or phylip interleaved formats

Common options (must be before the alignment file):
  -quiet to suppress reporting information
  -nopr to suppress progress indicator
  -log logfile -- save intermediate trees, settings, and model details
  -fastest -- speed up the neighbor joining phase & reduce memory usage
        (recommended for >50,000 sequences)
  -n <number> to analyze multiple alignments (phylip format only)
        (use for global bootstrap, with seqboot and CompareToBootstrap.pl)
  -nosupport to not compute support values
  -intree newick_file to set the starting tree(s)
  -intree1 newick_file to use this starting tree for all the alignments
        (for faster global bootstrap on huge alignments)
  -pseudo to use pseudocounts (recommended for highly gapped sequences)
  -gtr -- generalized time-reversible model (nucleotide alignments only)
  -noml to turn off maximum-likelihood
  -nome to turn off minimum-evolution NNIs and SPRs
        (recommended if running additional ML NNIs with -intree)
  -nome -mllen with -intree to optimize branch lengths for a fixed topology
  -cat # to specify the number of rate categories of sites (default 20)
      or -nocat to use constant rates
  -gamma -- after optimizing the tree under the CAT approximation,
      rescale the lengths to optimize the Gamma20 likelihood
  -constraints constraintAlignment to constrain the topology search
       constraintAlignment should have 1s or 0s to indicates splits
  -expert -- see more options
For more information, see http://www.microbesonline.org/fasttree/

I think it might be down to my version of FastTree or a bug in the wrapper.


The -out option is available in version 2.1.10, so it may just be a matter of updating your FastTree version. Also, options must come before the alignment file, so the correct command line would be FastTree -nt -gtr -out test.out test.codon.
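
To make the version dependence concrete, here is a minimal sketch that parses a banner line like the "FastTree Version 2.1.3 SSE3" line in the output above and compares it with 2.1.10. The function names and the KNOWN_GOOD constant are hypothetical. This thread only establishes that 2.1.3 lacks -out and 2.1.10 has it, so treating 2.1.10 as the minimum is a conservative assumption, not a documented cutoff.

```python
import re

# Version in which this thread confirmed that -out works. Using it as the
# minimum is an assumption: releases between 2.1.3 and 2.1.10 may support it too.
KNOWN_GOOD = (2, 1, 10)


def parse_fasttree_version(banner):
    """Return the version in a banner line as a tuple of ints, or None."""
    match = re.search(r"FastTree Version (\d+(?:\.\d+)*)", banner)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def known_to_support_out(banner):
    """True only if the banner reports a version at or above KNOWN_GOOD."""
    version = parse_fasttree_version(banner)
    return version is not None and version >= KNOWN_GOOD
```

Comparing tuples of integers avoids the string-comparison trap where "2.1.10" sorts before "2.1.3".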


Oops! Yes, you're right; I wrote the command wrong. Regardless, after updating to FastTree 2.1.10, the Biopython wrapper works! It was my ancient version of FastTree that was causing me trouble. Thank you for your help, Rodrigo!
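
For anyone who cannot upgrade, one possible workaround is to skip -out and save the tree that FastTree writes to stdout. The sketch below uses Python's subprocess module rather than the Biopython wrapper. Both helper names are hypothetical, and it assumes the executable path and flags used earlier in this thread. It also puts every option before the alignment file, as the usage text requires.

```python
import subprocess


def build_fasttree_args(exe, alignment, nt=False, gtr=False, out=None):
    """Build a FastTree argument list with every option before the alignment."""
    args = [exe]
    if nt:
        args.append("-nt")
    if gtr:
        args.append("-gtr")
    if out is not None:
        args += ["-out", out]  # only for FastTree versions that accept -out
    args.append(alignment)  # the alignment file goes last
    return args


def run_fasttree_to_file(exe, alignment, tree_path, nt=False, gtr=False):
    """Run FastTree without -out and save whatever it prints to stdout."""
    args = build_fasttree_args(exe, alignment, nt=nt, gtr=gtr)
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    with open(tree_path, "w") as handle:
        handle.write(result.stdout)
    return result.stderr  # whatever FastTree wrote to stderr
```

With check=True, a non-zero exit raises subprocess.CalledProcessError, which plays the same role as the ApplicationError in the original question.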


I know this is an old post, but do you happen to know if it is possible for input to take a file handle rather than a filename?
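
Nothing in this thread establishes whether the wrapper's input argument accepts a handle. A possible workaround, sketched below under that uncertainty, relies on the usage text quoted earlier (FastTree -nt -gtr < nucleotide_alignment > tree), which shows FastTree reading the alignment from stdin. The helper name is hypothetical, and it calls the program through subprocess instead of the Biopython wrapper.

```python
import subprocess


def run_with_alignment_handle(args, handle):
    """Send the text read from an open handle to a command's stdin.

    Returns the command's stdout, which for FastTree would be the tree.
    """
    result = subprocess.run(
        args, input=handle.read(), capture_output=True, text=True, check=True
    )
    return result.stdout


# Hypothetical usage, with the names from the question:
# with open("245_hypothetical_protein.codon") as handle:
#     newick = run_with_alignment_handle([fasttree_exe, "-nt", "-gtr"], handle)
```

Because the handle is read into a string first, this also works with in-memory handles such as io.StringIO, at the cost of holding the whole alignment in memory.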
