Do your sequence description lines have spaces in them? In other words, is there extra text after the part that's treated as the sequence ID?
This works fine for me with sequence IDs only, but when there are spaces I see that same error. I'm not very familiar with the PHYLIP format, but it seems very cranky about whitespace.
One example.fasta that works:
>seq1
AACAAGGAAAGAATCGAACTCCAAAACTGACAAGAGCTGTGACAGGGACTAGGTCCGGATCTTAGGACGCCTTAAACTGGTGTACTATGCTGTCGTCTCT
>seq2
AACAATGAAAGAATCGAACTCCAAAACTGACAAGAGCTGTGACAGGGACTAGGTCCGGATCTTAGGACGCCTTGAACTGGTGTACTATGCTGTCGTCTCT
>seq3
AACAAGGAAAGAATCGAACTCCAAAACTGACAAGAGTGACAGGGACTAGGTCCGGATCTTAGGACGCCTTAAACTGGTGTACTATGCTGTCGTCTCT
>seq4
AACAATGAACGAATCGAACTCCAAAGCTGACAAGAGCTGTGACAGGTACTAGGTGCGGATCTTAGCACGCCTTGAACTGGTGTACTATGCTGTCATCTCT
Which gives this example.msa.fasta.phylip:
4 100
seq4 AACAATGAAC GAATCGAACT CCAAAGCTGA CAAGAGCTGT GACAGGTACT
seq2 AACAATGAAA GAATCGAACT CCAAAACTGA CAAGAGCTGT GACAGGGACT
seq1 AACAAGGAAA GAATCGAACT CCAAAACTGA CAAGAGCTGT GACAGGGACT
seq3 AACAAGGAAA GAATCGAACT CCAAAACTGA CAAGA---GT GACAGGGACT
AGGTGCGGAT CTTAGCACGC CTTGAACTGG TGTACTATGC TGTCATCTCT
AGGTCCGGAT CTTAGGACGC CTTGAACTGG TGTACTATGC TGTCGTCTCT
AGGTCCGGAT CTTAGGACGC CTTAAACTGG TGTACTATGC TGTCGTCTCT
AGGTCCGGAT CTTAGGACGC CTTAAACTGG TGTACTATGC TGTCGTCTCT
This example.fasta does not work:
>seq1
AACAAGGAAAGAATCGAACTCCAAAACTGACAAGAGCTGTGACAGGGACTAGGTCCGGATCTTAGGACGCCTTAAACTGGTGTACTATGCTGTCGTCTCT
>seq2
AACAATGAAAGAATCGAACTCCAAAACTGACAAGAGCTGTGACAGGGACTAGGTCCGGATCTTAGGACGCCTTGAACTGGTGTACTATGCTGTCGTCTCT
>seq3
AACAAGGAAAGAATCGAACTCCAAAACTGACAAGAGTGACAGGGACTAGGTCCGGATCTTAGGACGCCTTAAACTGGTGTACTATGCTGTCGTCTCT
>seq4 with extra stuff
AACAATGAACGAATCGAACTCCAAAGCTGACAAGAGCTGTGACAGGTACTAGGTGCGGATCTTAGCACGCCTTGAACTGGTGTACTATGCTGTCATCTCT
The example.msa.fasta.phylip then is:
4 100
seq4 with AACAATGAAC GAATCGAACT CCAAAGCTGA CAAGAGCTGT GACAGGTACT
seq2 AACAATGAAA GAATCGAACT CCAAAACTGA CAAGAGCTGT GACAGGGACT
seq1 AACAAGGAAA GAATCGAACT CCAAAACTGA CAAGAGCTGT GACAGGGACT
seq3 AACAAGGAAA GAATCGAACT CCAAAACTGA CAAGA---GT GACAGGGACT
AGGTGCGGAT CTTAGCACGC CTTGAACTGG TGTACTATGC TGTCATCTCT
AGGTCCGGAT CTTAGGACGC CTTGAACTGG TGTACTATGC TGTCGTCTCT
AGGTCCGGAT CTTAGGACGC CTTAAACTGG TGTACTATGC TGTCGTCTCT
AGGTCCGGAT CTTAGGACGC CTTAAACTGG TGTACTATGC TGTCGTCTCT
phyml gives me a similar error:
Check sequence 'seq4' length (expected length: 100, observed length: 104) [OTU 1].
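The 104 in that message can be reproduced by hand. Here is a minimal Python sketch, assuming the reader takes the name up to the first whitespace and counts every remaining non-space character as sequence (an assumption about the parsing, not something taken from the phyml source; `observed_length` is a made-up helper name):

```python
def observed_length(name_line, continuation_lines):
    """Count sequence characters the way a whitespace-delimited reader might."""
    # Assumption: the name ends at the first space, everything after is sequence.
    name, _, rest = name_line.partition(" ")
    residues = "".join(rest.split())
    # Later interleaved blocks carry no name, only sequence.
    for line in continuation_lines:
        residues += "".join(line.split())
    return name, len(residues)


name, length = observed_length(
    "seq4 with AACAATGAAC GAATCGAACT CCAAAGCTGA CAAGAGCTGT GACAGGTACT",
    ["AGGTGCGGAT CTTAGCACGC CTTGAACTGG TGTACTATGC TGTCATCTCT"],
)
print(name, length)  # seq4 104
```

Under that assumption the four letters of "with" are counted as sequence, which gives 100 + 4 = 104.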
I think this fits what you're seeing: in the original input you have M55008.1 (8 characters), a space, and then more text. In the .phylip the name gets truncated to ten characters total, so just one extra character survives after the space, and that confuses phyml.
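One way around this is to strip everything after the ID from each header before aligning and converting. Here is a minimal Python sketch, assuming it is acceptable to throw the description text away and that the IDs are still unique once it is gone (`strip_fasta_descriptions` is a made-up helper name, not part of any tool mentioned here):

```python
def strip_fasta_descriptions(lines):
    """Yield FASTA lines with each header cut at the first whitespace."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Keep ">" plus the ID only; drop any description after it.
            parts = line[1:].split(None, 1)
            line = ">" + (parts[0] if parts else "")
        yield line


example = [">seq4 with extra stuff\n", "AACAATGAAC\n"]
for cleaned in strip_fasta_descriptions(example):
    print(cleaned)
```

To clean a real file, pass an open file handle instead of the list and write the yielded lines to a new FASTA, then rerun the alignment and conversion on that.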
Thanks very much! Works like it should.