You'll need to show us some useful examples of your input data before we can help you very much.
If the problem is, as you say, that it fails because you're trying to build a multifasta from files that are themselves multifastas, first concatenate all of the contigs within each genome:
$ cat genome1.fasta | sed '1!{/^>.*/d;}' > genome1_concatenated.fasta
$ cat genome2.fasta | sed '1!{/^>.*/d;}' > genome2_concatenated.fasta
.
.
.
$ cat genomeN.fasta | sed '1!{/^>.*/d;}' > genomeN_concatenated.fasta
(You can loop this if you have too many to handle by hand.)
sed -i '1!{/^>.*/d;}' genome1.fasta will edit the file in place if you prefer to do that.
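A loop over the files could look something like the sketch below. It assumes your inputs match genome*.fasta in the current directory, as in the example names above; the function name concat_all is just made up for illustration. It skips any *_concatenated.fasta files left over from an earlier run, so they don't get processed twice.

```shell
# Sketch: collapse every multi-contig FASTA in the current directory
# into a single-record FASTA. The genome*.fasta pattern is an assumption
# based on the example file names; adjust it to match your own files.
concat_all() {
  for f in genome*.fasta; do
    [ -e "$f" ] || continue    # nothing matched: the glob stays literal
    case "$f" in
      *_concatenated.fasta) continue ;;    # skip output from an earlier run
    esac
    # Keep line 1 (the first header) and drop every later header line.
    sed '1!{/^>.*/d;}' "$f" > "${f%.fasta}_concatenated.fasta"
  done
}

concat_all
```

The ${f%.fasta} expansion strips the extension, so genome1.fasta becomes genome1_concatenated.fasta.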
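If you're interested, this is what the command is saying:
On every line except the first (1!{...}), if the line begins with a ">" (^>) followed by any number of any characters (.*), delete that line (d).
Note that 1! refers to line 1 of the file, not to the first header it finds, so this assumes the file starts with its first header line.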
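To see the effect for yourself, you can run the same sed expression on a toy two-contig fasta. The contig names and sequences here are made up for illustration:

```shell
# Toy input: two contigs, each with its own header.
demo_out=$(printf '>contig1\nACGT\n>contig2\nTTGA\n' | sed '1!{/^>.*/d;}')
printf '%s\n' "$demo_out"
# prints:
# >contig1
# ACGT
# TTGA
```

The second header is gone and its sequence now sits under the first one.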
Hopefully it's obvious that this means all your sequences will now be under whatever fasta header the first sequence in that fasta had. You can edit this yourself if you want something else.
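For example, if you would rather have each genome labelled with a name of your choosing instead of the first contig's header, one alternative (not the sed command above, just a sketch) is to write a new header and then append every non-header line. The helper name rename_and_concat is hypothetical:

```shell
# Hypothetical helper: rename_and_concat <input.fasta> <new_name>
# Prints a single record called <new_name> that holds all of the
# sequence lines from the input, in their original order.
rename_and_concat() {
  printf '>%s\n' "$2"    # the new header
  grep -v '^>' "$1"      # every line that is not a header
}

# Usage:
#   rename_and_concat genome1.fasta genome1 > genome1_concatenated.fasta
```

Having one distinct, short name per genome tends to make the final alignment and tree easier to read.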
Then, concatenate the concatenated files:
$ cat *_concatenated.fasta > all_genomes.fasta
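It's worth a quick sanity check before aligning. The sketch below prints each record's name and total sequence length, so you should see exactly one line per genome. One thing to watch for: if an input file does not end with a newline, cat will glue its last sequence line onto the next file's header, and that genome will silently disappear from the list. The function name fasta_lengths is made up for illustration.

```shell
# Sketch: print "<record name><TAB><sequence length>" for each record.
fasta_lengths() {
  awk '
    /^>/ {                                    # header line: start a new record
      if (name != "") print name "\t" len     # report the previous record
      name = substr($0, 2); len = 0
      next
    }
    { len += length($0) }                     # sequence line: add its length
    END { if (name != "") print name "\t" len }
  ' "$1"
}

# Usage:
#   fasta_lengths all_genomes.fasta
```

If the number of lines printed doesn't match the number of genomes you started with, go back and check the individual files.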
And then do your alignments.
A word to the wise, though: if you're trying to align whole genomes, Clustal and MUSCLE aren't up to the task.
Do you already have several genomes in several fasta files? If you do, or once you get them, use cat to concatenate the files.
I have already used cat to concatenate them, but it does not work, I think because each fasta file I have contains several contigs. If you have another suggestion for getting a PHYLIP-format file of these complete genomes, or a method I can use to get the phylogenetic tree, please let me know. Thank you.