Solved: I stopped using clustalo and went back to mafft (which I had originally wanted to use), but at first it could not read my alignments. The cause was non-standard characters in my exon alignments ("!" and "?"). After converting those to dashes, mafft read the alignment properly and I was able to append my new species while maintaining the reading frame of my original MSA. For those who may need to do the same, I used (Windows, Cygwin):
$ mafft --addfull outgroup_species.fasta --keeplength prealigned_msa.fasta > combined_msa.fasta
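For the character clean-up step, here is a minimal Python sketch of one way to turn "!" and "?" into dashes before running mafft. The function name `sanitize_fasta_gaps` and the example records are made up for illustration; it assumes a plain FASTA file where header lines start with ">" and it leaves those headers untouched.

```python
def sanitize_fasta_gaps(lines, odd_chars="!?", gap="-"):
    """Replace non-standard gap characters in sequence lines; leave headers alone."""
    # Translation table mapping each unwanted character to the gap symbol.
    table = str.maketrans({c: gap for c in odd_chars})
    cleaned = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Header line: keep as-is, even if it contains "!" or "?".
            cleaned.append(line)
        else:
            cleaned.append(line.translate(table))
    return cleaned


# Hypothetical example records, not taken from my real data.
example = [">sp1", "ATG!!!CTC", ">sp2?", "ATG???GTC"]
print("\n".join(sanitize_fasta_gaps(example)))
```

To use it on a file, read the lines with `open(path)` and write the returned list back out, one entry per line.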
I need to align a new sequence to a pre-existing multiple sequence alignment. I know how to run a clustalo profile-profile alignment in which I treat my one new sequence as a separate alignment. But every time I run this, gaps get added between columns of the pre-existing MSA, and I need to avoid this because it ruins my reading frame.
Is there an option to simply not alter the first profile alignment at all?
Sample of my pre-aligned MSA (if I were looking at the first 4 exons):
>sp1
----ATGCTC---ATAT
>sp2
----ATGGTC---ATAT
>sp3
CCAT---------ATAT # These gaps are inserted to represent a missing exon
>sp4
CCATATGGTCCCC---- # Gaps needed to maintain the reading frame per exon
The sequence I want to add to the pre-aligned MSA (it has some extra bases, shown in parentheses, that need to be trimmed after alignment; all exons are included because this is a reference sequence):
>outgroup
CCATAT(T)GGTCCCCATAT(TCA)
Ideal output:
>sp1
----ATGCTC---ATAT
>sp2
----ATGGTC---ATAT
>sp3
CCAT---------ATAT
>sp4
CCATATGGTCCCC----
>outgroup
CCATATGGTCCCCATAT
I am not sure how to align the new sequence while maintaining the length of the MSA; if columns are added, the reading frame will be broken. I cannot convert the bases to amino acids either, because I need to work in nucleotides for future dN/dS calculations.
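Whichever tool is used, the requirement above can be checked afterwards: every original row must come out unchanged and all rows must have the same length. Below is a rough Python sketch of such a check, using the toy sequences from this post. The helpers `read_fasta` and `check_preserved` are hypothetical names written for this example, not part of any aligner. The comparison ignores letter case on the assumption that an aligner may change case on output.

```python
def read_fasta(text):
    """Parse FASTA text into a dict of name -> sequence (insertion order kept)."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def check_preserved(original, combined):
    """Return a list of problems; an empty list means the original rows are intact."""
    problems = []
    # All rows of the combined alignment should share one length.
    if len({len(seq) for seq in combined.values()}) != 1:
        problems.append("combined rows differ in length")
    for name, seq in original.items():
        if name not in combined:
            problems.append(f"{name}: missing from combined alignment")
        elif combined[name].upper() != seq.upper():
            problems.append(f"{name}: row changed")
    return problems


# Toy data from the post: the pre-aligned MSA and the ideal combined output.
prealigned = read_fasta(""">sp1
----ATGCTC---ATAT
>sp2
----ATGGTC---ATAT
>sp3
CCAT---------ATAT
>sp4
CCATATGGTCCCC----
""")
combined = dict(prealigned)
combined["outgroup"] = "CCATATGGTCCCCATAT"

print(check_preserved(prealigned, combined))
```

An empty list means no original row was altered; any inserted column would show up as a "row changed" entry for the affected species.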
Why have you asked the same question twice?
Desperate times. My original question led me to the clustalo answer, which still causes some problems for me. I deleted it anyway.