3.2 years ago
jbt38
I have a number of existing multiple sequence nucleotide alignments from closely related taxa (two sister clades) and need to align these alignments to each other for analysis. Some of the alignments are homologous and some are not. I think the best approach is to cluster them all together to identify the homologous groups. I know how to do this for single sequences, but not for entire alignments.
How about just pooling all the sequences and clustering with e.g. cd-hit or vsearch?
Hi, I've run cd-hit to identify clusters and aligned them, but some contain fragmentary sequences with full-length counterparts that need to be merged, and there are too many to go through manually. So I'm looking to identify the pairs of clusters likely to be homologous and align them.
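For the pooling step, a minimal Python sketch might look like the following. It assumes the alignments are in FASTA format with `-` or `.` as gap characters, and it tags each sequence with the name of the alignment it came from using a `|` separator. The function names and the `alignment|sequence` header convention are made up for illustration; they are not part of cd-hit or vsearch.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def pool_alignments(alignments):
    """Pool several alignments into one ungapped FASTA string.

    alignments: dict mapping an alignment name to its FASTA text.
    Each output header is "<alignment name>|<sequence id>" (hypothetical
    convention) so cluster members can be traced back to their alignment.
    """
    out = []
    for name, text in alignments.items():
        for header, seq in read_fasta(text):
            # Remove alignment gap characters; clustering tools expect raw sequences.
            ungapped = seq.replace("-", "").replace(".", "")
            if ungapped:  # skip sequences that were gaps only
                seq_id = header.split()[0]
                out.append(f">{name}|{seq_id}\n{ungapped}")
    return "\n".join(out) + "\n"
```

The pooled text can be written to a file and given to the clustering tool of your choice. Afterwards the prefix before `|` tells you which original alignments ended up in the same cluster.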
Not sure I fully understand your question, but if you want to align two MSAs to each other (or even a sequence to an existing alignment), you should look for profile-to-profile alignment tools. Some that come to mind: t-coffee, muscle, clustal-omega, ...
See also here: Aligning One Protein Sequence With A Multiple Sequence Alignment
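As a rough illustration of a profile-to-profile run, here is a small Python wrapper around Clustal Omega. It assumes the binary is installed as `clustalo` and that your version accepts `--profile1`, `--profile2` and `-o` (check `clustalo --help`). The function names are invented for this sketch, and the other tools mentioned have their own, different options.

```python
import shutil
import subprocess


def profile_align_command(profile1, profile2, output):
    """Build the argument list for aligning two existing alignments to each other."""
    return [
        "clustalo",
        "--profile1", str(profile1),  # first existing alignment
        "--profile2", str(profile2),  # second existing alignment
        "-o", str(output),            # merged alignment is written here
    ]


def run_profile_align(profile1, profile2, output):
    """Run the profile-profile alignment; fails early if clustalo is not installed."""
    if shutil.which("clustalo") is None:
        raise RuntimeError("clustalo not found on PATH")
    subprocess.run(profile_align_command(profile1, profile2, output), check=True)
```

Keeping the command construction separate from the execution makes it easy to print and inspect the command before running it over many pairs of alignments.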
Hi, thanks. Yes, I have mafft in mind for the alignment step, but before that I'd like to cluster homologous pairs of alignments, because some are made of fragmentary sequences with full-sequence counterparts.
I see. Perhaps you can first run a simple BLAST and then run blastclust or similar on the results to get a rough clustering?
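If it helps, the rough clustering can also be done directly on the BLAST output. The sketch below is not blastclust; it is a simple single-linkage grouping (union-find) swapped in for illustration. It assumes tabular BLAST output (`-outfmt 6`, where the first four columns are query id, subject id, percent identity and alignment length) and sequence ids of the form `alignment|sequence`, which is a hypothetical naming convention. The thresholds are arbitrary placeholders; with fragmentary sequences you will probably need to lower `min_length`.

```python
from collections import defaultdict


def cluster_alignments(blast_lines, min_pident=80.0, min_length=100):
    """Group alignments that are linked by at least one good BLAST hit.

    blast_lines: iterable of tab-separated BLAST lines (outfmt 6).
    Returns a sorted list of sorted lists of alignment names.
    """
    parent = {}

    def find(x):
        # Return the representative of x's group, registering x if it is new.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Alignment name is the part before "|" (assumed id convention).
        q_aln = fields[0].split("|")[0]
        s_aln = fields[1].split("|")[0]
        find(q_aln)
        find(s_aln)
        if float(fields[2]) >= min_pident and int(fields[3]) >= min_length:
            union(q_aln, s_aln)

    groups = defaultdict(set)
    for aln in list(parent):
        groups[find(aln)].add(aln)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])
```

Each returned group is a set of alignments that are candidates for merging; any group with more than one member could then be passed to a profile alignment step.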