How to cluster existing multiple sequence alignments to identify homologous clusters
3.1 years ago
jbt38 • 0

I have a number of existing multiple sequence alignments (nucleotide) from closely related taxa (two sister clades) and need to align these alignments to each other for analysis. Some of the alignments are homologous and some are not. I think the best approach is to cluster them all together to identify the homologous groups. I know how to do this for single sequences, but not for entire alignments.

clustering sequence fasta multiple alignment nucleotide • 2.1k views

How about just pooling all the sequences and clustering with e.g. cd-hit or vsearch?
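
A minimal sketch of the pooling step, assuming each alignment is an aligned FASTA file. The function names and the `alignment|sequence` ID convention are hypothetical choices made for illustration; gaps are stripped so the pooled file can be passed to a clustering tool such as cd-hit or vsearch.

```python
from pathlib import Path


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def pool_alignments(alignment_paths, out_path):
    """Write ungapped sequences from all alignments into one FASTA file.

    Each record is renamed to '<alignment>|<sequence>' (a hypothetical
    convention) so that clusters can be traced back to their alignment.
    Returns the number of sequences written.
    """
    written = 0
    with open(out_path, "w") as out:
        for path in alignment_paths:
            aln_id = Path(path).stem
            for header, seq in read_fasta(path):
                # Remove alignment gap characters before clustering.
                ungapped = seq.replace("-", "").replace(".", "")
                if not ungapped:
                    continue  # skip all-gap rows
                seq_id = header.split()[0]
                out.write(f">{aln_id}|{seq_id}\n{ungapped}\n")
                written += 1
    return written
```

Keeping the alignment name in every sequence ID means that, after clustering, any cluster containing sequences from two different alignments points to a candidate homologous pair.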

Hi, I've run cd-hit to identify clusters and aligned them, but some are fragment sequences with full-length counterparts that need to be merged, and there are too many to go through manually. So I'm looking to identify the pairs of clusters likely to be homologous and align them.

Not sure I fully understand your question, but if you want to align two MSAs to each other (or even a sequence to an existing alignment), you should look for profile-to-profile alignment tools. Some that come to mind: t-coffee, muscle, clustal-omega, ...

See also: Aligning One Protein Sequence With A Multiple Sequence Alignment
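
As a rough illustration of the profile-to-profile idea, here is a sketch that merges two existing alignments with Clustal Omega. It assumes the program is installed as `clustalo` and uses its `--p1`, `--p2` and `-o` options; the function names are hypothetical, and the options are worth checking against the help output of your installed version.

```python
import shutil
import subprocess


def build_profile_command(profile1, profile2, out_path, executable="clustalo"):
    """Build a Clustal Omega command that aligns two existing alignments.

    --p1 and --p2 take pre-aligned files; -o names the merged output.
    """
    return [executable, "--p1", profile1, "--p2", profile2, "-o", out_path]


def align_profiles(profile1, profile2, out_path):
    """Run the profile-profile alignment and return the output path."""
    cmd = build_profile_command(profile1, profile2, out_path)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} not found on PATH")
    # check=True raises CalledProcessError if the aligner fails.
    subprocess.run(cmd, check=True)
    return out_path
```

For example, `align_profiles("fragments.aln.fasta", "full_length.aln.fasta", "merged.aln.fasta")` would write the merged alignment, where the three file names are placeholders.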

Hi, thanks. Yeah, I have mafft in mind for the aligning task, but before that I'd like to cluster homologous pairs of alignments, because some are made of fragmentary sequences with full-sequence counterparts.

I see. Perhaps you can first run a simple blast and then run blastclust or similar on the results to get a rough clustering?
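
One way to turn all-vs-all BLAST hits into a rough grouping of alignments is sketched below. This is single-linkage grouping with a union-find structure, not what blastclust itself does. It assumes tabular BLAST output (`-outfmt 6`, where the first four columns are query ID, subject ID, percent identity and alignment length) and sequence IDs of the form `alignment|sequence`, which is a hypothetical naming convention. The identity and length thresholds are arbitrary placeholders.

```python
from collections import defaultdict


def group_alignments(blast_lines, min_identity=80.0, min_length=100):
    """Group alignment names that are linked by at least one good BLAST hit.

    blast_lines: iterable of tab-separated BLAST outfmt 6 lines.
    Returns a sorted list of sorted lists of alignment names.
    """
    parent = {}

    def find(x):
        # Find the group representative, compressing the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue
        q_aln = fields[0].split("|")[0]
        s_aln = fields[1].split("|")[0]
        identity, length = float(fields[2]), int(fields[3])
        # Register both alignments even if the hit is too weak to link them.
        find(q_aln)
        find(s_aln)
        if identity >= min_identity and length >= min_length:
            union(q_aln, s_aln)

    groups = defaultdict(set)
    for aln in list(parent):
        groups[find(aln)].add(aln)
    return sorted(sorted(members) for members in groups.values())
```

Each returned group with more than one member is a set of alignments that could then be merged with a profile alignment tool. Single linkage can chain unrelated alignments together through one spurious hit, so stricter thresholds or a minimum number of supporting hits may be needed.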
