How to align multiple genomic regions simultaneously
7.3 years ago
liaoyunshi • 0

Many thanks for reading my question.

I recently wanted to align multiple sequences from different genomic regions together with whole-genome sequences for a subsequent phylogenetic study, but none of the tools I know can do what I need. Can anyone offer some suggestions? Thanks a lot.

For example, say the whole genome consists of 3 regions: A, B, and C. I have some complete genome sequences, plus some sequences from region A, some from region B, and some from region C. I want to align all 4 kinds of sequences simultaneously, in one alignment operation (I do not want to align A, B, and C to the whole-genome sequences separately). I have tried many multiple alignment tools, but none of them can do this, so I wonder if anyone can help me solve this problem.

Thanks!

alignment phylogenetics • 2.3k views
I don't really get it. Is the problem that the aligner takes one read and maps it? That is how it's done: the alignment of each read is independent of the other reads.

Thanks for your reply. Actually, I want to do a multiple sequence alignment of "reference + A + B + C" at the same time.

liaoyunshi : As stated this question is not clear. Are you referring to pair-wise alignments or multiple sequence alignments? Those two are different things. Sounds to me like you want to do a multiple sequence alignment of "reference + A + B + C" at the same time. Is that correct?

Yes, you are right. I want to do an MSA of all sequences at the same time.

Any MSA program should be able to do that. Is the reference very long compared to A/B/C?

Not very long: the reference is about 10 times as long as A/B/C. But I think the problem is that the MSA program will try to align A/B/C in partly overlapping columns, when in fact they should be completely separate because they come from different regions of the genome.
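
To make the desired layout concrete, here is a minimal Python sketch of what "totally separated" columns would look like. Everything in it is an assumption for illustration: the toy 20-base reference, the fragment names, the start positions, and the helper `place_on_reference` are hypothetical, and the fragments are assumed to match the reference without insertions or deletions.

```python
def place_on_reference(fragment: str, start: int, ref_len: int, gap: str = "-") -> str:
    """Pad a fragment with gaps so it occupies reference columns start..start+len-1."""
    if start < 0 or start + len(fragment) > ref_len:
        raise ValueError("fragment does not fit inside the reference coordinates")
    right_pad = ref_len - start - len(fragment)
    return gap * start + fragment + gap * right_pad


# Toy "whole genome" of 20 bases; regions A, B, C are assumed to be
# columns 0-6, 7-13 and 14-19 (hypothetical boundaries).
reference = "ACGTACGTTTGACCAGGTCA"

rows = {
    "ref": reference,
    "A_frag": place_on_reference("ACGTACG", 0, len(reference)),
    "B_frag": place_on_reference("TTTGACC", 7, len(reference)),
    "C_frag": place_on_reference("AGGTCA", 14, len(reference)),
}

# Every row has the same length, and A/B/C never share a column.
for name, row in rows.items():
    print(f"{name:<7}{row}")
```

In this layout each fragment row is gapped everywhere outside its own region, which is the result the single alignment run would need to produce.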

If you know that they come from different regions of the genome then why do you want to align them at the same time?

Because I have a huge number of sequences from GenBank. I only know that they come from different regions; I don't know exactly which region each sequence belongs to. So it is hard to separate them up front. Instead, I need to do one large full alignment first, which can tell me the region of each sequence.
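
The "which region does each sequence belong to" step can also be sketched on its own, without a full MSA. The Python below is only a toy illustration under stated assumptions: it uses exact k-mer matches on the forward strand and votes for the most common offset between fragment and reference, which stands in for a real pairwise local aligner. The reference, the region boundaries, the fragment names, and the helper functions are all hypothetical.

```python
from collections import Counter, defaultdict


def index_kmers(reference: str, k: int) -> dict:
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def locate(fragment: str, index: dict, k: int):
    """Return the best-supported reference start offset of the fragment, or None."""
    votes = Counter()
    for j in range(len(fragment) - k + 1):
        for i in index.get(fragment[j:j + k], ()):
            votes[i - j] += 1  # matches on the same diagonal share one offset
    if not votes:
        return None
    return votes.most_common(1)[0][0]


def assign_region(position, regions: dict):
    """Pick the region whose half-open [start, end) interval contains position."""
    if position is None:
        return None
    for name, (start, end) in regions.items():
        if start <= position < end:
            return name
    return None


# Hypothetical toy data: a 20-base reference split into three regions.
reference = "ACGTACGTTTGACCAGGTCA"
regions = {"A": (0, 7), "B": (7, 14), "C": (14, 20)}
fragments = {"seq1": "TTTGACC", "seq2": "AGGTCA", "seq3": "ACGTACG"}
K = 4

index = index_kmers(reference, K)
assignments = {}
for name, seq in fragments.items():
    offset = locate(seq, index, K)
    # Use the fragment midpoint so a small overhang does not change the call.
    midpoint = None if offset is None else offset + len(seq) // 2
    assignments[name] = assign_region(midpoint, regions)

print(assignments)
```

Once each sequence has a region label, the sequences can be grouped by region before alignment. Real GenBank sequences would have mismatches, indels, and possibly reverse-complemented entries, so a proper pairwise search against the reference would be needed in place of exact k-mer voting.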
