Quantifying Sequence Divergence In An Alignment
12.7 years ago

Hi all, I am dealing with multiple sequence alignments (nucleotide sequence, hundreds of them) at the moment. Essentially, I have two situations:

1) The sequences in the alignment are divergent, but a particular heptameric stretch of nucleotides that I am interested in is conserved. In that case, I can simply make a WebLogo and show that this heptameric site is conserved while the rest of the aligned region is not.

2) The sequences show little divergence and are almost identical. In that case, even though the heptameric site is conserved, it does not prove anything, since everything else is conserved too.

I am looking for a numerical parameter that quantifies the sequence divergence in a multiple sequence alignment. That way, I can rank my alignments, discard those with little sequence divergence, and focus on the rest.

Thanks

sequence • 3.3k views
12.7 years ago
Ari ▴ 120

Probably the best measure for this is the total tree length, that is, the sum of the branch lengths of a phylogenetic tree. Unless you know the phylogeny in advance (and thus only need to estimate the branch lengths), accurate methods to compute this may be too slow. For your needs, it should be fine to use a simple distance-based method to compute the tree and then get the tree length from that.
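
As a rough illustration of the distance-based route, here is a minimal sketch that runs neighbor joining on a pairwise distance matrix and returns only the total tree length. Neighbor joining is my assumption for the "simple distance-based method"; the function name `nj_tree_length` and the example matrix are made up for illustration. It relies on the fact that the two branches created when nodes i and j are joined always sum to d(i, j), so the tree itself never has to be stored.

```python
from itertools import combinations


def nj_tree_length(dist):
    """Sum of branch lengths of a neighbor-joining tree.

    `dist` is a symmetric matrix (list of lists) of pairwise distances.
    The name and interface are illustrative, not from any library.
    """
    n = len(dist)
    if n < 2:
        return 0.0
    # Dict-of-dicts copy so that nodes can be merged and removed.
    d = {i: {j: float(dist[i][j]) for j in range(n) if j != i} for i in range(n)}
    next_id = n
    total = 0.0
    while len(d) > 2:
        m = len(d)
        r = {i: sum(d[i].values()) for i in d}
        # Pick the pair that minimises the NJ criterion Q(i, j).
        i, j = min(
            combinations(sorted(d), 2),
            key=lambda p: (m - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]],
        )
        # The two branches from i and j to their new parent sum to d(i, j).
        total += d[i][j]
        # Distances from the new internal node to every remaining node.
        new_row = {
            k: 0.5 * (d[i][k] + d[j][k] - d[i][j]) for k in d if k not in (i, j)
        }
        del d[i], d[j]
        for k in d:
            del d[k][i], d[k][j]
            d[k][next_id] = new_row[k]
        d[next_id] = new_row
        next_id += 1
    # Two nodes remain; the branch between them closes the tree.
    a, b = d
    return total + d[a][b]


# Distances read off a four-leaf tree with branch lengths 1, 2, 3, 4 and 5.
example = [
    [0, 3, 8, 9],
    [3, 0, 9, 10],
    [8, 9, 0, 9],
    [9, 10, 9, 0],
]
print(nj_tree_length(example))  # 15.0
```

To rank alignments, one would compute this value per alignment from its distance matrix. Since tree length grows with the number of sequences, dividing by the number of sequences (or branches) is probably sensible when the alignments differ in size.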

12.7 years ago
ALchEmiXt ★ 1.9k

Maybe I misunderstand, but basically what you need to calculate is the distance matrix between all pairs of sequences, as is done for phylogenetic studies. The alignment programs you use should be able to give you those values.
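
If the alignment program does not report distances, they are easy to compute directly from the aligned sequences. The sketch below assumes the uncorrected p-distance (fraction of differing sites, skipping columns with gaps or ambiguous bases) and summarises an alignment by the mean over all pairs; the function names are illustrative, and a corrected distance such as Jukes-Cantor may be more appropriate for strongly divergent sequences.

```python
from itertools import combinations


def p_distance(a, b, skip="-.?N"):
    """Fraction of differing sites over columns where both sequences have a base."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in skip or y in skip:
            continue  # ignore gaps and ambiguous positions
        compared += 1
        diffs += x != y
    return diffs / compared if compared else float("nan")


def mean_pairwise_distance(seqs):
    """Average p-distance over all sequence pairs in one alignment."""
    dists = [p_distance(a, b) for a, b in combinations(seqs, 2)]
    dists = [d for d in dists if d == d]  # drop NaN from pairs with no shared sites
    return sum(dists) / len(dists) if dists else float("nan")


# Toy alignment; real input would come from a FASTA or Clustal file.
alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGA"]
print(mean_pairwise_distance(alignment))
```

Alignments with a mean close to zero are the nearly identical ones, where conservation of a short site says little.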
