Sequence identity between sequences with different lengths
5.8 years ago

Hello,

A simple question: what is the sequence identity between two sequences when one is much longer than the other?

Example:

seq1:  -------------------AGTGTGAAAAAGGT----------------
seq2:  ATATATGCGCATGGTAATAAGTGTGAAAAAGGTTATATGCGCATAAGGT

The shorter sequence matches a substring of the longer one exactly. Do they have 100% identity? Or rather something like 30%, since seq1 covers only about 30% of seq2?

I ask because I am filtering an alignment of two assemblies of the same genome (with nucmer/mumer), and I can filter out aligned contigs based on identity.

Thank you,
Ricardo

sequence identity alignment filter mummer • 1.9k views

I would say that, from seq1's side, it has 100% identity over 100% of its length; from seq2's side, it has 100% identity over about 30% of its length. It's a matter of point of view.


I would say seq1 is 100% identical to seq2, while seq2 is only about 30% identical to seq1.

Unfortunately, this depends heavily on how you look at it.
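
To make the two points of view concrete, here is a minimal Python sketch that separates identity (matches over the aligned columns) from coverage (the fraction of each sequence that takes part in the alignment). The function name `identity_and_coverage` is made up for this illustration, and the definitions are one common convention rather than what any particular tool reports.

```python
# Gapped alignment copied from the question ('-' marks a gap).
seq1 = "-------------------AGTGTGAAAAAGGT----------------"
seq2 = "ATATATGCGCATGGTAATAAGTGTGAAAAAGGTTATATGCGCATAAGGT"


def identity_and_coverage(a, b, gap="-"):
    """Return (identity %, coverage of a %, coverage of b %) for two aligned rows.

    Hypothetical helper: identity is counted only over columns where both
    rows carry a base; coverage is that column count divided by the
    ungapped length of each sequence.
    """
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    # Columns where both rows carry a base, i.e. the aligned region.
    aligned = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    matches = sum(x == y for x, y in aligned)
    # Ungapped lengths of the two sequences.
    len_a = sum(x != gap for x in a)
    len_b = sum(y != gap for y in b)
    identity = 100.0 * matches / len(aligned) if aligned else 0.0
    return identity, 100.0 * len(aligned) / len_a, 100.0 * len(aligned) / len_b


identity, cov1, cov2 = identity_and_coverage(seq1, seq2)
print(f"identity over aligned columns: {identity:.1f}%")
print(f"coverage of seq1: {cov1:.1f}%")
print(f"coverage of seq2: {cov2:.1f}%")
```

For this example the identity is 100%, seq1 is fully covered, and seq2 is covered for 14 of its 49 bases, which is where the "about 30%" comes from. Filtering on identity alone would keep this pair; filtering on coverage of the longer sequence would not.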


Great, that's it, thanks! It depends on which sequence is the query and which is the reference. (If you write it as an answer instead of a comment, I'll accept it.)


It also depends on whether you use global or local alignment.
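
A rough sketch of why that matters, reusing the alignment from the question: if it is read as a global (end-to-end) alignment, the terminal gaps count as columns in the denominator; if it is read as a local alignment, only the region where both sequences are aligned counts. The helper names `global_identity` and `local_identity` are hypothetical, and real tools differ in exactly which columns they count, so treat this as an illustration under those assumptions.

```python
# Gapped alignment copied from the question ('-' marks a gap).
seq1_row = "-------------------AGTGTGAAAAAGGT----------------"
seq2_row = "ATATATGCGCATGGTAATAAGTGTGAAAAAGGTTATATGCGCATAAGGT"


def global_identity(a, b):
    """Matches divided by all alignment columns, gap columns included."""
    matches = sum(x == y and x != "-" for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def local_identity(a, b):
    """Matches divided by the columns spanning the locally aligned region.

    The region runs from the first to the last column in which both rows
    carry a base, so terminal gaps are ignored but internal gaps still count.
    """
    both = [i for i, (x, y) in enumerate(zip(a, b)) if x != "-" and y != "-"]
    if not both:
        return 0.0
    start, end = both[0], both[-1] + 1
    matches = sum(x == y and x != "-" for x, y in zip(a[start:end], b[start:end]))
    return 100.0 * matches / (end - start)


g = global_identity(seq1_row, seq2_row)
l = local_identity(seq1_row, seq2_row)
print(f"global-style identity: {g:.1f}%")
print(f"local-style identity:  {l:.1f}%")
```

With these definitions the global-style figure is 14 matches over 49 columns, and the local-style figure is 14 matches over 14 columns.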
