Question

Scoring Repetitive Sequences


13.0 years ago

User 9996 ▴ 840

What are some of the simplest measures of a "repetitive" nucleotide sequence? A simple one seems to be entropy relative to a random nucleotide probability distribution, assuming independence of each base, but this seems incorrect, especially since some repetitive sequences are highly structured and have ORFs.

Related question: why are LINE1 elements, which have protein-coding regions, classified as repetitive elements? This seems incorrect, because their sequence is not repetitive; it can't be very repetitive if it codes for protein.

Thanks for your help.
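To make the entropy idea concrete, here is a minimal Python sketch. The function name `shannon_entropy` and the toy sequence are illustrative assumptions, not taken from any repeat-finding tool. With `k=1` it computes the per-base entropy described above; with larger `k` it uses overlapping k-mers, which drops the assumption that bases are independent.

```python
import math
from collections import Counter

def shannon_entropy(seq, k=1):
    """Shannon entropy in bits of the overlapping k-mer distribution of seq.

    k=1 treats each base independently; larger k captures local structure.
    (Hypothetical helper for illustration only.)
    """
    seq = seq.upper()
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return 0.0
    total = len(kmers)
    counts = Counter(kmers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Toy example: a perfect tandem repeat of the unit ACGT.
tandem = "ACGT" * 10
print(shannon_entropy(tandem, k=1))  # per-base entropy
print(shannon_entropy(tandem, k=4))  # 4-mer entropy
```

On this toy tandem repeat the per-base entropy reaches the 2-bit maximum for four nucleotides, because all four bases are equally frequent, while the 4-mer entropy stays just under 2 bits out of a possible 8. That illustrates why single-base entropy alone can miss obvious repeats.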

genome sequence sequence repeats repeatmasker • 2.5k views


score 5 · Answer 1 · 2011-12-08


13.0 years ago

Sean Davis 27k

Sequences like LINE elements are repetitive because they are repeated in the genome, not (directly) because of an intrinsic property of a single LINE element.
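The distinction can be shown with a rough Python sketch that scores an element by how often its k-mers occur across a genome, rather than by its internal composition. The helper names, the toy genome, and the choice of `k=8` are all assumptions for illustration; this is not RepeatMasker's method.

```python
from collections import Counter

def kmer_counts(genome, k):
    """Count every overlapping k-mer in the genome string."""
    genome = genome.upper()
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))

def mean_copy_number(element, genome_counts, k):
    """Average genome-wide count of the element's k-mers.

    (Hypothetical helper: a crude stand-in for copy number.)
    """
    element = element.upper()
    kmers = [element[i:i + k] for i in range(len(element) - k + 1)]
    if not kmers:
        return 0.0
    return sum(genome_counts[km] for km in kmers) / len(kmers)

# Toy genome: one element with no internal repeats, inserted three times
# between homopolymer spacers.
element = "ATGGCCTTAGGC"
genome = ("A" * 10 + element + "C" * 10 + element +
          "G" * 10 + element + "T" * 10)

counts = kmer_counts(genome, k=8)
print(mean_copy_number(element, counts, k=8))
```

Here every 8-mer of the element is unique within the element itself, yet each one appears three times in the toy genome, so the element scores as repeated even though a single copy looks like ordinary sequence.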



Thanks. Do you know what metrics are used to score sequences that are in fact repetitive?

13.0 years ago by User 9996 ▴ 840


Take a look here: http://www.repeatmasker.org/papers.html. As you can see, your question does not have a simple answer.

13.0 years ago by Sean Davis 27k