What are some of the simplest measures of how "repetitive" a nucleotide sequence is? A simple one seems to be entropy relative to a random nucleotide probability distribution, assuming each base is independent, but this seems incorrect, especially since some repetitive sequences are highly structured and contain ORFs.
Related question: why are LINE1 elements, which have protein-coding regions, classified as repetitive elements? This seems incorrect because their sequence is not repetitive; it can't be very repetitive if it codes for protein.
Thanks for your help.
Thanks. Do you know what metrics are used to score sequences that are in fact repetitive?
Take a look here: http://www.repeatmasker.org/papers.html. As you can see, your question does not have a simple answer.
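To get a rough feel for the entropy idea raised in the question, here is a minimal Python sketch. It is only an illustration, not the scoring used by RepeatMasker or any other tool; the function names (`shannon_entropy`, `base_entropy`, `kmer_entropy`) and the choice of k = 4 are assumptions made for this example. It compares single-base entropy, which treats every base as independent, with entropy over overlapping k-mers, which is sensitive to local structure.

```python
import math
from collections import Counter


def shannon_entropy(counts):
    """Shannon entropy in bits of the empirical distribution given by counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def base_entropy(seq):
    """Entropy of the single-base composition (bases treated as independent)."""
    return shannon_entropy(Counter(seq))


def kmer_entropy(seq, k=4):
    """Entropy of the distribution of overlapping k-mers in seq.

    k=4 is an arbitrary illustrative choice, not a recommended value.
    """
    kmers = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return shannon_entropy(kmers)


# A perfect tandem repeat: the four bases are equally frequent, so the
# single-base entropy is at its maximum of 2 bits even though the
# sequence is as repetitive as it gets.
tandem = "ACGT" * 25
n_kmers = len(tandem) - 4 + 1

print("base entropy:      ", base_entropy(tandem))
print("4-mer entropy:     ", kmer_entropy(tandem, k=4))
print("4-mer entropy limit:", math.log2(n_kmers))  # if every 4-mer were distinct
```

The tandem repeat looks maximally "random" by base composition, but only four distinct 4-mers occur in it, so its k-mer entropy sits far below the limit for a sequence of that length. Note that neither measure says anything about interspersed repeats, where a long and internally complex sequence occurs in many copies across a genome; detecting those requires comparing the sequence against other sequences rather than against itself.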