Hello,
Could someone please explain this paragraph from McHardy et al., 2007:
Compositional sequence patterns.
For compositional feature analysis, we map a given piece of DNA sequence s to a higher-dimensional space of nucleotide patterns o = {o1, o2, ..., oq}, where o is defined by the pattern length w and the number of literals l. In this space, s is represented by the compositional input vector v = (a1, a2, ..., aq), where ai is the frequency of pattern oi in s. Input vectors are normalized by the total number of patterns for each sequence.
I specifically want to understand how they would generate o from a given pattern length w and number of literals l. How would this be applied to an example DNA sequence?
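To make the question concrete, here is one possible reading sketched in Python. It rests on assumptions the quoted paragraph does not confirm: that a "literal" is a fixed nucleotide (A, C, G or T), that the remaining w - l positions are wildcards (written `*` here), that literals may sit at any position in the pattern, and that patterns are counted in a sliding window of width w. The names `make_patterns` and `composition_vector` are made up for illustration. Under this reading, l = w gives the ordinary 4^w oligonucleotides of length w.

```python
from collections import Counter
from itertools import combinations, product

ALPHABET = "ACGT"

def make_patterns(w, l):
    """Enumerate o: every pattern of length w with l literals and w - l wildcards.

    Assumption: literals may occupy any l of the w positions.
    """
    patterns = []
    for positions in combinations(range(w), l):
        for letters in product(ALPHABET, repeat=l):
            pattern = ["*"] * w
            for pos, letter in zip(positions, letters):
                pattern[pos] = letter
            patterns.append("".join(pattern))
    return patterns

def composition_vector(s, w, l):
    """Return (o, v), where v[i] is the normalized frequency of o[i] in s."""
    patterns = make_patterns(w, l)
    counts = Counter()
    # Slide a window of width w over s, one base at a time.
    for start in range(len(s) - w + 1):
        window = s[start:start + w]
        if any(base not in ALPHABET for base in window):
            continue  # skip windows containing N or other ambiguity codes
        # Each window matches one pattern per choice of literal positions.
        for positions in combinations(range(w), l):
            pattern = ["*"] * w
            for pos in positions:
                pattern[pos] = window[pos]
            counts["".join(pattern)] += 1
    # Normalize by the total number of pattern occurrences counted in s.
    total = sum(counts.values())
    vector = [counts[p] / total if total else 0.0 for p in patterns]
    return patterns, vector

if __name__ == "__main__":
    o, v = composition_vector("ACGTAC", w=2, l=2)
    print(len(o))            # q = 16 patterns for w = 2, l = 2
    print(v[o.index("AC")])  # 0.4
```

For s = ACGTAC with w = 2 and l = 2, the five windows are AC, CG, GT, TA and AC, so the entry for AC is 2/5 = 0.4 and the whole vector sums to 1. With w = 3 and l = 2 the same sketch yields 3 x 16 = 48 patterns such as AC*, A*C and *AC. Whether this matches what the authors actually did is exactly what I am unsure about.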