Shannon Entropy for DNA
5.9 years ago
markic • 0

I have a genome and patterns of lengths 2, 4, 8, and 16. I want to calculate the entropy of each pattern in the genome. How do I calculate this? If you have any advice, please post it here. Also, is calculating an entropy for each pattern the right approach at all?

dna entropy • 4.6k views

You need to provide us with more detail. An example of your input would help. I have no idea what you mean by “patterns with 2, 4, 8, 16 lengths”.

In the meantime, you could try https://github.com/jrjhealey/bioinfo-tools/blob/master/Shannon.py


Hi,

My patterns are, for example: AC, GGCC, GAAAGGCG, GGACTAAATCCAGTTT, or some random sequences. I have 10 patterns of each length.

I have the E. coli genome in simple text format (not FASTA): AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATATCA ....

Now I want to find the entropy for each pattern in this genome.

My second question is how to interpret those values.

5.9 years ago

For Shannon entropy, see the first answer in each of these threads:

https://biology.stackexchange.com/questions/64368/how-to-determine-the-height-bits-in-a-sequence-logo
and
Shannon entropy of a DNA motif?

Presumably, you have a motif (pattern) in a position weight matrix format? The motif has its own entropy, regardless of any genome; it's just a measure of how much information is encoded in the motif.

[Image: motif logo]

Note the "bits" on the left of the logo; that's your information.
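
To make the "bits" concrete, here is a minimal sketch of the per-position calculation, assuming the motif is given as a matrix of base probabilities (one row per position, in the order A, C, G, T). The function names and the example matrix are made up for illustration, and the small-sample correction that some logo tools apply is left out.

```python
import math

def column_entropy(probs):
    """Shannon entropy in bits of one motif position."""
    # Terms with p == 0 contribute nothing, so they are skipped.
    return -sum(p * math.log2(p) for p in probs if p > 0)

def information_content(pwm):
    """Per-position information in bits: log2(4) minus the entropy."""
    return [math.log2(4) - column_entropy(col) for col in pwm]

# Hypothetical 3-position motif; each row is P(A), P(C), P(G), P(T).
pwm = [
    [1.0, 0.0, 0.0, 0.0],      # always A: fully conserved
    [0.5, 0.5, 0.0, 0.0],      # A or C with equal probability
    [0.25, 0.25, 0.25, 0.25],  # any base: no information
]

bits = information_content(pwm)
print(bits)
```

A fully conserved position gives the maximum of 2 bits, and a position where all four bases are equally likely gives 0 bits. Note that a single fixed string such as GGCC has every position fully conserved, so this measure only becomes informative once a pattern is summarised from several aligned sites.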

Or were you thinking of something different relative to a genome? E.g. enrichment of the motif in a genome.
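
If enrichment is what you are after, a rough sketch could look like the following. It assumes the genome is a plain string as in your example, counts overlapping matches on the forward strand only, and compares against a naive background where every base has probability 0.25. The helper names are hypothetical, and a real analysis would use the genome's actual base composition and both strands.

```python
def count_overlapping(genome, pattern):
    """Count occurrences of pattern in genome, allowing overlaps."""
    count = 0
    start = genome.find(pattern)
    while start != -1:
        count += 1
        # Move one base forward so overlapping matches are found too.
        start = genome.find(pattern, start + 1)
    return count

def expected_count(genome_length, pattern, base_prob=0.25):
    """Expected matches if every base were independent with equal probability."""
    positions = max(genome_length - len(pattern) + 1, 0)
    return positions * base_prob ** len(pattern)

# Short stretch taken from the genome text posted above.
genome = "AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATATCA"

for pattern in ["AC", "GGCC"]:
    observed = count_overlapping(genome, pattern)
    expected = expected_count(len(genome), pattern)
    print(pattern, observed, round(expected, 3))
```

An observed count well above the expected one suggests the pattern is over-represented, though for 16-mers the expected count is tiny even in a whole bacterial genome, so single hits need careful interpretation.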
