I'd like to create a dictionary of dictionaries for a series of mutated DNA strands, with each dictionary demonstrating the original base as well as the base it has mutated to.
To elaborate, what I would like to do is create a generator that allows one to input a specific DNA strand and have it crank out 100 randomly generated strands that have a mutation frequency of 0.66% (this applies to each base, and each base can mutate to any other base). Then, what I would like to do is create a series of dictionary, where each dictionary details the mutations that occured in a specific randomly generated strand. I'd like the keys to be the original base, and the values to be the new mutated base. Is there a straightforward way of doing this? So far, I've been experimenting with a loop that looks like this:
#yields a strand with an A-T mutation frequency of 0.066%
def mutate(string, mutation, threshold):
dna = list(string)
for index, char in enumerate(dna):
if char in mutation:
if random.random() < threshold:
dna[index] = mutation[char]
return ''.join(dna)
dna = "ATGTCGTACGTTTGACGTAGAG"
print("DNA first:", dna)
newDNA = mutate(dna, {"A": "T"}, 0.0066)
print("DNA now:", newDNA)
But I can only yield one strand with this code, and it only focuses on T-->A mutations. I'm also not sure how to tie the dictionary into this. could someone show me a better way of doing this? Thanks.
Hello abenabba!
It appears that your post has been cross-posted to another site: http://stackoverflow.com/questions/24050294
This is typically not recommended as it runs the risk of annoying people in both communities.
It might help if you can explain your over-all goal. The way you are doing it at the moment you'll only ever be able to mutate a given base to one other base (dictionaries can only take one value per key, although it can be a container). Likewise, the mutation dictionary you want to create for each sequence will ever contain key-value pairs from the dictionary you provide to "mutate"
If you can spell out your over all goals a little more clearly users might be able to suggest a different approach.