Hello everybody! I have some questions to ask:
1. I have to generate a random DNA sequence of length 20 kb with equal base frequencies in Python. I tried to use this function:
from random import choice
def dna(length):
    DNA = ""
    for i in range(length):
        DNA += choice('atcg')  # each base is drawn independently
    return DNA
But it doesn't return equal frequencies for all the bases. Is there any way to do it? (not too complicated...)
2. I have to calculate the frequency of all the bases in a given file. But I've got a huge file, so I have to split it. How can I split the file, send each part to a function that calculates the frequency (I've already written it), and get back the real overall frequency?
Thanks!!!
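For the first question: `choice` draws each base independently, so the four frequencies are only equal on average and will differ slightly in any one sequence. If exactly equal counts are needed, one option is to build a list with the same number of each base and shuffle it. A minimal sketch, assuming "20KB" means 20,000 bases; `random_dna_equal` and its parameters are names made up for this example:

```python
import random
from collections import Counter

def random_dna_equal(length, alphabet="acgt", seed=None):
    """Return a shuffled sequence in which every base occurs equally often."""
    if length % len(alphabet) != 0:
        raise ValueError("length must be a multiple of the alphabet size")
    rng = random.Random(seed)  # seed is optional, only for reproducibility
    # Start with equal numbers of each base, then randomize the order.
    bases = list(alphabet * (length // len(alphabet)))
    rng.shuffle(bases)
    return "".join(bases)

seq = random_dna_equal(20000)  # assumption: 20 kb = 20,000 bases
counts = Counter(seq)          # every base should appear 5,000 times
```

If approximately equal frequencies are acceptable, the original `choice` approach is already fine; the deviation shrinks as the sequence gets longer.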
How did you assess that your function didn't return equal frequencies?
How large is "huge" for your file? Does it contain one enormous sequence or multiple sequences? How is your function written?
About the second question: my file contains only one sequence (a human chromosome). It doesn't matter how my function is written; the problem is how to split the file correctly. But anyway, this is my function:
Thank you so much! Can anybody explain the second question to me?
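For the second question: per-chunk frequencies cannot simply be averaged unless all chunks are the same size, so a common approach is to have each chunk return raw counts, add the counts up, and divide once at the end. A rough sketch, assuming the file is plain text or FASTA with at most one header line at the top; `count_bases` is a hypothetical stand-in for your own function (which was not shown), and `base_frequencies` and `chunk_size` are names made up for illustration:

```python
from collections import Counter

def count_bases(chunk):
    """Hypothetical stand-in for the poster's function: raw counts, not frequencies."""
    return Counter(chunk.lower())

def base_frequencies(path, chunk_size=1_000_000, bases="acgt"):
    """Read the file piece by piece and return the overall base frequencies."""
    totals = Counter()
    with open(path) as handle:
        # Assumption: an optional FASTA header (">...") sits on the first line.
        first = handle.readline()
        if not first.startswith(">"):
            totals.update(count_bases(first))
        while True:
            chunk = handle.read(chunk_size)  # at most chunk_size characters
            if not chunk:
                break
            totals.update(count_bases(chunk))
    # Newlines and other symbols (e.g. N) are ignored in the denominator.
    n = sum(totals[b] for b in bases)
    return {b: totals[b] / n for b in bases} if n else {}
```

It could be called as `base_frequencies("chr1.fa")`, where the file name is only a placeholder. Because a single base cannot be cut in half, it does not matter where the chunk boundaries fall; this would be different if you were counting dinucleotides or longer words.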
Please use ADD COMMENT or ADD REPLY to answer earlier posts, so that this thread remains logically structured and easy to follow.