8.9 years ago
I have a list of k-mers. I want to write a function that takes k-mers of the same length (5) and generates a PFM. The list looks like ["TTTTA","TTTGA"]. That is, I want to record how many times each of A/T/C/G is seen at each position across the input list. So far, I have this:
def make_pfm(sites, k=5):
    pmf1= []
    pfm = {"A":[1]*k, "T":[1]*k, "C":[1]*k, "G":[1]*k}
    return pmf1
The output that I'm looking for is like:
{'A': [1, 1, 1, 1, 2], 'C': [1, 2, 1, 1, 1], 'T': [2, 1, 2, 1, 1], 'G': [1, 1, 1, 2, 1]}
I would very much appreciate any help. Thanks!
This sounds a lot like homework...
Also, what is PFM? I'm going to guess that my Google result of "Pelvic Floor Muscle" is not accurate.
I tried the code below, but the output doesn't make sense. I see: {'A': [1, 1, 1, 1, 1, 1, 1], 'C': [1, 1, 1, 1, 1, 1, 1], 'T': [1, 1, 1, 1, 1, 1, 1], 'G': [1, 1, 1, 1, 1, 1, 1], (6482, 4427, 4626, 6087): 'AGTCCG\n'}. PFM is just what I call the Python dictionary, not pelvic floor muscle lol.
What do you think is wrong? Thanks!
def make_pfm(list_of_kmers, k=5):
    pfm = {"A":[1]*k, "T":[1]*k, "C":[1]*k, "G":[1]*k}
    for x in list_of_kmers:
        A = sum([x.count('A') for x in list_of_kmers])
        C = sum([x.count('C') for x in list_of_kmers])
        G = sum([x.count('G') for x in list_of_kmers])
        T = sum([x.count('T') for x in list_of_kmers])
        pfm[A, C, G, T] = x
    return pfm
print make_pfm(list_of_kmers, k=7)
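For what it's worth, the attempt above seems to go wrong in two places: x.count('A') totals the base over every k-mer in the whole list instead of per position, and pfm[A, C, G, T] = x adds a new tuple key to the dictionary rather than updating the four lists, which is where the stray (6482, 4427, 4626, 6087): 'AGTCCG\n' entry comes from. Below is a minimal sketch of a per-position count, written for Python 3. The pseudocount parameter, the .strip().upper() cleanup, and the choice to skip characters other than A/C/G/T are my assumptions, not something stated in the question.

```python
def make_pfm(kmers, k=5, pseudocount=1):
    """Count how often each base occurs at each position of equal-length k-mers.

    Every cell starts at `pseudocount` (1 here, mirroring the [1]*k
    initialisation in the question).
    """
    pfm = {base: [pseudocount] * k for base in "ACGT"}
    for kmer in kmers:
        # Assumption: input may carry trailing newlines (e.g. 'AGTCCG\n').
        kmer = kmer.strip().upper()
        if len(kmer) != k:
            raise ValueError("expected length %d, got %r" % (k, kmer))
        for position, base in enumerate(kmer):
            # Assumption: characters other than A/C/G/T are silently skipped.
            if base in pfm:
                pfm[base][position] += 1
    return pfm


print(make_pfm(["TTTTA", "TTTGA"]))
# {'A': [1, 1, 1, 1, 3], 'C': [1, 1, 1, 1, 1], 'G': [1, 1, 1, 2, 1], 'T': [3, 3, 3, 2, 1]}
```

The k argument has to match the actual k-mer length, so for 6-mers like 'AGTCCG' you would call make_pfm(list_of_kmers, k=6) rather than k=7.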