2.3 years ago
M.
I have a text file that contains a single amino acid mutation on each line. I need to find out how many of them occur at the same location, and which location that is.
The content of the file looks like this:
D138Y
N450K
D614G
P681R
K1191N
A27V
E484A
S680F
I want my code to take the location of each mutation and, if that location occurs more than once in my file, report that specific location. I don't know how to extract numbers from text. Can you help me with that or tell me where to start?
Thanks in advance.
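One possible starting point in Python is a minimal sketch like the one below. It assumes every line is a simple substitution written as one letter, a number, and one letter (like `D614G`), so deletions, insertions, and stop codons would be skipped. The function name `repeated_positions` is made up for illustration, and the last two entries of `sample` (`E484K`, `D614N`) are not from the file above; they are added only so that some positions repeat.

```python
import re
from collections import Counter

# One substitution per line: reference residue, position, new residue (e.g. D614G).
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def repeated_positions(lines):
    """Return {position: count} for positions that appear on more than one line."""
    counts = Counter()
    for line in lines:
        match = MUTATION_RE.match(line.strip())
        if match:  # blank or malformed lines are skipped
            counts[int(match.group(2))] += 1
    return {pos: n for pos, n in counts.items() if n > 1}


# Sample lines from the question, plus two hypothetical ones (E484K, D614N)
# added only to demonstrate a repeat.
sample = ["D138Y", "N450K", "D614G", "P681R", "K1191N",
          "A27V", "E484A", "S680F", "E484K", "D614N"]
print(repeated_positions(sample))  # {614: 2, 484: 2}
```

To run it on a file, an open file handle can be passed in place of the list, for example `with open("mutations.txt") as handle: print(repeated_positions(handle))`, where `mutations.txt` is a placeholder file name.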
Hi, I am new to programming. Excuse me if I'm wrong, but isn't this for Linux?
yes, it is
I think you're looking for a simple Python function to do this task?
yes
I managed to get the locations of the mutations with this:
Now I am trying to say: if a specific location occurs more than once in my file, give me those lines of the file, or maybe the count of those lines.
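For that last step, one option is to group the lines by position and keep only the groups with more than one entry. This is a sketch rather than the code from the thread: `lines_by_repeated_position` is a hypothetical name, the `example` list is invented for illustration, and the pattern again assumes simple one-letter substitutions.

```python
import re
from collections import defaultdict


def lines_by_repeated_position(lines):
    """Map each position seen more than once to the mutations found there."""
    groups = defaultdict(list)
    for line in lines:
        mutation = line.strip()
        # Assumed format: one letter, digits, one letter (e.g. E484A).
        match = re.fullmatch(r"[A-Z](\d+)[A-Z]", mutation)
        if match:
            groups[int(match.group(1))].append(mutation)
    return {pos: muts for pos, muts in groups.items() if len(muts) > 1}


# Invented example data, not the poster's file.
example = ["D614G", "E484A", "P681R", "E484K", "P681H", "A27V"]
for position, mutations in sorted(lines_by_repeated_position(example).items()):
    print(position, len(mutations), mutations)
# 484 2 ['E484A', 'E484K']
# 681 2 ['P681R', 'P681H']
```

Each result carries both the matching lines and, through `len(mutations)`, their count, so either can be reported.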
+1 - great solution. Just want to add that, while this will perform as requested at the amino acid level, please keep in mind that anywhere from one to many different nucleotide changes may lead to each of these amino acid changes.
Is this for secondary mutation testing for protein kinases?
No, it is just about getting the specific locations that had several mutations. It's a mutation analysis you could say. I'm looking for mutations that occurred in some datasets (protein sequences) that are collected from all around the world. Not interested in what nucleotide changes cause the mutations. At least for now :) Thank you for your concern though!