How can I detect mutations that occur more than once at the same location with Python?
1
0
Entering edit mode
2.3 years ago
M. ▴ 40

I have a text file that contains a single amino acid mutation on each line. I need to find out which locations are mutated more than once, and how many mutations occur at each of them.

The content of the file looks like this:

D138Y
N450K
D614G
P681R
K1191N
A27V
E484A
S680F

So I want my code to take the location of each mutation and, if that location occurs more than once in my file, report it. I don't know how to extract numbers from text. Can you help me with that or tell me where to start?

Thanks in advance.

2.3 years ago
 grep -Eo '[0-9]+' input.txt | sort | uniq -d
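
For readers without a Unix-like shell, the same three steps (extract the runs of digits, sort, keep the values that repeat) can be sketched in Python. This is only a minimal sketch: the function name `duplicated_positions` and the example list are made up for illustration, and unlike `sort` in the shell it orders the positions numerically rather than as text.

```python
import re
from collections import Counter

def duplicated_positions(lines):
    """Return the positions seen more than once, as sorted integers."""
    # Like grep -Eo '[0-9]+': collect every run of digits on every line.
    positions = [int(m) for line in lines for m in re.findall(r"[0-9]+", line)]
    counts = Counter(positions)
    # Like sort | uniq -d: report each repeated value once, in sorted order.
    return sorted(pos for pos, n in counts.items() if n > 1)

# Made-up input for illustration (not the poster's data); two positions repeat.
example = ["D614G", "E484A", "E484K", "D614N", "A27V"]
print(duplicated_positions(example))  # [484, 614]
```

Passing an open file object instead of a list should work the same way, since the function only iterates over lines.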

Hi, I am new to programming. Excuse me if I'm wrong, but isn't this for Linux?


but isn't this for Linux?

yes, it is


I think you're looking for a simple Python function to do this task?


yes


I managed to get the locations of the mutations with this:

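    import re

    with open('myfile.txt', 'r') as file_one:
        for line in file_one:
            # The first run of digits on the line is the mutation location.
            SNP = re.search(r"[0-9]+", line)
            if SNP is None:
                continue  # skip lines without a number, e.g. blank lines
            location = SNP.group(0)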

Now I am trying to say: if a specific location occurs more than once in my file, give me those lines of the file, or maybe the count of those lines.
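
One possible way to get there is to collect the mutations in a dictionary keyed by location and then keep only the locations that hold more than one entry. The following is a sketch, not a definitive solution: the function name `group_by_location` and the example lines are hypothetical, and it assumes that the first run of digits on each line is the location.

```python
import re
from collections import defaultdict

def group_by_location(lines):
    """Map each repeated location to the list of mutations found there."""
    by_location = defaultdict(list)
    for line in lines:
        mutation = line.strip()
        match = re.search(r"[0-9]+", mutation)
        if match is None:
            continue  # blank line or no number: nothing to record
        by_location[int(match.group(0))].append(mutation)
    # Keep only the locations that were seen more than once.
    return {loc: muts for loc, muts in by_location.items() if len(muts) > 1}

# Made-up lines for illustration, shaped like the lines of the input file.
example_lines = ["D614G\n", "E484A\n", "E484K\n", "\n", "D614N\n", "A27V\n"]
for location, mutations in sorted(group_by_location(example_lines).items()):
    # Print the location, how many lines hit it, and the lines themselves.
    print(location, len(mutations), mutations)
```

With a real file, the open file object from `with open('myfile.txt', 'r') as file_one:` could be passed in place of `example_lines`, because the function only needs something it can loop over line by line.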


+1 - great solution. Just want to add that, while this will perform as requested at the amino acid level, please keep in mind that each of these amino acid changes may result from one, a few, or many different nucleotide changes.

Is this for secondary mutation testing for protein kinases?


No, it is just about finding the specific locations that had several mutations. It's a mutation analysis, you could say. I'm looking for mutations that occurred in some datasets (protein sequences) collected from all around the world. I'm not interested in which nucleotide changes cause the mutations. At least for now :) Thank you for your concern though!
