Hi guys, I want to extract only a specific value from a dataframe in Python.
For example:
MONDO:MONDO:0014405,MedGen:C4014722,OMIM:615934,Orphanet:ORPHA425120|MedGen:CN169374
I want only the OMIM:615934 part of this row.
This is my first attempt:
    import re

    for i in my_data.ClinVar_CLNDISDB:
        if "OMIM:" in i:
            select = re.compile(r'^(.+?),')
            print(select.findall(str(i)))
But the output is everything up to the first comma,
like this: MONDO:MONDO:0014405
How can I change my code to reach my aim? Thank you!
Why do you think the regex
'^(.+?),'
only gives you that result? What do you suppose you may need to change about it? More generally, think about which distinguishing characteristics of the string and its immediate surroundings would allow you to extract it.
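To make the hint concrete, here is a small sketch that uses only the sample string from the question. It shows why the original pattern stops where it does, and one possible alternative that anchors on the literal "OMIM:" prefix. The pattern OMIM:\d+ assumes the identifier is always made of digits, which holds for the sample but is not confirmed for the whole column.

```python
import re

row = ("MONDO:MONDO:0014405,MedGen:C4014722,OMIM:615934,"
       "Orphanet:ORPHA425120|MedGen:CN169374")

# '^' anchors the match at the start of the string and '.+?' is lazy,
# so the capture ends at the first comma it can reach.
first_field = re.findall(r'^(.+?),', row)

# Anchoring on the literal prefix targets the wanted field instead.
# Assumption: an OMIM ID is "OMIM:" followed by digits only.
omim_ids = re.findall(r'OMIM:\d+', row)

print(first_field)  # ['MONDO:MONDO:0014405']
print(omim_ids)     # ['OMIM:615934']
```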
Are you certain you even need to use a regex here? Your string is comma-delimited, in particular around the field you want, so there will be simpler ways to achieve what you want.
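As one possible non-regex route, the sketch below splits the string on its delimiters and keeps the field that starts with "OMIM:". The helper name extract_omim is hypothetical, and treating "|" as a second separator is an assumption based on the sample row.

```python
def extract_omim(value):
    """Return the first 'OMIM:...' field in the string, or None if absent."""
    # Assumption: '|' separates groups and ',' separates fields within a group,
    # so both are treated as field delimiters here.
    for field in str(value).replace("|", ",").split(","):
        if field.startswith("OMIM:"):
            return field
    return None


row = ("MONDO:MONDO:0014405,MedGen:C4014722,OMIM:615934,"
       "Orphanet:ORPHA425120|MedGen:CN169374")
print(extract_omim(row))  # OMIM:615934
```

On the dataframe from the question, this could be applied to the whole column with my_data.ClinVar_CLNDISDB.apply(extract_omim), which leaves None for rows that have no OMIM entry.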