8.1 years ago
ashkan
I have a text file in which there is a column containing IDs. Here is a small example of this column:
ENSG00000072803.13
ENSG00000163002.8
ENSG00000102221.9
ENSG00000072121.11
ENSG00000149532.11
ENSG00000134419.11
I want to get rid of the point and any number after it. So, for this example, I need something like this:
ENSG00000072803
ENSG00000163002
ENSG00000102221
ENSG00000072121
ENSG00000149532
ENSG00000134419
Do you guys know how to do that in Python?
It would be nice if you would follow up on your earlier questions before opening new threads. See for example a set of guidelines in this post: How To Ask Good Questions On Technical And Scientific Forums and https://www.ncbi.nlm.nih.gov/pubmed/21980280
Hi Ashkan,
And a simple command-line alternative:
cut -f1 -d '.' yourfile > yourfile.pointless
~ Best
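Since the question asked for Python, here is a minimal sketch of the same idea as the cut command: keep only the text before the first '.' on each line. It assumes the file holds one ID per line; the names strip_version and strip_versions are made up for illustration.

```python
def strip_version(gene_id):
    """Return the ID without its version suffix.

    'ENSG00000072803.13' becomes 'ENSG00000072803'; an ID without a '.'
    is returned unchanged.
    """
    return gene_id.strip().split(".", 1)[0]


def strip_versions(lines):
    """Yield the version-less ID for every line of a one-ID-per-line input."""
    for line in lines:
        yield strip_version(line)


# Small in-memory example using two IDs from the question.
example = ["ENSG00000072803.13\n", "ENSG00000163002.8\n"]
print("\n".join(strip_versions(example)))

# With a real file (the file names are placeholders):
# with open("yourfile") as src, open("yourfile.pointless", "w") as dst:
#     for gene_id in strip_versions(src):
#         dst.write(gene_id + "\n")
```

Like the cut command, this drops everything after the first '.' on the line, so it is only safe when the IDs are the only column.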
As usual, people don't ask clear questions, so if there is only one column then this solution may be fine.
It is not clear from the original question if there is only one column in the file or other things as well.
Hi genomax2, yes you are right (as always).
By the way, does this one vs. several columns situation have any impact on @Wouter's Python script or not?
And I guess the first thing that most people try to solve is the "example" that has been offered in the question.
My piece of code should work fine regardless of whether the file has one or multiple columns, but if there is only one column a tab will be appended to each line. Essentially, my script will modify the first column of the file and leave the rest of the file as it is, regardless of whether there is a rest.
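For readers following along, the behaviour described above (change only the first column, leave the rest alone) could look roughly like this. This is a sketch, not the script posted in the thread: the function name strip_first_column_version is made up, a tab-separated file is assumed, and unlike the script described above it does not append a tab when there is only one column.

```python
def strip_first_column_version(line, sep="\t"):
    """Remove the '.version' suffix from the first field only.

    All other fields are returned untouched. The separator is assumed
    to be a tab; pass sep to change it.
    """
    fields = line.rstrip("\n").split(sep)
    # Keep only the part of the first field before its first '.'.
    fields[0] = fields[0].split(".", 1)[0]
    return sep.join(fields)


# Two-column example: only the first ID loses its version.
print(strip_first_column_version("ENSG00000072803.13\tENSG00000072803.18\n"))
```

If the IDs sit in another column, changing the index 0 to that column's position should be enough.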
Dear Wouter, I have used your code on a two-column, tab-separated file, but it seems that it has only removed the point from the first column and left the second column as in the original lines!
Am I making a mistake in running your code?
ENSG00000072803 ENSG00000072803.18
ENSG00000163002 ENSG00000072803.13
ENSG00000163088 ENSG00000072803.33
ENSG00000163054
He mentioned that his script will modify only the first column. If you want to do it for all columns:
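One possible way to handle every column, given as a sketch and not necessarily the code originally posted here: strip the version from anything that looks like an Ensembl gene ID, wherever it appears on the line. The pattern (ENSG followed by digits, a '.', and more digits) and the function name strip_all_versions are assumptions for illustration.

```python
import re

# An Ensembl gene ID followed by a '.version' suffix, e.g. ENSG00000072803.13.
# The capture group keeps the ID; the suffix is discarded.
VERSIONED_ID = re.compile(r"(ENSG\d+)\.\d+")


def strip_all_versions(line):
    """Remove the version suffix from every ENSG ID found in the line."""
    return VERSIONED_ID.sub(r"\1", line.rstrip("\n"))


# Both columns lose their version; the separator is left as it was.
print(strip_all_versions("ENSG00000072803.13\tENSG00000072803.18\n"))
```

Matching on the ID pattern instead of splitting every field at '.' means other values with a point in them, such as decimal numbers, are left alone.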
Thank you (and WouterDeCoster), that works!
Yes, exactly. I assumed his IDs would be in the first column/field (but that can be adapted) and that removing '.' in the rest of the file wasn't desirable. So removing it only from the first column is a feature, not a bug ;) But as Goutham Atla demonstrates, it can be done easily.
It may.
That is why the mods are fighting this battle of making sure people realize that they need to ask clear questions (and/or do a minimal search for a solution before posting a new question).