Hi everyone
I have a list of IDs, many of which are redundant (though not identical). For example, I have the following IDs:
Traes_1DL_9344DD117.1
Traes_1DL_9344DD117.2
Traes_1DL_9344DD117.3
Traes_1DL_9344DD117.4
Traes_1DL_BDDF1198A.1
Traes_1DL_BDDF1198A.2
Traes_1DL_BDDF1198A.3
Traes_1DL_BDDF1198A.4
Traes_1DL_BDDF1198A.5
Traes_1DL_C91636A55.1
Traes_1DL_C91636A55.2
Traes_1DL_C91636A55.3
Traes_1DL_C91636A55.4
Traes_1DL_C91636A55.5
Traes_1DL_C91636A55.7
Traes_1BL_43408C9B0.2
Traes_2AL_48B239DD6.1
Traes_2AL_48B239DD6.2
Traes_2AL_48B239DD6.3
Traes_2AL_48B239DD6.4
Traes_2AL_C00552444.1
Traes_2AL_C00552444.2
Traes_2AL_C00552444.3
Traes_2AS_146D16572.1
Traes_2AS_146D16572.2
Traes_2AS_146D16572.3
Traes_2AS_146D16572.4
Traes_2AS_146D16572.5
I am assuming that IDs which share the same text before the point (.) but differ after it, like
Traes_2AS_146D16572.1
Traes_2AS_146D16572.2
Traes_2AS_146D16572.3
Traes_2AS_146D16572.4
Traes_2AS_146D16572.5
have only one significant ID, i.e. Traes_2AS_146D16572.1, and I want to retain only that one and drop the rest. How can I do this programmatically?
In short, I want to extract only the IDs that end in .1. And if .1 is not present but .2 or .3 and so on is present, and the ID is unique (meaning its initial part differs from all the others), then I also want to print it. For example, Traes_1BL_43408C9B0.2 is the only version present for that entry, and I want to print it.
Thank you
the answer is
What if there is only version .2 of the entry in the input?
He said "i want to extract only id's which have .1 in last from all among."
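For reference, here is a minimal Python sketch that covers both cases discussed above: it keeps the .1 version when it exists, and otherwise keeps whatever version is present. It is only one possible reading of the question. In particular, it assumes that when several versions exist without a .1 (say only .2 and .3), the lowest-numbered one should be kept. The function name pick_representatives is made up for illustration.

```python
def pick_representatives(ids):
    """Keep one ID per base name: the lowest-numbered version present.

    Assumption (not stated in the question): if .1 is missing and several
    other versions exist, the lowest-numbered one is the one to keep.
    """
    best = {}  # base name -> (version number, full ID); dicts keep insertion order
    for raw in ids:
        full_id = raw.strip()
        if not full_id:
            continue  # skip blank lines
        base, _, version = full_id.rpartition(".")
        if base and version.isdigit():
            number = int(version)
        else:
            # No numeric ".N" suffix: treat the whole ID as its own base
            base, number = full_id, 0
        if base not in best or number < best[base][0]:
            best[base] = (number, full_id)
    # Output follows the order in which each base name first appeared
    return [full_id for _, full_id in best.values()]
```

With the IDs one per line in a file (the name ids.txt is hypothetical), calling pick_representatives(open("ids.txt")) returns the list of IDs to keep, in the order in which each entry first appears.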
Thanks nterhoeven,
I posted this as a separate question, but the moderator closed it and told me to search instead of posting a new question.
grep takes a pattern, so for 10,000 or so IDs I cannot run grep for each one individually; that's why I was looking for an awk command, as I mentioned earlier.
Any other suggestions are welcome...
thanks a lot
RV
You can use the --file option with grep to search for multiple patterns, but this can be quite slow if your FASTA file is large. A faster solution would be a Perl script using a hash and the "exists" function.
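The same idea can be sketched in Python, with a set playing the role of the Perl hash and the "in" test playing the role of "exists". This is not the Perl script itself, just an illustration of the lookup approach. It assumes the record ID is the first whitespace-delimited token after ">" in each FASTA header, and the function name filter_fasta is made up for this example.

```python
def filter_fasta(fasta_lines, wanted_ids):
    """Yield the lines of FASTA records whose ID is in wanted_ids."""
    wanted = set(wanted_ids)  # set membership is a constant-time lookup on average
    keep = False
    for line in fasta_lines:
        if line.startswith(">"):
            # Assumption: the ID is the first token after ">" in the header
            header = line[1:].split()
            keep = bool(header) and header[0] in wanted
        if keep:
            yield line
```

Each header costs one set lookup, so the FASTA file is read only once no matter how many IDs are in the list.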
BTW: you should use answers only if you are answering the question at the top. Other stuff (discussions, etc.) should go in comments.
Ok, thanks nterhoeven