4.4 years ago
A_heath
Hi all,
I have a .txt file containing protein IDs (one per line), for instance:
Lactococcus_1763
Lactococcus_3492
Lactococcus_2391
I would like to subtract 20 from, and add 20 to, each ID, keeping the original ID in between, in order to have something like this:
Lactococcus_1743
Lactococcus_1763
Lactococcus_1783
Lactococcus_3472
Lactococcus_3492
Lactococcus_3512
Lactococcus_2371
Lactococcus_2391
Lactococcus_2411
Does anyone have a suggestion? If so, I would gladly take it!!
Thank you in advance
Have a great day
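For reference, an awk one-liner along the following lines produces the output requested above. It is a sketch modelled on the command quoted later in this thread (here with offset=20), and it assumes every ID has exactly one underscore followed by a plain number without zero padding. The file names input_file.txt and output_file.txt are the ones used later in the thread; the printf line only recreates the example input.

```shell
# Recreate the example input from the question (one ID per line).
printf '%s\n' Lactococcus_1763 Lactococcus_3492 Lactococcus_2391 > input_file.txt

# Split each ID on "_" so that $1 is the prefix and $2 is the number.
# For every input line, print ID-offset, the ID itself, then ID+offset.
awk -v FS='_' -v OFS='_' -v offset=20 '{
    print $1, $2 - offset
    print $0
    print $1, $2 + offset
}' input_file.txt > output_file.txt

cat output_file.txt
```

Changing the value passed with -v offset=... adjusts the distance without touching the awk program itself.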
Thank you so much!! It worked perfectly! It's so helpful ...
I have another, similar situation where this time I would like to extract the IDs of all 20 proteins on either side of a given ID. For instance, I have:
Lactococcus_1763
and I would like to have: Lactococcus_1743 ... Lactococcus_1759 Lactococcus_1760 Lactococcus_1761 Lactococcus_1762 Lactococcus_1763 Lactococcus_1764 Lactococcus_1765 Lactococcus_1766 Lactococcus_1767 ... Lactococcus_1783
If you have any ideas, please let me know
Anyways, thank you again so much for your help cschu181 !
You're welcome. Do you want to look up how a while loop in awk works and try to do it yourself? It is just a slight adjustment to the code, but it might not hurt to try.
You're right. I did try, as best I could, to write a while loop:
while read line; do awk -v OFS='_' -v FS='_' -v offset=1 '{print $1,$2-offset; print $0; print $1,$2+offset;}' input_file.txt >>output_file.txt
When setting offset=1 to print all IDs, I don't know how to set a limit of 20 protein IDs. Can you help me again?
Sure, the while loop needs to be inside the awk "script":
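A minimal sketch of what such a loop inside the awk program could look like is given here. It is not necessarily the exact command from this reply; it assumes the same ID layout as before (one underscore, plain numeric suffix) and reuses the input_file.txt and output_file.txt names from the attempt above. The counter variable i is just an illustrative name.

```shell
# Recreate a one-line example input.
printf '%s\n' Lactococcus_1763 > input_file.txt

# For each ID, start at ID-offset and print every consecutive ID
# up to and including ID+offset (2*offset + 1 lines per input ID).
awk -v FS='_' -v OFS='_' -v offset=20 '{
    i = $2 - offset
    while (i <= $2 + offset) {
        print $1, i
        i++
    }
}' input_file.txt > output_file.txt

cat output_file.txt
```

With offset=20 this prints 41 IDs per input line, from Lactococcus_1743 through Lactococcus_1783 for the example above, with the original ID in the middle.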
There is probably also a pure bash solution, using the while command as you tried, but I find that much too complex.
Well now that's perfect! The awk command is very useful. Thank you so much for all the help you provided. It's greatly appreciated. Have a nice day
Please accept the answer (green check mark) to provide closure to this thread.