How to subtract a number from protein IDs?
4.4 years ago
A_heath ▴ 170

Hi all,

I have a .txt file containing protein IDs (one per line), for instance:

Lactococcus_1763
Lactococcus_3492
Lactococcus_2391

For each ID, I would like to print the ID minus 20, the ID itself, and the ID plus 20, to get something like this:

Lactococcus_1743
Lactococcus_1763
Lactococcus_1783
Lactococcus_3472
Lactococcus_3492
Lactococcus_3512
Lactococcus_2371
Lactococcus_2391
Lactococcus_2411

Does anyone have a suggestion? If so, I would gladly take it!!

Thank you in advance

Have a great day

4.4 years ago
cschu181 ★ 2.8k
awk -v OFS='_' -v FS='_' -v offset=20 '{print $1,$2-offset; print $0; print $1,$2+offset;}' your_file.txt
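For anyone who wants to try this on a small sample before running it on real data, here is a minimal sketch that wraps the same awk program in a shell function. The function name `flank_ids` is hypothetical, and the `sub()` call that strips trailing blanks or a Windows carriage return is an added assumption, not part of the original command; it only keeps stray whitespace from being echoed back on the unchanged middle line.

```shell
# flank_ids OFFSET [FILE...]: hypothetical helper name, not from the thread.
# For every ID of the form PREFIX_NUMBER, print the ID shifted down by OFFSET,
# the ID itself, and the ID shifted up by OFFSET.
# The body is in parentheses so its variables stay out of the calling shell.
flank_ids() (
    offset=$1
    shift
    awk -v OFS='_' -v FS='_' -v offset="$offset" '
        {
            sub(/[ \t\r]+$/, "")      # assumption: drop trailing blanks / CR
            print $1, $2 - offset     # ID shifted down
            print $1, $2              # original ID
            print $1, $2 + offset     # ID shifted up
        }' "$@"
)

# Example: three sample IDs on standard input, one with a trailing blank
printf 'Lactococcus_1763\nLactococcus_3492 \nLactococcus_2391\n' | flank_ids 20
```

Like the one-liner, this splits on every underscore, so it assumes the prefix itself contains no underscore.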

Thank you so much!! It worked perfectly! It's so helpful ...

I have another, similar situation: this time I would like to list every protein ID within 20 positions on either side of each ID. For instance, I have:

Lactococcus_1763

and I would like to have: Lactococcus_1743 ... Lactococcus_1759 Lactococcus_1760 Lactococcus_1761 Lactococcus_1762 Lactococcus_1763 Lactococcus_1764 Lactococcus_1765 Lactococcus_1766 Lactococcus_1767 ... Lactococcus_1783

If you have any ideas, please let me know

Anyway, thank you again so much for your help, cschu181!


You're welcome. Do you want to look up how a while loop in awk works and try to do it yourself? It is just a slight adjustment to the code, but it might not hurt to try.


You're right. I tried my best to write a while loop:

while read line; do awk -v OFS='_' -v FS='_' -v offset=1 '{print $1,$2-offset; print $0; print $1,$2+offset;}' input_file.txt >>output_file.txt

With offset=1 set to step through the IDs one at a time, I don't know how to set a limit of 20 protein IDs. Can you help me again?


Sure, the while loop needs to be inside the awk "script":

awk -v OFS='_' -v FS='_' -v offset=20 '{i=$2-offset; end=$2+offset; while (i<=end) { print $1,i; i++; }; }' your_file.txt

There is probably also a pure bash solution, using the while command as you tried, but I find that much too complex.
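For comparison, a shell-only version could look roughly like the sketch below. It is not from the thread: the function name `expand_ids` is hypothetical, the loop reads IDs from standard input, and it assumes each ID ends in `_` followed by a plain integer without leading zeros.

```shell
# expand_ids OFFSET: hypothetical helper name, not from the thread.
# Reads IDs of the form PREFIX_NUMBER from standard input and prints every ID
# from NUMBER-OFFSET to NUMBER+OFFSET, one per line.
# The body is in parentheses so its variables stay out of the calling shell.
expand_ids() (
    offset=$1
    # read's default word splitting drops spaces/tabs around the ID;
    # the "|| [ -n ... ]" part handles a last line without a newline
    while read -r id rest || [ -n "$id" ]; do
        [ -n "$id" ] || continue      # skip blank lines
        prefix=${id%_*}               # everything before the last "_"
        num=${id##*_}                 # everything after the last "_"
        i=$((num - offset))
        end=$((num + offset))
        while [ "$i" -le "$end" ]; do
            printf '%s_%s\n' "$prefix" "$i"
            i=$((i + 1))
        done
    done
)

# Example: one ID expanded by 2 on either side
printf 'Lactococcus_1763\n' | expand_ids 2
```

The example prints Lactococcus_1761 through Lactococcus_1765. Unlike the awk command, this splits at the last underscore, so a prefix that itself contains underscores is kept intact. For a file, call it as `expand_ids 20 < your_file.txt`.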


Well, now that's perfect! The awk command is very useful. Thank you so much for all the help you provided. It's greatly appreciated. Have a nice day!


Please accept the answer (green check mark) to provide closure to this thread.
