Replace two or more spaces with a tab in the terminal
6.5 years ago

Hi all, I have an issue when I try to replace multiple spaces with a tab in a file (in other words, convert the file into a table with a separator so it can easily be imported into Excel). Here is the file (basically, there are 5 columns separated by two or more spaces):

lcl|tetur19g00650  length:863 (mRNA) (E75)  (Ecdysone-induced pro...   132    2e-32
lcl|tetur03g08440  length:544 (mRNA) (HR3)  (Hormone Receptor 3)      85.1    5e-18
lcl|tetur04g01460  length:396 (mRNA) (SVP)  (Seven Up)                66.6    1e-12
lcl|tetur07g00140  length:1063 (mRNA) (HR4)  (Hormone Receptor 4)     63.5    1e-11
lcl|tetur10g04690  length:854 (mRNA) (HR38 (1))  (Hormone Recepto...  61.6    5e-11
lcl|tetur08g06490  length:645 (mRNA) (FTZ-F1)  (Fushi tarazu - Fa...  61.2    6e-11
lcl|tetur01g11050  length:726 (mRNA) (n/a)  (Zinc finger, nuclear...  60.8    9e-11
lcl|tetur01g11040  length:726 (mRNA) (NR2E3)  (Zinc finger, nucle...  60.8    9e-11
lcl|tetur01g09240  length:431 (mRNA) (RXR (2))  (Retinoid X Recep...  60.1    1e-10
lcl|tetur10g04710  length:867 (mRNA) (HR38 (2))  (Hormone Recepto...  59.7    2e-10
lcl|tetur34g00430  length:561 (mRNA) (kni)  (Hypothetical knrl) (...  57.4    1e-09
lcl|tetur03g02550  length:497 (mRNA) (PNR-like)  (Photocell recep...  57.0    1e-09
lcl|tetur31g01930  length:497 (mRNA) (RXR (1))  (Retinoid X Recep...  56.6    1e-09
lcl|tetur05g04280  length:499 (mRNA) (HNF4)  (Hepatocyte Nuclear ...  56.6    2e-09
lcl|tetur01g07700  length:338 (mRNA) (HR83-like)  (Possibly HR83-...  52.0    4e-08
lcl|tetur01g02690  length:576 (mRNA) (dissatisfaction)  (dissatis...  51.6    6e-08
lcl|tetur08g01210  length:249 (mRNA) (Tll)  (Tailless)                51.2    7e-08
lcl|tetur11g04570  length:901 (mRNA) (HR39)  (Hormone receptor-li...  47.4    9e-07
lcl|tetur28g00490  length:370 (mRNA) (ERR)  (Estrogen-related Rec...  47.0    1e-06
lcl|tetur11g01960  length:567 (mRNA) (HR96-like g)  (HR96-like nu...  45.8    2e-06
lcl|tetur01g15140  length:430 (mRNA) (EcR)  (Ecdysone Receptor)       44.7    6e-06
lcl|tetur01g07820  length:564 (mRNA) (HR96-like d)  (HR96-like nu...  42.7    2e-05
lcl|tetur34g00750  length:579 (mRNA) (HR96-like a)  (HR96-like nu...  42.7    3e-05
lcl|tetur30g01210  length:669 (mRNA) (HR96-like b)  (HR96-like nu...  39.3    2e-04
lcl|tetur36g00260  length:501 (mRNA) (HR96-like h)  (HR96-like nu...  39.3    3e-04
lcl|tetur20g01820  length:499 (mRNA) (HR96-like e)  (HR96-like nu...  38.9    3e-04
lcl|tetur04g03100  length:490 (mRNA) (HR96-like f)  (HR96-like nu...  38.9    3e-04
lcl|tetur17g03630  length:483 (mRNA) (HR96-like c)  (HR96-like nu...  38.5    4e-04
lcl|tetur07g04810  length:311 (mRNA) (E78)  (Ecdysone-induced pro...  33.9    0.012

I used this command, but nothing changes (I get a new file, but it still has the same space separators):

sed 's/ \+ /\t/g' inputfile > outputfile

Do you have any ideas? Thank you very much!

I added (code) markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button, which is in the toolbar when you compose or edit a post.

Thanks! It's much better!

For better readability, use sed 's/\s\s\+/\t/g' input (for two or more spaces). However, in one of the columns I see text that is itself separated by spaces. Make sure that the spacing is uniform between columns, not within a column.
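One way to check the point about uniform columns after converting is to count the tab-separated fields on each line. The sketch below is only an illustration: the file names `check.tsv` and `bad_rows.txt` and the three sample rows are made up, and the expected count of 5 comes from the question.

```shell
# Hypothetical tab-separated sample: two well-formed rows and one row
# where a column was split by mistake (6 fields instead of 5).
printf 'id1\tlength:863 (mRNA)\t(desc one)\t132\t2e-32\n'       > check.tsv
printf 'id2\tlength:249 (mRNA)\t(desc two)\t51.2\t7e-08\n'     >> check.tsv
printf 'id3\tlength:501\t(mRNA)\t(desc three)\t39.3\t3e-04\n'  >> check.tsv

# Report the line number and field count of every row without exactly 5 fields.
awk -F'\t' 'NF != 5 { print NR ": " NF " fields" }' check.tsv > bad_rows.txt
cat bad_rows.txt
```

If `bad_rows.txt` comes out empty, every row has the expected number of columns.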

Thank you for your reply. Actually, when there are 2 or more spaces I would like to replace them with a tab, but when there is only 1 space it should be left as it is. I hope that's clear.

6.5 years ago

You're running that sed command incorrectly for what you want to do. It should just be:

sed 's/ \+/\t/g' inputfile > outputfile

Thank you. It's still not working and I don't know why... I'm working on a Mac, but I'm not sure whether that is the issue?

On Mac, it would just be this:

sed -E $'s/[[:blank:]]+/\t/g' inputfile

However, I see your issue. Even the spaces in the gene descriptions will change.

Are you sure that using a rule whereby only 2 or more spaces combined are changed is valid?
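If the two-or-more-spaces rule does hold for this file, a sketch like the following should work with both GNU sed and the BSD sed shipped on macOS. It avoids `\+` and `\t`, which are GNU extensions that other sed implementations may not understand, and uses a POSIX interval expression plus a literal tab instead. The file names `sample.txt` and `sample.tsv` are made up for the example, and the two rows are copied from the question.

```shell
# Hypothetical sample file with two lines copied from the question.
cat > sample.txt <<'EOF'
lcl|tetur19g00650  length:863 (mRNA) (E75)  (Ecdysone-induced pro...   132    2e-32
lcl|tetur08g01210  length:249 (mRNA) (Tll)  (Tailless)                51.2    7e-08
EOF

# Store a literal tab so the command does not depend on sed expanding "\t".
tab=$(printf '\t')

# POSIX basic regex: a run of two or more spaces becomes a single tab.
# Single spaces (for example inside "length:863 (mRNA) (E75)") are left alone.
sed "s/ \{2,\}/${tab}/g" sample.txt > sample.tsv
```

Single spaces inside a column survive, so each line ends up with five tab-separated fields. A description that happens to contain two consecutive spaces would still be split, so the field count is worth checking on the real file.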
