Changing file names with the sed command
5.9 years ago

Hi all. I need a little bit of help. I have a list of files in a folder that look like this:

H3K4me1/Cell-Line=S2-DRSC#Developmental-Stage=Late-Embryonic-stage#Tissue=Embryo-derived-cell-line/ChIP-chip/Rep-1//Dmel_r5.32/modENCODE_304.gff3

H3K4me2/Cell-Line=S2-DRSC#Developmental-Stage=Late-Embryonic-stage#Tissue=Embryo-derived-cell-line/ChIP-chip/Rep-1//Dmel_r5.32/modENCODE_965.gff3

H3K4me3/Cell-Line=S2-DRSC#Developmental-Stage=Late-Embryonic-stage#Tissue=Embryo-derived-cell-line/ChIP-chip/Rep-1//Dmel_r5.32/modENCODE_3761.gff3

H3K9ac/Cell-Line=S2-DRSC#Developmental-Stage=Late-Embryonic-stage#Tissue=Embryo-derived-cell-line/ChIP-chip/Rep-1//Dmel_r5.32/modENCODE_3765.gff3

I would like to rename them with sed so that the final names look like this:

H3K4me1_modENCODE_304.gff3

H3K4me2_modENCODE_965.gff3

H3K4me3_modENCODE_3761.gff3

H3K9ac_modENCODE_3765.gff3

I know I can play with sed's s command and the / characters, but I can't get it to work. I would really appreciate it if you could briefly explain the code you give as an answer.

Thank you in advance! Best wishes,

Jordi

ChIP-Seq • 1.5k views

every time you put a '#' or a '=' in a filename, god kills a kitten.


They came named like this from the modENCODE page, but thanks for the information. I will never dare to do it myself (poor kittens!)


Just be a little cautious with all the /'s in those names. That'll look like a directory structure to bash.

e.g. touch 'H3K4me1/Cell-Line=S2-DRSC#Developmental-Stage=Late-Embryonic-stage#Tissue=Embryo-derived-cell-line/ChIP-chip/Rep-1//Dmel_r5.32/modENCODE_304.gff3' fails miserably, even when hard quoted.
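
To make the point above concrete, here is a small sketch of why the quoted `touch` fails and what makes it succeed. It assumes a shortened, made-up version of one of the names and runs inside a temporary directory (the variable names `sandbox`, `name`, and `first_try` are only for illustration).

```shell
# Sandbox so nothing is created in the real working directory.
sandbox=$(mktemp -d)
cd "$sandbox"

# Shortened, hypothetical version of one of the names above.
name='H3K4me1/Cell-Line=S2-DRSC/Rep-1/modENCODE_304.gff3'

# Quoting protects '#' and '=', but '/' is still a path separator,
# so touch fails while the parent directories are missing.
if touch "$name" 2>/dev/null; then
    first_try=created
else
    first_try=failed
fi
echo "touch alone: $first_try"

# Creating the parent directories first lets the same call succeed.
mkdir -p "$(dirname "$name")"
touch "$name"
echo "after mkdir -p: $(ls "$(dirname "$name")")"
```

In other words, the names in the question are most likely paths to files inside nested directories rather than single file names.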


+1 Pierre

"every time you put a '#' or a '=' in a filename, god kills a kitten."

Then we have the culprit for wars and famines: spaces in filenames.

5.9 years ago
Joe 21k

This is all you need in terms of an expression:

sed 's|/.*/|_|gi'

e.g.

$ echo "H3K4me1/Cell-Line=S2-DRSC#Developmental-Stage=Late-Embryonic-stage#Tissue=Embryo-derived-cell-line/ChIP-chip/Rep-1//Dmel_r5.32/modENCODE_304.gff3" | sed 's|/.*/|_|gi'
H3K4me1_modENCODE_304.gff3

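Pro tip: use that regular expression with the rename program (the Perl-based version, which accepts sed-style expressions; the util-linux rename uses a different syntax).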

Explained:

s = substitute

| = sed delimiter so as to avoid confusion with the more traditional /

/ = Match the first forward slash you find

.* = followed by any character, any number of times

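/ = until the last / in the name. Because .* is greedy, the match runs from the first / all the way to the final one, so everything in between is replaced in one go.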

|_| = replace all the previously matched stuff with an underscore.

gi = globally, and case insensitive (you don't strictly need these, I include them as force of habit). Note that the i flag is not part of POSIX sed, so drop it if your sed complains.
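
To apply the expression to actual files rather than to echoed text, something like the sketch below could work. It is only an illustration under a few assumptions: the paths are real nested directories (as the comments on the question suggest), copying the files into a flat directory is acceptable, and the names `work`, `data`, and `renamed` are made up here. It builds a throwaway sample tree with shortened paths first, so it is safe to run as is.

```shell
#!/usr/bin/env bash
set -eu

# Throwaway sandbox: data/ holds the nested tree, renamed/ gets the flat copies.
work=$(mktemp -d)
mkdir -p "$work/data" "$work/renamed"
cd "$work/data"

# Two sample files with shortened, hypothetical paths.
mkdir -p 'H3K4me1/Cell-Line=S2-DRSC#Tissue=Embryo/ChIP-chip/Rep-1/Dmel_r5.32'
mkdir -p 'H3K9ac/Cell-Line=S2-DRSC#Tissue=Embryo/ChIP-chip/Rep-1/Dmel_r5.32'
touch 'H3K4me1/Cell-Line=S2-DRSC#Tissue=Embryo/ChIP-chip/Rep-1/Dmel_r5.32/modENCODE_304.gff3'
touch 'H3K9ac/Cell-Line=S2-DRSC#Tissue=Embryo/ChIP-chip/Rep-1/Dmel_r5.32/modENCODE_3765.gff3'

# find prints paths such as ./H3K4me1/.../modENCODE_304.gff3
find . -type f -name '*.gff3' |
while IFS= read -r path; do
    rel=${path#./}                                   # drop the leading ./
    new=$(printf '%s\n' "$rel" | sed 's|/.*/|_|')    # first / to last / becomes _
    cp "$path" "$work/renamed/$new"
done

ls "$work/renamed"
```

Copying instead of moving keeps the original tree intact until the new names have been checked; swapping `cp` for `mv` would turn this into an in-place rename.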


Thank you! Your answer is very much appreciated


Glad it helped. Be sure to accept it to provide closure to the thread if it resolved your problem.
