Question

sed or awk command

5.7 years ago

harry ▴ 40

ENST00000448914.1   13  4.28456     0       0
ENST00000415118.1   8   3.52171     0       0

How can I remove the version suffix (the "." and the digits after it) from column 1, so that the output looks like this:

ENST00000448914     13  4.28456     0       0
ENST00000415118     8   3.52171     0       0

Please tell me a sed or awk command that removes only that suffix.

RNA-Seq • 1.5k views

ADD COMMENT • link updated 5.7 years ago by AK ★ 2.2k • written 5.7 years ago by harry ▴ 40

score 2 · Answer 1 · 2019-07-13

5.7 years ago

AK ★ 2.2k

Hi harry,

By awk:

awk 'BEGIN{OFS="\t"} {gsub("\\.[0-9]+$", "", $1); print}'

(updated) For sed you can try:

sed -r 's/\.[0-9]+\t/\t/'
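
As a quick check, the awk command above can be run on the two rows from the question. This is only a sketch: the variable names `sample` and `awk_out` are made up for illustration, and the rows are assumed to be tab-separated, which the question does not state.

```shell
# Two rows from the question, written with printf so the tabs are real.
# Assumption: the real file is tab-separated.
sample=$(printf 'ENST00000448914.1\t13\t4.28456\t0\t0\nENST00000415118.1\t8\t3.52171\t0\t0\n')

# Remove a trailing ".<digits>" from field 1 only; OFS="\t" makes awk
# rejoin the fields with tabs when it rebuilds the modified line.
awk_out=$(printf '%s\n' "$sample" |
  awk 'BEGIN{OFS="\t"} {gsub("\\.[0-9]+$", "", $1); print}')

printf '%s\n' "$awk_out"
```

Because the substitution is applied to `$1` rather than to the whole line, the decimal point in the third column (4.28456, 3.52171) cannot be touched.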

ADD COMMENT • link 5.7 years ago by AK ★ 2.2k

score 0 · Answer 2 · 2019-07-13

5.7 years ago

lakhujanivijay 5.9k

Hi harry

Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.

You could try sed like this:

sed 's/\.1//'

ADD COMMENT • link 5.7 years ago by lakhujanivijay 5.9k

This would only handle version suffixes that are exactly .1. We should account for .\d+, right?

ADD REPLY • link 5.7 years ago by Ram 45k
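
To illustrate the point about `.1`, here is a small sketch comparing the literal pattern with a digit-class pattern. The IDs with versions `.2` and `.12` are invented for the example and do not come from the question's data; `ids`, `literal_out` and `general_out` are hypothetical variable names.

```shell
# Hypothetical IDs: same transcript with versions 1, 2 and 12.
ids=$(printf 'ENST00000448914.1\nENST00000448914.2\nENST00000448914.12\n')

# Literal ".1": leaves ".2" alone and turns ".12" into a stray "2".
literal_out=$(printf '%s\n' "$ids" | sed 's/\.1//')

# Any run of digits after the dot, anchored to the end of the ID.
general_out=$(printf '%s\n' "$ids" | sed -E 's/\.[0-9]+$//')

printf 'literal:\n%s\n\ngeneral:\n%s\n' "$literal_out" "$general_out"
```

With the literal pattern, the `.12` row comes out as `ENST000004489142`, which is a different and wrong ID, so the digit-class form seems the safer choice.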

score 0 · Answer 3 · 2019-07-13

5.7 years ago

vin.darb ▴ 300

If the transcript ID is always in the first column:

sed 's/\.[0-9]\{1,\}//' yourfile.txt

should work

ADD COMMENT • link 5.7 years ago by vin.darb ▴ 300

It will also remove the (.) from other columns.

ADD REPLY • link 5.7 years ago by harry ▴ 40

That's odd: I tried it and it doesn't remove the other (.) characters, because I didn't put the 'g' (global) flag after the last slash.

ADD REPLY • link 5.7 years ago by vin.darb ▴ 300
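
The effect of the `g` flag can be seen on one row from the question. This is a sketch that assumes tab-separated input; `row`, `first_only` and `all_matches` are names chosen for illustration.

```shell
# First row from the question (tabs assumed as the separator).
row=$(printf 'ENST00000448914.1\t13\t4.28456\t0\t0')

# Without g: only the first ".<digits>" on the line is removed.
first_only=$(printf '%s\n' "$row" | sed 's/\.[0-9]\{1,\}//')

# With g: every ".<digits>" is removed, including the one in column 3.
all_matches=$(printf '%s\n' "$row" | sed 's/\.[0-9]\{1,\}//g')

printf '%s\n%s\n' "$first_only" "$all_matches"
```

Without `g` the third column keeps `4.28456`; with `g` it is cut down to `4`. So the command without `g` works here only because the first dot on the line happens to be the version dot.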

It might, if the first . it encounters is not the transcript version. The awk solution, or yours modified to include an anchor and a regex that restricts the match to the first word, would be safe.

ADD REPLY • link 5.7 years ago by Ram 45k
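
One possible way to write the anchored, first-word-only variant is sketched below. It is not taken from the thread: the regex is one reading of the suggestion, the second row has had its version removed on purpose to trigger the failure case, and `rows`, `unanchored` and `anchored` are hypothetical names.

```shell
# Row 1 has a version; row 2 deliberately has none (invented case),
# so its first "." is the decimal point in column 3.
rows=$(printf 'ENST00000448914.1\t13\t4.28456\t0\t0\nENST00000415118\t8\t3.52171\t0\t0\n')

# Unanchored: on row 2 the first ".<digits>" is ".52171", which gets removed.
unanchored=$(printf '%s\n' "$rows" | sed 's/\.[0-9]\{1,\}//')

# Anchored: ^ plus [^[:space:]]+ confines the match to the first word, and
# the trailing ([[:space:]]|$) requires the digits to end that word.
anchored=$(printf '%s\n' "$rows" |
  sed -E 's/^([^[:space:]]+)\.[0-9]+([[:space:]]|$)/\1\2/')

printf 'unanchored:\n%s\n\nanchored:\n%s\n' "$unanchored" "$anchored"
```

The unanchored command turns `3.52171` into `3` on the second row, while the anchored one strips the version from the first row and leaves the second row unchanged.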