How to remove the last character "," from each row in a BED file?
8.6 years ago
bright602 ▴ 50
1   903641  927394  224 C1orf170,
1   927395  936954  225 RP11-54O7.17,HES4,
1   943677  957199  228 RP11-54O7.11,ISG15,AGRN,
1   957200  974400  229 AGRN,
1   1005127 1034268 234 RNF223,C1orf159,
1   1049052 1062659 239 C1orf159,
1   1069046 1083958 242 RP11-465B22.5,
1   1096739 1107115 246 RP11-465B22.8,MIR429,MIR200B,MIR200A,
1   1107116 1109732 247 TTLL10,
1   1109733 1122642 248 TTLL10-AS1,TTLL10,
Assembly sequencing • 1.6k views
8.6 years ago

Using sed

sed 's/,$//' file > file2
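
It can help to try the substitution on a couple of rows before writing a new file. The sketch below assumes the columns are tab-separated and uses two rows copied from the question; the `[[:space:]]*` part is an addition that also tolerates trailing whitespace or a Windows carriage return after the comma.

```shell
# Two sample rows from the question; tab separators are an assumption.
sample=$(printf '1\t903641\t927394\t224\tC1orf170,\n1\t927395\t936954\t225\tRP11-54O7.17,HES4,\n')

# Remove one trailing comma, plus any whitespace (including CR) that follows it.
cleaned=$(printf '%s\n' "$sample" | sed 's/,[[:space:]]*$//')

printf '%s\n' "$cleaned"
```

Commas between gene names are left alone because the pattern is anchored to the end of the line.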
8.6 years ago

Although enough answers have already been posted, and since this question sounds a lot like a homework assignment, I will try to top them all: does that "bed file" come from a previous gene annotation step that prints a comma after each gene, so that you now need to remove the comma after the last gene?

My first choice for removing trailing commas would be Sukhdeep's sed -i 's/,$//' file.bed, but if the situation is the one I've just described, I would rather suggest fixing the upstream script: it's not a good idea to leave imperfections in a script and patch its output afterwards, especially if you're learning to code.

And as a final guess: is that annotation script written in Perl, with the gene information being printed in a manner similar to the following?

foreach $gene (@genes) { print "$gene," }

If so, you can fix the problem by using the join function, like

print join ",", @genes

With that change, the output of your annotation script won't have a comma at the end of each line.
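
If the upstream step happens to be a shell pipeline rather than Perl, the same idea (put the separator between items instead of after each one) can be sketched with paste. The gene names here are just a hypothetical stand-in for @genes.

```shell
# Hypothetical gene list, one name per line, joined with commas in between.
joined=$(printf '%s\n' RP11-54O7.11 ISG15 AGRN | paste -sd, -)

printf '%s\n' "$joined"
```

Because the delimiter is only inserted between lines, there is no trailing comma to clean up later.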

8.6 years ago
gearoid ▴ 200

Remove one trailing ',' from the end of every line in a file:

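# IFS= and -r keep whitespace and backslashes intact; the quotes preserve the tabs
while IFS= read -r line; do
  printf '%s\n' "${line%,}"
done < input.file > output.file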

(If I understand you correctly)
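
For reference, the `%,` expansion removes at most one trailing comma and leaves a value without one untouched. A small sketch with made-up variable names:

```shell
with_comma='AGRN,'
without_comma='AGRN'
double_comma='AGRN,,'

# ${var%,} strips a single "," from the end of the value, if present.
stripped_a=${with_comma%,}
stripped_b=${without_comma%,}
stripped_c=${double_comma%,}

printf '%s\n' "$stripped_a" "$stripped_b" "$stripped_c"
```

This is why the loop is safe to run on files where some rows have no trailing comma.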

8.6 years ago

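A more general approach with awk lets you clean any field. Note that substr() here drops the last character of the field whatever it is, so it assumes every row ends in a comma: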

$ awk '{ print $1"\t"$2"\t"$3"\t"$4"\t"substr($5, 1, length($5)-1); }' in.bed > out.bed

For instance, if you had a BED file with the ID in the fourth column and the score in the fifth column, and you wanted to clean the fourth column:

$ awk '{ print $1"\t"$2"\t"$3"\t"substr($4, 1, length($4)-1)"\t"$5; }' in.bed > out.bed
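
A variant that only touches the field when it really ends in a comma could look like the sketch below. It assumes tab-separated input, and strip_trailing_comma is a hypothetical helper name, not part of any tool.

```shell
# Hypothetical helper: strip one trailing comma from the given column
# of tab-separated input read from stdin.
strip_trailing_comma() {
  awk -v col="$1" 'BEGIN { FS = OFS = "\t" } { sub(/,$/, "", $col); print }'
}

# One row from the question, plus a row that has no trailing comma.
result=$(printf '1\t903641\t927394\t224\tC1orf170,\n1\t957200\t974400\t229\tAGRN\n' \
  | strip_trailing_comma 5)

printf '%s\n' "$result"
```

Passing the column number as an argument keeps the command the same whether the field to clean is the fourth or the fifth.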