I have around 80 BED files, each with the first 3 columns (example: X2_example.bed, where X2 is the gene name). I want to add a 4th column containing the gene name and save the result under a new name (examples attached: X2_example_edited.bed, Y2_example_edited.bed, and so on), and then merge these files into a single BED file.
I can add the 4th column with the gene name and save the file under a different name with this command:
sed 's/$/\tX2/' < X2_example.bed > X2_example_edited.bed
This is the generated bed file
chr17 42276210 42276219 X2
chr17 42297938 42297947 X2
chr17 42276210 42276219 X2
chr17 42297938 42297947 X2
But I have to do this separately for each BED file. Is there a way to extract the gene name from the file name (e.g. X2 from X2_example.bed), add it as the 4th column of the BED file, and save the result as X2_example_edited.bed?
I can extract the gene name from the file name with: echo "X2_example.bed" | awk -F'[_.]' '{print $1}'
However, as I have too many files I am looking for a way to generate a loop to automate this.
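One way such a loop could look, as a minimal sketch rather than a definitive solution. It assumes every input file is named GENE_example.bed and that the gene name contains no underscore. The scratch directory and the two demo files are only there to make the example self-contained; the coordinates are reused from this post.

```shell
# Demo data in a scratch directory (illustration only, not the real files)
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'chr17\t42276210\t42276219\nchr17\t42297938\t42297947\n' > X2_example.bed
printf 'chr3\t18467066\t18467075\n' > Y2_example.bed

# The glob matches only the inputs; *_example_edited.bed does not end in _example.bed
for f in *_example.bed; do
    gene=${f%%_*}               # X2_example.bed -> X2
    out=${f%.bed}_edited.bed    # X2_example.bed -> X2_example_edited.bed
    # Print the first three columns plus the gene name, tab-separated;
    # lines with fewer than three fields (e.g. blank lines) are skipped
    awk -v gene="$gene" 'BEGIN { OFS = "\t" } NF >= 3 { print $1, $2, $3, gene }' "$f" > "$out"
done
```

The gene name is taken with shell parameter expansion instead of the echo | awk pipeline, which avoids starting an extra process per file. Because awk terminates every line it prints with a newline, each edited file also ends with one.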
I also need to merge all the generated BED files, which I can do with
cat *_edited.bed >output.bed
However, I am getting an error (see attached example: output.bed): the last line of the 1st file and the 1st line of the next file end up on the same line.
chr3 18467066 18467075 Y2chr17 42276210 42276219 X2
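A likely cause, stated as an assumption since the input files are not shown: the files do not end with a newline character, and cat copies bytes exactly as they are, so the next file starts on the same line. A small sketch of one workaround is to let awk re-emit the lines, since it terminates every record it prints with a newline. The demo files below are made up to reproduce the problem.

```shell
# Scratch directory with two files that deliberately lack a final newline
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'chr17\t42276210\t42276219\tX2' > X2_example_edited.bed
printf 'chr3\t18467066\t18467075\tY2' > Y2_example_edited.bed

# "awk 1" prints every input line followed by a newline,
# including an unterminated last line of each file
awk 1 ./*_edited.bed > output.bed
```

With plain cat, these two demo files would produce a single joined line; with awk 1, output.bed has one record per line.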
I know this must be a very basic thing, but I am new to this analysis and have limited knowledge. Thanks in advance
I am getting a warning:
find: warning: you have specified the -maxdepth option after a non-option argument -name, but options are not positional (-maxdepth affects tests specified before it as well as those specified after it). Please specify options before other arguments.
Perhaps move -maxdepth 1 before -name.
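The find command that triggered the warning is not shown in this thread, so the following is only a sketch of the argument order being suggested, using a made-up directory layout and the file pattern from the question.

```shell
# Scratch directory with one BED file at the top level and one in a subdirectory
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir sub
printf 'chr17\t42276210\t42276219\n' > X2_example.bed
printf 'chr3\t18467066\t18467075\n' > sub/Y2_example.bed

# -maxdepth is a global option, so it goes before tests such as -name;
# with -maxdepth 1 the file inside sub/ is not visited
found=$(find . -maxdepth 1 -name '*_example.bed')
echo "$found"
```

The result is the same in either order, since the option applies to the whole expression; putting it first only silences the warning.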
Hi, it worked despite the warning. Thank you, I got the result I needed.