concatenating different columns of different files bash script
7.1 years ago
Fatima ▴ 1000

I have a bunch of files and I want to put the 5th column of each file side by side in one output file. I don't know how many files find will return, so I don't know which column numbers to put after cut -f.

file names 456Ecoli.bed, 568Ecoli.bed, ..... Each file has 6 columns

The command below doesn't work at all; I wrote it only to give you an idea of what I need.

find -name '*Ecoli.bed' -exec  paste {} > All_Ecoli.bed; | cut -f  5,11,17, ....

Each bed file is something like

chr1  102  203  gene1  0.05  +
chr1  300  403  gene2  0.6   +
bash • 5.6k views
$ cut -f5  *Ecoli.bed
$ awk '{print $5}' *Ecoli.bed

Thank you, but I want the output to have one column for every file, something like what we get when we use cut -f 5,11,17.


Have you tried running either script? Create a few lines of model output and show us how it's different from what these scripts produce.


Copied and pasted from https://www.linuxquestions.org/questions/linux-newbie-8/merge-columns-from-multiple-files-851336/ (please upvote the original poster in that forum):

$ awk '{_[FNR]=(_[FNR] OFS $5)}END{for (i=1; i<=FNR; i++) {sub(/^ /,"",_[i]); print _[i]}}' *Ecoli.bed

This would give you output in columns instead of rows.


Works like a charm. Could you help me add the name of each file above its column?

7.1 years ago
Ram 44k
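find . -name "*Ecoli.bed" ! -name "All_Ecoli.bed" | xargs -I v_file cut -f5 v_file >>All_Ecoli.bed

Or if you want to do it faster and don't care about the processing order of files:

find . -name "*Ecoli.bed" ! -name "All_Ecoli.bed" | parallel -I v_file cut -f5 v_file >>All_Ecoli.bed

(The output file name also matches *Ecoli.bed, so it is excluded from the search.)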

Thank you, but I want the output to have one column for every file, something like what we get when we use cut -f 5,11,17.


I'm sorry, but how does the output of this script differ from what you need?

I've made a small edit to ensure there is no overwriting (just in case)


Yes, I ran all the commands. I mean something like

score1fromfile1   score1fromfile2   score1fromfile3
score2fromfile1   score2fromfile2   score2fromfile3
score3fromfile1   score3fromfile2   score3fromfile3

But your command gives me:

score1fromfile1   
score2fromfile1   
score3fromfile1 
score1fromfile2   
score2fromfile2   
score3fromfile2 
score1fromfile3  
score2fromfile3  
score3fromfile3
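The side-by-side layout requested here is what paste followed by cut -f 5,11,17,... would give, once the column list is computed from the number of files. The sketch below is not from the thread. It assumes every file is tab-separated with exactly 6 columns (as stated in the question) and that all files have the same number of lines; the three demo files and their scores are hypothetical.

```shell
# Hypothetical demo data: three small 6-column, tab-separated BED files.
demo_dir=$(mktemp -d)
printf 'chr1\t102\t203\tgene1\t0.05\t+\nchr1\t300\t403\tgene2\t0.6\t+\n' > "$demo_dir/456Ecoli.bed"
printf 'chr1\t102\t203\tgene1\t0.10\t+\nchr1\t300\t403\tgene2\t0.7\t+\n' > "$demo_dir/568Ecoli.bed"
printf 'chr1\t102\t203\tgene1\t0.20\t+\nchr1\t300\t403\tgene2\t0.8\t+\n' > "$demo_dir/789Ecoli.bed"

side_by_side=$(
  cd "$demo_dir" || exit 1
  # Build the list 5,11,17,... with one entry per matching file.
  cols=""
  file_index=0
  for bed_file in *Ecoli.bed; do
    cols="${cols:+$cols,}$(( 6 * file_index + 5 ))"
    file_index=$(( file_index + 1 ))
  done
  # paste joins the files line by line; cut keeps each file's 5th column.
  paste *Ecoli.bed | cut -f "$cols"
)
printf '%s\n' "$side_by_side"
rm -rf "$demo_dir"
```

When saving the result in the same directory, pick an output name that does not match *Ecoli.bed, otherwise the output file is picked up as an input on the next run.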

Thank you. I did not foresee that, sorry.

7.1 years ago

Using BEDOPS bedmap and bedops:

$ bedops --everything *Ecoli.bed | awk '($6=="+")' - > Ecoli.union.for.bed
$ bedops --everything *Ecoli.bed | awk '($6=="-")' - > Ecoli.union.rev.bed
$ bedmap --echo --echo-map-score --exact --multidelim '\t' Ecoli.union.for.bed > answer.for.bed
$ bedmap --echo --echo-map-score --exact --multidelim '\t' Ecoli.union.rev.bed > answer.rev.bed
$ bedops --everything answer.*.bed > answer.bed
$ cut -f7- answer.bed > answer.txt