Retrieve header of column (field) in which data is above or equals a number
3
0
Entering edit mode
7.5 years ago
salamandra ▴ 550

Imagine I have the following file with a header (A,B,C,D,E,F) and one row of data:

A    B  C   D   E    F
-20  0  4   20  10   50

I want to extract the header names corresponding to the fields where data is >= 20 with the command line, so that I get:

D F

How can I do this? (with command line or awk)

Commandline awk • 1.6k views
ADD COMMENT
3
Entering edit mode
7.5 years ago

Just transpose (rotate) the table using GNU datamash and filter using awk:

cat data.txt \
    | datamash transpose \
    | awk '$2 >= 20' \
    | cut -f 1 | paste -s -d ' '

If there are more than one row of data, you need csvtk:

$ cat data.txt 
A       B       C       D       E       F
-20     0       4       20      10      50
-20     20      4       10      0       30


$ cat data.txt \
    | csvtk transpose -t \
    | csvtk filter -H -t --any -f '2,3>=20' \
    | cut -f 1 | paste -s -d ' '
B D F

Where --any -f '2,3>=20' means finding rows in which value of any column is >= 20.

No --any:

$ cat data.txt \
        | csvtk transpose -t \
        | csvtk filter -H -t -f '2,3>=20' \ 
        | cut -f 1 | paste -s -d ' '
F
ADD COMMENT
1
Entering edit mode
7.5 years ago

My first guess with command line tools would be to transpose your file, awk for selecting those above your cutoff, cut for getting the header (which is the first column after transposing).

But my language of choice is Python, so I would do the following:
The code is untested, so let me know if something doesn't work as expected.

(I wrote this slightly more verbose than I would do for myself, but hopefully this makes it more clear what's going on.)

ADD COMMENT
0
Entering edit mode
7.5 years ago
echo -e  "A    B  C   D   E    F\n-20  0  4   20  10   50" | \
awk 'NR==1 {split($0,header);next;} {for(i=1;i<=NF;++i) if($i>=20) printf("%s ",header[i]);} END{printf("\n");}'
D F
ADD COMMENT

Login before adding your answer.

Traffic: 1808 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6