How to select a column by its name in a text file with Unix tools
7.6 years ago
AQ7 ▴ 30

Good evening everyone,

I have a large text file with many columns. I cannot open it in a text editor, and I would like to select a column by its name. How can I select all the rows for a specific column? Are there some grep options for this? Thank you for your help.

unix • 36k views

It would be useful if you could provide a few lines from the file.


A web search will lead you to answers e.g. this.


I think he/she wants to select by column name, not by field number, so he/she needs a way to find the column number of the name he/she is looking for in the header.


The name/column number can easily be found in multiple ways (head, etc.) without having to open the file in a text editor.

A fundamental issue with many new participants on Biostars appears to be that they do not make any effort on their own to find answers to their questions. Providing ready-to-run command lines gets the job done for posters, but it may not teach them usable skills. Other mods have discussed this in other threads, but it seems to be a lost cause.


@genomax2, in addition, after getting a ready-to-use solution, a few users extend their question, for example in this thread: C: how to select with unix a column by the column name in a txt files. This is not specific to this thread; there are multiple such cases.


Unfortunately, they do so without going back and editing the original question. As a result, some of these threads become chat sessions (having participated in some of these, I can't deny responsibility).

I wish it were possible to flag (for mods or users) some of these comments as supplemental questions, so future visitors can figure out where the question/answer pairs are.

7.6 years ago
venu 7.1k

Here is how:

head -n 1 file.txt | tr '\t' '\n' | cat -n | grep "YOUR_COLUMN_NAME"

You'll get your column number as follows (let's say it is 120):

  120 YOUR_COLUMN_NAME

Then simply use cut:

cut -f 120 file.txt > new_file.txt

Mind the column separator (tab, comma, space, etc.). cut splits on tabs by default; use -d to set a different delimiter, and adjust the tr command to match.
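
The two steps above (look up the number, then cut) can be folded into one small shell function. This is only a sketch, assuming a tab-delimited file with a single header line; the function name cut_by_name is made up for illustration. It uses grep -x -F so that only an exact, literal match of the header name counts, which avoids picking up columns whose names merely contain the search string.

```shell
# cut_by_name NAME FILE: print the column whose header is exactly NAME.
# Hypothetical helper; assumes FILE is tab-delimited with one header line.
cut_by_name() {
    name=$1
    file=$2
    # Put each header field on its own line, keep only the exact literal
    # match (with its line number), and extract that number.
    col=$(head -n 1 "$file" | tr '\t' '\n' | grep -n -x -F -- "$name" | head -n 1 | cut -d: -f1)
    if [ -z "$col" ]; then
        echo "column '$name' not found in $file" >&2
        return 1
    fi
    # cut splits on tabs by default.
    cut -f "$col" "$file"
}
```

Usage would look like: cut_by_name YOUR_COLUMN_NAME file.txt > new_file.txt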

P.S.: You can also use R to do this.

Example:

dat <- read.delim("file.txt", header=TRUE, sep="\t", stringsAsFactors=FALSE)

dat.sub <- subset(dat, select=c("YOUR_COLUMN_NAME_1", "YOUR_COLUMN_NAME_2"))

# write dat.sub to a tab-delimited output file
write.table(dat.sub, "new_file.txt", sep="\t", quote=FALSE, row.names=FALSE)
7.6 years ago

If you know the column number, use awk:

awk '{print $NUMBER}' filename.txt

where NUMBER is the column number (for example, $3 for the third column). By default awk splits fields on any whitespace; for a tab-delimited file, add -F'\t'.

Quick workaround to get the column number, assuming the header is just one line:

WORD="yourwordhere"; head -n1 filename.txt | tr "\t" "\n" | grep -n "$WORD"
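
If you prefer to skip the separate lookup, awk can find the column in the header and print it in a single pass. The following is a rough sketch rather than a polished tool: it assumes tab-separated input with the header on the first line, and select_col is just a name picked for this example.

```shell
# select_col NAME FILE: print the column whose header equals NAME (header included).
# Hypothetical helper; assumes tab-separated input, change FS for other delimiters.
select_col() {
    awk -v name="$1" '
        BEGIN { FS = "\t" }
        NR == 1 {
            # Scan the header for the first field equal to the requested name.
            # Appending "" forces a string comparison even for numeric-looking names.
            for (i = 1; i <= NF; i++)
                if (($i "") == (name "")) { col = i; break }
            if (!col) { print "column not found: " name > "/dev/stderr"; exit 1 }
        }
        # Runs for every line, including the header.
        { print $col }
    ' "$2"
}
```

Usage would look like: select_col yourwordhere filename.txt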

Thanks a lot, it works perfectly following your instructions. Do you know if I can keep the rows as well, so as to have

 V1

1
2
3 4

and not only the column without the rows? Thanks a lot for your precious help.


For more clarity, I suggest you edit your original question and add a final paragraph with this last question of yours, perhaps expanding a little on what you want to achieve, because I didn't really get it!

7.6 years ago

You can use csvtk, csvkit, Miller, or xsv. csvtk is a cross-platform, efficient, practical and pretty CSV/TSV toolkit written in Go. Hope you enjoy it.

For a CSV file:

$ cat 3.csv 
id,name,hobby
1,bar,baseball
2,bob,basketball
3,foo,football
4,wei,programming

Select column(s) by name(s):

$ csvtk cut -f name,id 3.csv 
name,id
bar,1
bob,2
foo,3
wei,4

If your input file is tab-delimited, just add the option -t or --tabs:

# yes, it can read from stdin
$ cat 3.csv | csvtk csv2tab 
id      name    hobby
1       bar     baseball
2       bob     basketball
3       foo     football
4       wei     programming

$ cat 3.csv | csvtk csv2tab | csvtk cut -t -f name,id
name    id
bar     1
bob     2
foo     3
wei     4
7.6 years ago
IP ▴ 770

awk '{print $column_number}' your_file.txt

For example, if you want to get column 3:

awk '{print $3}' your_file.txt
