Question

Tool: csvtk - a cross-platform, efficient, practical and pretty CSV/TSV toolkit

10

8.5 years ago

shenwei356 8.7k

Hi all,

I'd like to share another practical toolkit of mine, csvtk, after introducing SeqKit yesterday.

Documentation: http://bioinf.shenwei.me/csvtk/ (Usage and Tutorial)
Source code: https://github.com/shenwei356/csvtk
Latest version:

Introduction

Just as FASTA/Q is the basic format in bioinformatics, CSV/TSV are basic and ubiquitous file formats in both bioinformatics and data science.

People usually use spreadsheet software like MS Excel to process tabular data. However, it is all done by clicking and typing, which is not automated and is time-consuming to repeat, especially when we want to apply similar operations to different datasets or for different purposes.

You can also accomplish some CSV/TSV manipulations using shell commands, but more code is needed to handle the header line.

csvtk is convenient for rapid data investigation and is also easy to integrate into analysis pipelines. It could save you much of the time spent writing Python/R scripts.
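To make the header problem concrete, here is a minimal sketch using only standard shell tools (not csvtk). The file name and its columns are made up for illustration, and it assumes a simple CSV with no quoted commas or embedded newlines.

```shell
# Scratch directory and sample data; the file name and columns are made up.
workdir=$(mktemp -d)
cat > "$workdir/scores.csv" <<'EOF'
name,score
carol,7
alice,12
bob,3
EOF

# Naive attempt: sort treats the header like any other row, so with a
# descending numeric sort "name,score" sinks to the bottom.
naive=$(sort -t, -k2,2nr "$workdir/scores.csv")

# Header-aware version: emit line 1 untouched, then sort only the data rows.
sorted=$(
  head -n 1 "$workdir/scores.csv"
  tail -n +2 "$workdir/scores.csv" | sort -t, -k2,2nr
)

printf 'naive:\n%s\n\nheader-aware:\n%s\n' "$naive" "$sorted"
rm -r "$workdir"
```

The extra head/tail split is the "more code" in question, and it has to be repeated for every operation that should skip the header.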

Features

Cross-platform (Linux/Windows/Mac OS X/OpenBSD/FreeBSD)
Lightweight and out-of-the-box: no dependencies, no compilation, no configuration
Fast, with support for multiple CPUs
Practical functions provided through subcommands (25 in total, listed below)
Supports STDIN and gzipped input/output files, so it is easy to use in pipes
Most of the subcommands support unselecting fields and fuzzy fields, e.g. -f "-id,-name" for all fields except "id" and "name", and -F -f "a.*" for all fields with the prefix "a.".
Supports common plots (see usage)
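As a rough illustration of the field-selection options mentioned in the list above, the sketch below runs the cut subcommand with the -f and -F flags quoted in this post. The sample table is made up, and the script assumes csvtk is already installed and on PATH; if it is not, nothing is run. Both commands should keep only the two "a."-prefixed columns.

```shell
# Assumes csvtk is installed and on PATH; otherwise the demo is skipped.
if command -v csvtk >/dev/null 2>&1; then
  workdir=$(mktemp -d)
  # Made-up sample table with two "a."-prefixed columns.
  printf 'id,name,a.x,a.y\n1,foo,10,20\n2,bar,30,40\n' > "$workdir/demo.csv"

  # Unselect: keep every column except "id" and "name".
  unselected=$(csvtk cut -f "-id,-name" "$workdir/demo.csv")

  # Fuzzy fields: keep every column whose name starts with "a.".
  # The input is piped in here to show reading from STDIN.
  fuzzy=$(cat "$workdir/demo.csv" | csvtk cut -F -f "a.*")

  printf '%s\n\n%s\n' "$unselected" "$fuzzy"
  rm -r "$workdir"
else
  echo "csvtk not found on PATH; skipping the demo" >&2
fi
```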

Subcommands

25 subcommands in total.

Information

headers print headers
stat summary of CSV file
stat2 summary of selected numeric fields

Format conversion

pretty convert CSV to readable aligned table
csv2tab convert CSV to tabular format
tab2csv convert tabular format to CSV
space2tab convert space-delimited format to TSV
transpose transpose CSV data
csv2md convert CSV to markdown format

Set operations

head print first N records
sample sampling by proportion
cut select parts of fields
uniq unique data without sorting
freq frequencies of selected fields
inter intersection of multiple files
grep grep data by selected fields with patterns/regular expressions
filter filter rows by values of selected fields with arithmetic expression
filter2 filter rows by awk-like arithmetic/string expressions
join join multiple CSV files by selected fields

Edit

rename rename column names
rename2 rename column names by regular expression
replace replace data of selected fields by regular expression
mutate create new columns from selected fields by regular expression

Ordering

sort sort by selected fields

Plotting

plot see usage
- plot hist histogram
- plot box boxplot
- plot line line plot and scatter plot

Download and install

csvtk is implemented in the Go programming language, and executable binary files for most popular operating systems are freely available on the release page.

Just download the compressed executable file for your operating system and uncompress it with the tar -zxvf *.tar.gz command.

Or install via conda:

conda install -c bioconda csvtk

Learn More

Detailed usage of subcommands
Tutorial based on OTU table analysis
Some questions on Biostars answered using csvtk

CSV Golang TSV • 6.2k views

ADD COMMENT • link 20 months ago by shenwei356 8.7k

0

csvtk has 25 subcommands now. Why not give it a try?

One can accomplish some CSV/TSV manipulations using shell commands, but more code is needed to handle the header line.

ADD REPLY • link 8.1 years ago by shenwei356 8.7k

score 1 · Answer 1 · 2017-03-21

1

8.1 years ago

Ram 45k

This will be super useful. What would it take to push it to homebrew?

ADD COMMENT • link 8.1 years ago by Ram 45k

0

Someone else asked for this too; I may push it when I'm free. But I use Linux and I'm not familiar with Mac OS X. It would be great if someone else pushed it.

ADD REPLY • link 8.1 years ago by shenwei356 8.7k

0

Let's discuss this on Twitter soon - I'm on vacation now. I think the author cannot recommend their own tool, so we can check out the procedure later.

ADD REPLY • link 8.1 years ago by Ram 45k

1

FYI, versions 0.4.4 thru 0.7.0 are available for Linux & OSX via bioconda

ADD REPLY • link 8.1 years ago by Ryan Dale 5.0k

score 1 · Answer 2 · 2023-08-15

1

20 months ago

shenwei356 8.7k

After 7 years, the number of subcommands has doubled, with more functions and features added.

Today, I just released another new version. It's definitely worth a try.

ADD COMMENT • link 20 months ago by shenwei356 8.7k