9 weeks ago
dwpeng
Documentation: https://filterx.dwpeng.com
GitHub: https://github.com/dwpeng/filterx
Features
- Filter lines by column-based expression
- Support multiple input formats e.g. vcf/sam/fasta/fastq/gff/bed/csv/tsv
- Cross-platform support
- Easy to install
- Rich documentation
More on the documentation website.
Prior thread for this tool, for reference to the discussion that happened there: A simple and lightweight tool for filtering csv file written in Rust
v0.0.2 now supports all platforms :)
https://github.com/dwpeng/filterx/releases/tag/v0.0.2
filterx v0.2.3 now supports installation using pip.
A few comments; I will try to be very constructive.
I have developed tools that work with CSV in the past, and my biggest regret was that the defaults did not automatically deal with headers. CSV files have headers; it is very, very rare that they don't. The utility of CSV is that you export from Excel, and Excel sheets always have headers. I would recommend that the defaults assume headers and that the -H flag turn off headers.
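To make the suggestion concrete, here is a minimal Python sketch of a "headers by default" reader. It is not filterx's actual interface: the `-H`/`--no-header` flag, the `read_rows` helper, and the `col1`, `col2`, ... fallback names are all hypothetical, chosen only to illustrate the default I am recommending.

```python
import argparse
import csv
import io


def build_parser():
    """Build a parser where headers are assumed unless -H is given (hypothetical flag)."""
    parser = argparse.ArgumentParser(description="Header-aware CSV reader (illustration only)")
    parser.add_argument(
        "-H", "--no-header",
        action="store_true",
        help="treat the first row as data instead of column names",
    )
    return parser


def read_rows(text, no_header=False):
    """Return (column_names, data_rows) for CSV text.

    By default the first row supplies the column names. With no_header=True,
    every row is data and placeholder names col1, col2, ... are generated.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return [], []
    if no_header:
        names = [f"col{i + 1}" for i in range(len(rows[0]))]
        return names, rows
    return rows[0], rows[1:]


# Default behaviour: no flag given, so the first row is the header.
sample = "name,score\nalice,3\nbob,5\n"
args = build_parser().parse_args([])
names, rows = read_rows(sample, no_header=args.no_header)
print(names, rows)
```

With this default, the common case (a file exported from a spreadsheet) works without any flag, and only the rare headerless file needs `-H`.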
The comparison to bioawk seems misleading. The main reason for using bioawk is that it can read multiple bioinformatics formats (SAM, BED, GFF, FASTQ, FASTA), whereas here only FASTA and FASTQ seem to be supported.