Tool: filterx: a fast and efficient filter tool written in Rust for CSV, FASTA, FASTQ, GFF, VCF, and SAM files
9 weeks ago
dwpeng ▴ 110

Documentation: https://filterx.dwpeng.com

GitHub: https://github.com/dwpeng/filterx


Features

  • Filter lines by column-based expression
  • Support multiple input formats e.g. vcf/sam/fasta/fastq/gff/bed/csv/tsv
  • Cross-platform support
  • Easy to install
  • Rich documentation

More on the documentation website.


For reference, the prior thread for this tool and the discussion that happened there: A simple and lightweight tool for filtering csv file written in Rust


v0.0.2 now supports all platforms :)

https://github.com/dwpeng/filterx/releases/tag/v0.0.2


filterx v0.2.3 now supports installation using pip.

pip install filterx

A few comments; I will try to be constructive.

  1. Too much empty space, too many logos, etc. The site takes too long to get going. The main homepage has no information; one has to click through even to see the first example, and you lose readers right there. Then, when someone clicks the quickstart, it is again far too sparse: I have to scroll halfway down the page to find the first example. I don't care about installation if I don't yet know that I want to use the tool. Put some examples on the first page and show the tool off so that people want to use it.
  2. I have developed tools that work with CSV in the past, and my biggest regret was that the defaults did not automatically deal with headers. CSV files have headers; it is very rare that they don't. The utility of CSV is that you export from Excel, and Excel sheets always have headers. I would recommend that the defaults assume headers and that the -H flag turn headers off.

  3. The comparison to bioawk seems misleading. The main reason for using bioawk is that it can read multiple bioinformatics formats (SAM, BED, GFF, FASTQ, FASTA); here only FASTA and FASTQ seem to be supported.
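
To make the header suggestion in the second point concrete, here is a minimal Python sketch of a filter whose default assumes a header row, with an explicit switch to turn that off. It is not filterx code: `filter_rows`, `keep`, and `has_header` are hypothetical names chosen for illustration, and the sample data is made up.

```python
import csv
import io


def filter_rows(text, keep, has_header=True):
    """Return the rows of CSV `text` for which keep(row) is true.

    Hypothetical helper, not part of filterx. With has_header=True (the
    default) the first line names the columns and each row is a dict;
    with has_header=False rows are plain lists addressed by position.
    """
    handle = io.StringIO(text)
    reader = csv.DictReader(handle) if has_header else csv.reader(handle)
    return [row for row in reader if keep(row)]


# Default: the header is consumed and columns are addressed by name.
by_name = filter_rows("gene,score\nBRCA1,12\nTP53,3\n",
                      lambda r: int(r["score"]) > 5)

# Opt-out: no header line, columns are addressed by index.
by_position = filter_rows("BRCA1,12\nTP53,3\n",
                          lambda r: int(r[1]) > 5,
                          has_header=False)

print(by_name)
print(by_position)
```

With this default, a file exported from a spreadsheet works without extra flags, and only the rarer headerless file needs the opt-out.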

5 weeks ago
dwpeng ▴ 110

filterx v0.2.11 now has more builtin functions.

Installation

Pip

pip install filterx --upgrade

Cargo

cargo install filterx

GitHub Release

Download from the releases page.

Use filterx info --list to list all builtin functions.

 1 del
 2 alias
 3 cast
 4 dup
 5 fill
 6 is_null
 7 rename
 8 select
 9 print
10 sort
11 col
12 drop_null
13 header
14 len
15 upper
16 lower
17 slice
18 replace
19 strip
20 rev
21 width
22 trim
23 gc
24 revcomp
25 to_fasta
26 to_fastq
27 qual
28 phred
29 hpc
30 abs
31 head
32 limit
33 tail

Use filterx info <function_name> to get more information about a function.
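
The list gives only the function names. As a rough illustration of what sequence helpers with names like gc, revcomp, and hpc conventionally compute, here is a small Python sketch. It is not filterx's implementation: the exact behaviour of the built-ins (for example, whether gc returns a fraction or a percentage) is an assumption, and hpc is assumed here to mean homopolymer compression.

```python
# Illustrative only: conventional definitions, not filterx's own code.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def gc(seq):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    upper = seq.upper()
    return (upper.count("G") + upper.count("C")) / len(upper)


def revcomp(seq):
    """Reverse complement: complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def hpc(seq):
    """Homopolymer compression: collapse each run of identical bases to one."""
    out = []
    for base in seq:
        if not out or out[-1] != base:
            out.append(base)
    return "".join(out)


print(gc("ATGC"))
print(revcomp("ATGC"))
print(hpc("AAACCGTT"))
```

Running filterx info on the corresponding function name remains the authoritative way to check how each built-in actually behaves.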