I would use a combination of tools: seqtk, awk and cut. The tools can be chained with pipes, so that the output of one tool becomes the input of the next.
Code:
# Step 1: make a list of sequence names that are within the length range
seqtk comp my.fastq | awk '{ if (($2 >= 26) && ($2 <= 32)) { print} }' | cut --fields 1 > selected-sequences-names.list
# Step 2: subset the fastq file for those sequences only
seqtk subseq my.fastq selected-sequences-names.list > subsetted.fq
Explanations:
seqtk comp my.fastq
will print a summary for each sequence found in the fastq file. The first column is the name of the sequence, the second column is the length of the sequence. See an example below:
seqtk comp my.fastq | head
ABCD-0123:987:GW1805231090:1:1101:13250:1696 27 52 14 21 63 0 0 0 4 0
ABCD-0123:987:GW1805231090:1:1101:20740:1766 27 46 19 21 64 0 0 0 10 0
ABCD-0123:987:GW1805231090:1:1101:16691:3004 56 44 14 22 10 0 0 0 3 0
awk '{ if (($2 >= 26) && ($2 <= 32)) { print} }'
will print only the rows in which the second column (here coded $2) has a value greater than or equal to 26 ($2 >= 26) and less than or equal to 32 ($2 <= 32). See an example below:
seqtk comp my.fastq | awk '{ if (($2 >= 26) && ($2 <= 32)) { print} }' | head
ABCD-0123:987:GW1805231090:1:1101:13250:1696 27 52 14 21 63 0 0 0 4 0
ABCD-0123:987:GW1805231090:1:1101:20740:1766 27 46 19 21 64 0 0 0 10 0
ABCD-0123:987:GW1805231090:1:1101:15595:1784 27 51 27 34 38 0 0 0 12 0
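As a side note, awk also accepts a bare condition with no action, in which case it prints every matching row. The sketch below shows this shorter form on a small made-up table (the file names comp-example.tsv and in-range.tsv and the read names are hypothetical, standing in for the first two columns of the seqtk comp output):

```shell
# Made-up two-column table (name, length), tab-separated like the seqtk comp output.
printf 'read1\t27\nread2\t56\nread3\t32\nread4\t25\n' > comp-example.tsv

# A condition without an action prints the whole row when the condition is true,
# which is equivalent to the longer '{ if (...) { print } }' form.
awk '$2 >= 26 && $2 <= 32' comp-example.tsv > in-range.tsv

cat in-range.tsv
```

Only read1 (27) and read3 (32) fall inside the range, so only those two rows are kept.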
cut --fields 1
will print the names of those sequences. They are found in the first column (here coded --fields 1). See an example below:
seqtk comp my.fastq | awk '{ if (($2 >= 26) && ($2 <= 32)) { print} }' | cut --fields 1 | head
ABCD-0123:987:GW1805231090:1:1101:13250:1696
ABCD-0123:987:GW1805231090:1:1101:20740:1766
ABCD-0123:987:GW1805231090:1:1101:15595:1784
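If you prefer, awk can print the first column by itself, which makes the cut step optional. Also note that --fields is the GNU long spelling; the short form -f 1 is the one defined by POSIX and should work with other cut implementations too. A minimal sketch on made-up data (the file names and read names below are hypothetical):

```shell
# Made-up table standing in for the seqtk comp output (name, length, one extra column).
printf 'read1\t27\t5\nread2\t56\t9\nread3\t32\t7\n' > comp-example2.tsv

# Variant 1: filter and print the name column in a single awk call.
awk '$2 >= 26 && $2 <= 32 { print $1 }' comp-example2.tsv > names-awk.list

# Variant 2: filter with awk, then take column 1 with the short cut option.
awk '$2 >= 26 && $2 <= 32' comp-example2.tsv | cut -f 1 > names-cut.list

# Both variants should give the same list of names.
cmp names-awk.list names-cut.list && echo "identical"
```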
To keep the names of those sequences, the single greater-than sign (>) at the end of the command redirects the output (here, the names of the sequences) into a new file (here, called selected-sequences-names.list):
seqtk comp my.fastq | awk '{ if (($2 >= 26) && ($2 <= 32)) { print} }' | cut --fields 1 > selected-sequences-names.list
head selected-sequences-names.list
ABCD-0123:987:GW1805231090:1:1101:13250:1696
ABCD-0123:987:GW1805231090:1:1101:20740:1766
ABCD-0123:987:GW1805231090:1:1101:15595:1784
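One thing to keep in mind about the single greater-than sign: it overwrites the target file if it already exists, whereas a double greater-than sign (>>) appends to it. A small illustration with a throwaway file (redirect-demo.list is a hypothetical name):

```shell
# '>' creates the file, or truncates it if it already exists.
printf 'first\n' > redirect-demo.list
printf 'second\n' > redirect-demo.list   # replaces the previous content

# '>>' appends to the existing content instead of replacing it.
printf 'third\n' >> redirect-demo.list

cat redirect-demo.list
```

So re-running the pipeline with > replaces an old list of names rather than adding to it.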
To actually subset the original fastq file for only those sequences, one command from seqtk is needed, and the new file (subsetted.fq) contains just what you wanted.
seqtk subseq my.fastq selected-sequences-names.list > subsetted.fq
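To double-check the result, you can look at the shortest and longest read in a fastq file with awk. This is only a sketch: it assumes the usual layout of exactly four lines per record (sequences not wrapped over several lines), and the file example.fq with its three made-up reads exists only to make the snippet runnable on its own.

```shell
# Build a tiny made-up FASTQ with three reads of length 27, 56 and 32.
for len in 27 56 32; do
  seq_line=$(head -c "$len" /dev/zero | tr '\0' 'A')
  qual_line=$(head -c "$len" /dev/zero | tr '\0' 'I')
  printf '@read_%s\n%s\n+\n%s\n' "$len" "$seq_line" "$qual_line"
done > example.fq

# Line 2 of every 4-line record is the sequence; track the min and max length.
awk 'NR % 4 == 2 {
       n = length($0)
       if (min == "" || n < min) min = n
       if (n > max) max = n
     }
     END { print min, max }' example.fq > range.txt

cat range.txt
```

On the unfiltered example this reports 27 and 56. Running the same awk command on subsetted.fq should report two values that both lie between 26 and 32.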
Which post?
Filtering Fastq Sequences Based On Lengths
This one. I didn't try the biopieces option, though, because of issues I had with one required library.