Help required to write a one liner to gunzip (and retain gzipped files) for multiple files.
8.0 years ago

Hi,

I am trying to learn to write one-liners using shell/awk. To begin with, I want to perform the following operation using a one-liner.

Scenario: A directory containing multiple fastq files with the extension .fq.gz.

Objective: To automate gunzipping all the files while retaining the gzipped files, i.e. my directory will have both *.fq.gz and *.fq files.

What I tried so far

I broke the problem into smaller pieces like this:

  1. find the files with extension *.fq.gz
  2. loop over the files
  3. pass files one by one to gunzip command
  4. use the option -c to retain the *.fq.gz files
  5. the output files should have extension *.fq, so I split the file name by a "." to have 3 parts from the original file name i.e.

original file

test.fq.gz

after splitting

test [part 1]
fq [part 2]
gz [part 3]

then take part 1 and concatenate ".fq". Finally I came up with the one-liner below:

for i in find -name  "*.fq.gz"; do gunzip -c $i > awk '{split($i,a,"."); print a[1] ".fq"}' ; done

However, it is not working. Here is the error:

gunzip: find.gz: No such file or directory
gunzip: {split($i,a,"."); print a[2] ".fq"}.gz: No such file or directory
gunzip: invalid option -- e

PS: I googled a lot without much success. This may be a very easy task, but I am learning to automate using one-liners.

fastq gunzip • 3.6k views
8.0 years ago
dyollluap ▴ 310

Try

for file in *.gz; do gunzip -k "$file"; done

You shouldn't need awk for this. By default gunzip names the uncompressed file after the .gz file, minus the .gz suffix. I think you want the gunzip -k flag to keep the original .fq.gz (see man gunzip).
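
For what it's worth, the original attempt seems to fail for two reasons: without $( ... ) the loop iterates over the literal words find, -name and *.fq.gz, and "> awk" redirects into a file called awk while the quoted awk program is handed to gunzip as one more file name. Below is a minimal sketch of the same idea done with command substitution. The scratch directory and test.fq are made up for the demo, and awk's sub() strips the trailing .gz rather than splitting on ".".

```shell
# Scratch directory with one hypothetical input file (assumption for the demo).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '@read1\nACGT\n+\nIIII\n' > test.fq
gzip test.fq                         # leaves only test.fq.gz

for i in *.fq.gz; do
    # $( ... ) runs awk and captures its output; sub() drops the trailing ".gz".
    out=$(printf '%s\n' "$i" | awk '{ sub(/\.gz$/, ""); print }')
    gunzip -c "$i" > "$out"          # -c writes to stdout, so "$i" is kept
done

ls                                   # lists test.fq and test.fq.gz
```

Stripping the suffix avoids a pitfall of splitting on ".": paths printed by find start with "./", so the first field would be empty.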


Oh .. I forgot -k.


No -k option. Do I have an older version?

for file in *.gz; do gunzip -k $file ; done
gunzip: invalid option -- k
gunzip 1.3.5
(2002-09-30)
usage: gunzip [-cdfhlLnNrtvV19] [-S suffix] [file ...]
 -c --stdout      write on standard output, keep original files unchanged
 -d --decompress  decompress
 -f --force       force overwrite of output file and compress links
 -h --help        give this help
 -l --list        list compressed file contents
 -L --license     display software license
 -n --no-name     do not save or restore the original name and time stamp
 -N --name        save or restore the original name and time stamp
 -q --quiet       suppress all warnings
 -r --recursive   operate recursively on directories
 -S .suf  --suffix .suf     use suffix .suf on compressed files
 -t --test        test compressed file integrity
 -v --verbose     verbose mode
 -V --version     display version number
 -1 --fast        compress faster
 -9 --best        compress better
    --rsyncable   Make rsync-friendly archive
 file...          files to (de)compress. If none given, use standard input.
Report bugs to <bug-gzip@gnu.org>.
gunzip: invalid option -- k
gunzip 1.3.5
(2002-09-30)
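
The usage text above lists no -k, so this gunzip 1.3.5 predates the option (as far as I know it arrived in GNU gzip 1.6). A rough sketch of a fallback, assuming the help text of newer releases mentions --keep: use -k when it is advertised, otherwise decompress to stdout with -c and redirect. The scratch directory and test.fq are hypothetical.

```shell
# Scratch directory with one hypothetical input file (assumption for the demo).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '@read1\nACGT\n+\nIIII\n' > test.fq
gzip test.fq                         # leaves only test.fq.gz

# Look for --keep in the help text; old releases such as 1.3.5 do not list it.
if gunzip --help 2>&1 | grep -q -e '--keep'; then
    gunzip -k *.fq.gz
else
    for f in *.fq.gz; do
        # -c leaves the .gz untouched; ${f%.gz} drops the trailing ".gz".
        gunzip -c "$f" > "${f%.gz}"
    done
fi
```

Either branch should leave both test.fq and test.fq.gz in the directory.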

If you use a gunzip with the -k option, why not simply do gunzip -k *.fq.gz?

8.0 years ago

Without find:

 for f in *.fq.gz; do gzip -d -c $f > ${f/\.gz/}; done

With find:

for f in $( find ./ -name "*.fq.gz" ); do gzip -d -c $f > ${f/\.gz/}; done

I tried find -exec, but it failed:

find ./ -name "*.fq.gz" -exec gzip -d -c {} > ${"{}"/\.gz/} \;
bash: ${"{}"/\.gz/}: bad substitution

Ref: Advanced Bash-Scripting Guide:: Manipulating Strings
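
The find -exec attempt most likely fails because the redirection and the ${...} expansion are handled by the calling shell before find ever runs, and {} is not a shell variable. One common workaround, sketched here under the assumption that a POSIX sh is available, is to let find start a small inner shell per file and pass the path as $1. The scratch directory, run1 and test.fq are hypothetical names.

```shell
# Scratch tree with one hypothetical input file (assumption for the demo).
workdir=$(mktemp -d)
mkdir -p "$workdir/run1"
printf '@read1\nACGT\n+\nIIII\n' > "$workdir/run1/test.fq"
gzip "$workdir/run1/test.fq"         # leaves only run1/test.fq.gz

# find substitutes {} for each path, which the inner sh receives as $1;
# the redirection and ${1%.gz} are then evaluated per file by that inner shell.
find "$workdir" -name '*.fq.gz' -exec sh -c 'gzip -d -c "$1" > "${1%.gz}"' sh {} \;
```

The second "sh" only fills $0 of the inner shell, so the file path lands in $1.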


Can you please explain

 for f in *.fq.gz; do gzip -d -c $f > ${f/\.gz/}; done
  1. How is gzip helping here instead of gunzip?
  2. What is the ${f/\.gz/} part?

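gzip can both compress and decompress; gzip -d does the same job as gunzip.

${f/\.gz/} is string replacement in the shell: it removes the first ".gz" from the value of f. See the reference above.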
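
A small sketch of the two expansions, assuming bash (the ${var/pattern/} form is a bash extension, while ${var%pattern} is POSIX). The file names are made up for illustration.

```shell
f=test.fq.gz
first=${f/\.gz/}                 # bash: delete the first match of ".gz"
suffix=${f%.gz}                  # POSIX: delete a trailing ".gz"
echo "$first $suffix"            # test.fq test.fq

g=my.gz.sample.fq.gz             # hypothetical name containing ".gz" twice
echo "${g/\.gz/}"                # my.sample.fq.gz
echo "${g%.gz}"                  # my.gz.sample.fq
```

Both give the same result for ordinary names, but ${f%.gz} only ever touches the end of the name, which is usually what is wanted here.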

8.0 years ago

Let me throw in a GNU parallel solution, which isn't so different from the other solutions:

ls *.gz | parallel 'gunzip -k {}'