I need to merge two fastq.gz files. I could decompress them and run "cat" on the results, but is there a faster way? Can I use
gzcat file1.fastq.gz file2.fastq.gz | gzip > merged.fastq.gz
?
See Stack Overflow: Fast Concatenation of Multiple GZip Files
A gzip file consists of a series of "members" (compressed data sets). [...] The members simply appear one after another in the file, with no additional information before, between, or after them.
cat file1.gz file2.gz file3.gz > allfiles.gz
This worked well for me. Note that you can confirm that this works in most cases by doing something like the following:
cat file1.gz file2.gz file3.gz > allfiles-cat.gz
zcat file1.gz file2.gz file3.gz | gzip -c > allfiles-zcat.gz
zcat allfiles-cat.gz | md5sum
zcat allfiles-zcat.gz | md5sum
The resulting hash/message digests should be identical.
My experience is that the zcat method is around 40x slower, but the cat method's resulting file is a few percent bigger, depending on the gzip parameters used in each method.
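For anyone who wants to try the check above without touching real data, here is a minimal sketch. It assumes a POSIX shell with gzip, cmp and mktemp available; the two tiny FASTQ files and all file names are made up for the demonstration. It uses gzip -dc in place of zcat, since zcat behaves differently on some systems, and compares the decompressed output directly with cmp instead of md5sum.

```shell
#!/bin/sh
# Hypothetical scratch directory and toy inputs, created only for this demo.
workdir=$(mktemp -d)
printf '@read1\nACGT\n+\nIIII\n' | gzip -c > "$workdir/file1.fastq.gz"
printf '@read2\nTTGA\n+\nIIII\n' | gzip -c > "$workdir/file2.fastq.gz"

# Method 1: append the gzip members byte for byte (no recompression).
cat "$workdir/file1.fastq.gz" "$workdir/file2.fastq.gz" > "$workdir/merged-cat.fastq.gz"

# Method 2: decompress both inputs and recompress them as a single member.
gzip -dc "$workdir/file1.fastq.gz" "$workdir/file2.fastq.gz" \
    | gzip -c > "$workdir/merged-recompressed.fastq.gz"

# Decompress both results and compare the contents.
gzip -dc "$workdir/merged-cat.fastq.gz" > "$workdir/from-cat.fastq"
gzip -dc "$workdir/merged-recompressed.fastq.gz" > "$workdir/from-recompressed.fastq"

if cmp -s "$workdir/from-cat.fastq" "$workdir/from-recompressed.fastq"; then
    status=identical
else
    status=different
fi
echo "$status"

# Each toy file holds one 4-line record, so the merged file should have 8 lines.
lines=$(wc -l < "$workdir/from-cat.fastq" | tr -d ' ')

rm -rf "$workdir"
```

If the two methods agree, the script prints "identical". The compressed files themselves will generally differ in size and bytes; only the decompressed content is expected to match.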
Give this a try (note that it interleaves two paired-end files rather than appending one after the other):
seqtk mergepe