17 months ago
Divon
I am excited to announce the launch of our new version of Genozip - Genozip 15 - a genomic compressor for FASTQ, BAM, VCF and many other genomic formats.
The key new capability in version 15 is our patent-pending method of losslessly co-compressing BAM files with their related FASTQ files, which we call Genozip Deep. By exploiting the inherent information redundancies between BAM and FASTQ, we are able to roughly double the compression gains (the benchmark uses this dataset).
Genozip is a commercial product, but as always, we are happy to provide it free for academic use.
More details: www.genozip.com
When a BAM file is compressed and then decompressed with Genozip, will its md5sum be identical to that of the original file?
Yes, and you can also have genozip calculate and display the MD5 for you with the command line option --md5.
Note that a .bam file is compressed internally with BGZF (a gzip-compatible format), so the MD5 that is calculated is that of the actual data after gzip decompression, not of the .bam file as stored on disk. This is the value you get from: zcat file.bam | md5sum
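
If you would rather check this from a script than from the shell, here is a minimal Python sketch of the same idea as the zcat pipeline: hash the decompressed contents of the BAM rather than the file on disk. It relies on Python's standard gzip module reading BGZF as a series of concatenated gzip members. The function name md5_of_decompressed and the chunk size are my own choices for illustration, not part of Genozip.

```python
import gzip
import hashlib


def md5_of_decompressed(path, chunk_size=1 << 20):
    """Return the hex MD5 of the decompressed contents of a gzip/BGZF file.

    Hypothetical helper for illustration; intended to mirror
    `zcat file.bam | md5sum`.
    """
    digest = hashlib.md5()
    # gzip.open reads through all concatenated gzip members,
    # which is how BGZF blocks are laid out.
    with gzip.open(path, "rb") as stream:
        # Read in chunks so a large BAM is never held in memory at once.
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling md5_of_decompressed("file.bam") on the original and on the file restored by Genozip should give the same digest if the round trip is lossless, even if the two .bam files differ at the byte level because of different BGZF compression settings.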