What are the differences between GATK's reducebam and CRAM?
Are these essentially two different implementations towards the same goal?
Reads in a reduced BAM are made up of the differences between the original reads and the reference sequence used for alignment, whereas the CRAM format preserves the placement and bases of the original reads. In other words, a lossless CRAM can be restored to the original BAM. The reduced BAM approach lets you analyze more samples at once than regular BAM does, but as far as I know the data cannot be re-aligned, so it is reference biased.
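To make "stored as differences from the reference" and "can be restored" concrete, here is a toy Python sketch. It is not the actual on-disk layout of either format. The function names `encode_read` and `decode_read` are made up for illustration, and it only handles substitutions (no indels, clipping, or quality scores).

```python
def encode_read(reference, pos, read):
    """Store a read as (start position, length, mismatches vs. the reference)."""
    diffs = [
        (offset, base)
        for offset, base in enumerate(read)
        if reference[pos + offset] != base
    ]
    return pos, len(read), diffs


def decode_read(reference, record):
    """Rebuild the original read from the reference plus the stored mismatches."""
    pos, length, diffs = record
    bases = list(reference[pos:pos + length])
    for offset, base in diffs:
        bases[offset] = base
    return "".join(bases)


# Hypothetical data: a short reference and one read aligned at 0-based position 2.
reference = "ACGTACGTACGT"
read = "GTAGGT"  # differs from reference[2:8] ("GTACGT") at one base

record = encode_read(reference, 2, read)
restored = decode_read(reference, record)
```

As long as the same reference is available when decoding, the round trip gives back exactly the read that went in, which is what lossless means here. Once reads are merged or dropped, as in a reduced file, there is no such way back.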
To use CRAM files you need dedicated software: the Java implementation at https://github.com/enasequence/cramtools or the C implementation by James Bonfield of the Sanger Institute at http://staden.svn.sourceforge.net/
There is an effort to make CRAM files compatible with existing tools through a common CRAM-BAM API. For example, Picard tools should work on both CRAM and BAM files, provided the cramtools jar is used.
The GATK produces a file that is still a BAM.
CRAM is a different format, and your tools (samtools) might not be able to handle it.
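If you are not sure which of the two a given file is, you can check the leading bytes. A CRAM file starts with the ASCII magic `CRAM`, while a BAM file is BGZF (gzip-compatible) and its decompressed content starts with `BAM\1`. Below is a minimal Python sketch of that check. The function name `sniff_alignment_format` is my own, and the in-memory "files" are fakes that carry only the magic bytes, not valid alignments.

```python
import gzip
import io


def sniff_alignment_format(stream):
    """Return 'CRAM', 'BAM', or 'unknown' for a seekable binary stream at byte 0."""
    head = stream.read(4)
    if head == b"CRAM":
        return "CRAM"
    if head[:2] == b"\x1f\x8b":  # gzip/BGZF magic, so look inside
        stream.seek(0)
        try:
            with gzip.GzipFile(fileobj=stream) as gz:
                if gz.read(4) == b"BAM\x01":
                    return "BAM"
        except (OSError, EOFError):
            pass  # not a readable gzip stream after all
    return "unknown"


# Fake inputs for demonstration only.
fake_cram = io.BytesIO(b"CRAM\x03\x00" + b"\x00" * 20)
fake_bam = io.BytesIO(gzip.compress(b"BAM\x01" + b"\x00" * 8))
fake_text = io.BytesIO(b"@HD\tVN:1.6\n")

cram_result = sniff_alignment_format(fake_cram)
bam_result = sniff_alignment_format(fake_bam)
text_result = sniff_alignment_format(fake_text)
```

For a real file, open it in binary mode first, e.g. `with open(path, "rb") as fh: sniff_alignment_format(fh)`.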