Size of BAM file reduces after sorting with samtools
9.4 years ago

I have 3 BAM files from the same species, each ~7 GB, from three experimental runs. I merged them into a single 22 GB BAM file using samtools merge with the -r option. Then I sorted the merged file with samtools sort and got an 11 GB BAM file. Is it possible for sorting to reduce the size of the merged BAM file by 50%?

genome sequence next-gen samtools sort • 11k views

Yes


You can run samtools flagstat on each .bam file to check that the read counts etc. match across the different files.

9.4 years ago

When you sort by coordinate, you bring reads with similar sequences next to each other, which gives the compression algorithm more redundancy to exploit. It is worth checking, though, that the number of sequences is what you expect, using samtools flagstat or simply samtools view piped to wc -l.
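
To see the effect described above without touching real data, here is a toy sketch in Python. It is not how samtools works internally: it just uses zlib (the same DEFLATE family that BAM compression relies on) on made-up reads drawn from a random "genome". The genome length, read length, read count and compression level are all arbitrary values chosen for illustration.

```python
import random
import zlib

random.seed(0)  # fixed seed so the toy data is reproducible

# Illustrative values only, not taken from the question.
GENOME_LEN = 500_000
READ_LEN = 100
N_READS = 20_000

# A random "genome" and reads sampled from random start positions,
# mimicking the arbitrary order of reads in an unsorted file.
genome = "".join(random.choice("ACGT") for _ in range(GENOME_LEN))
starts = [random.randrange(GENOME_LEN - READ_LEN) for _ in range(N_READS)]


def compressed_size(read_starts):
    """Compress the reads in the given order and return the size in bytes."""
    data = "\n".join(genome[s:s + READ_LEN] for s in read_starts).encode("ascii")
    return len(zlib.compress(data, 6))


unsorted_size = compressed_size(starts)
sorted_size = compressed_size(sorted(starts))  # "coordinate-sorted" order

print(f"reads:    {len(starts)}")
print(f"unsorted: {unsorted_size} bytes")
print(f"sorted:   {sorted_size} bytes")
```

Both orders contain exactly the same reads, so the uncompressed size is identical; only the ordering differs. In the sorted order, neighbouring reads overlap, so the compressor keeps finding the same stretches of sequence within its window and the output comes out smaller. The exact ratio for a real BAM file depends on coverage, read length, quality strings and tags, so treat the numbers from this sketch as qualitative only.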
