Is a substantial reduction in BAM file size after sorting normal?
6.8 years ago

I am generating BAM files by mapping with Bismark (using bowtie2). This produces an unsorted BAM file, so I sort it with the command:

samtools sort -O bam -T tmp_ -o Name_sorted.bam Unordered.bam

The unsorted BAM file can be 2.6 GB, but the sorted one drops to 1.2 GB. The same happens with every BAM file I sort: they shrink to about half their original size.

Is that normal? Is it caused by a different compression level?

In any case, it makes me somewhat uncomfortable.

BAM samtools • 1.7k views

Use samtools flagstat to check the number of reads in both the unsorted and the sorted file. If they are the same, you're fine and the size difference is due to compression. How did you create the unsorted BAM file?
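As a rough sketch of that check: `samtools flagstat` prints the full per-category report, while `samtools view -c` prints just the record count, which is easier to compare in a script. The helper names `count_reads` and `same_read_count` below are made up for illustration and are not part of samtools.

```shell
#!/bin/sh
# Hypothetical helpers (not part of samtools) that compare record counts.

# count_reads prints the number of alignment records in a BAM file.
count_reads() {
    samtools view -c "$1"
}

# same_read_count succeeds (exit status 0) when both BAM files
# contain the same number of alignment records.
same_read_count() {
    a=$(count_reads "$1") || return 2
    b=$(count_reads "$2") || return 2
    echo "$1: $a records, $2: $b records"
    [ "$a" -eq "$b" ]
}
```

With the file names from the question, this could be called as `same_read_count Unordered.bam Name_sorted.bam && echo "counts match"`. For the detailed breakdown, run `samtools flagstat` on each file and compare the two reports by eye.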


I mentioned that: I am mapping BS-Seq reads with Bismark.


Yes, it is normal. Sorting can save a lot of space.

6.8 years ago

Sorting gives a better compression ratio, because reads that align to the same region end up next to each other and share much of their sequence. There's an old discussion about it here, or a tip of the day here

edit: I should also mention that you'll see a much better compression ratio with RNA-seq than with DNA, as RNA-seq data should contain a lot of repeated sequences.
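The effect can be reproduced without any BAM file. The sketch below is only a toy model: plain text lines and `gzip` stand in for BAM records and BGZF blocks, and the reference length, read length, and read count are arbitrary assumptions. It samples overlapping "reads" from a random reference, then compresses them once in random order and once sorted by position.

```shell
#!/bin/sh
# Toy model only: text lines + gzip stand in for BAM records + BGZF.

# make_reads prints "position<TAB>sequence" lines, in random order,
# sampled from a random 100000-base reference (fixed seed, so both
# calls below see exactly the same reads).
make_reads() {
    awk 'BEGIN {
        srand(42)
        n = 100000; len = 50; reads = 10000
        split("A C G T", base, " ")
        for (i = 1; i <= n; i++) ref[i] = base[int(rand() * 4) + 1]
        for (r = 0; r < reads; r++) {
            pos = int(rand() * (n - len)) + 1
            seq = ""
            for (j = 0; j < len; j++) seq = seq ref[pos + j]
            printf "%06d\t%s\n", pos, seq
        }
    }'
}

# gz_size prints how many bytes gzip produces for its standard input.
gz_size() {
    gzip -c | wc -c | tr -d ' '
}

unsorted=$(make_reads | gz_size)
sorted=$(make_reads | sort | gz_size)
echo "unsorted: $unsorted bytes, sorted by position: $sorted bytes"
```

The sorted stream should come out smaller, because overlapping reads sit next to each other and the compressor can reuse the sequence it has just seen. The exact ratio depends on coverage and read length, so it will not match the 2.6 GB to 1.2 GB drop from the question.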

6.8 years ago

You are right. Both files have the same number of reads.
