Can Zram Be Used To Reduce The Hardware Requirements For De Novo Assembly?
2
1
12.0 years ago
benjwoodcroft ▴ 170

Hi,

I was just wondering whether anyone has had any experience using the Linux kernel module zRAM with assembly (or other tasks)? The basic idea of zram is that it compresses RAM so that more data can be stuffed into the same hardware, without having to resort to the comparatively slow operation of swapping to disc.

Given that the need for lots of RAM limits some assemblies, can zram be used to make assembly faster, or to make it feasible on smaller machines? Are de Bruijn graphs generally compressible? Do de novo assemblers already compress the graph anyway?
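One rough way to get a feel for the compressibility question is to compress sample data page by page, since zram works on individual memory pages. The sketch below is only an approximation under stated assumptions: it uses `zlib` from the Python standard library as a stand-in for the kernel's compressors, it assumes 4 KiB pages, and the helper names `page_compression_ratio` and `pack_2bit` are made up for illustration. It says nothing about how any particular assembler lays out its graph in memory.

```python
import random
import zlib

PAGE_SIZE = 4096  # assumed page size; zram compresses one page at a time


def page_compression_ratio(data: bytes, page_size: int = PAGE_SIZE) -> float:
    """Return original size / compressed size, compressing each page separately."""
    if not data:
        raise ValueError("data must not be empty")
    compressed = 0
    for start in range(0, len(data), page_size):
        page = data[start:start + page_size]
        # Assumption: a page that does not shrink is stored at its raw size.
        compressed += min(len(zlib.compress(page, 1)), len(page))
    return len(data) / compressed


def pack_2bit(seq: str) -> bytes:
    """Pack an A/C/G/T string into 2 bits per base (trailing 1-3 bases dropped)."""
    code = {"A": 0, "C": 1, "G": 2, "T": 3}
    out = bytearray()
    for i in range(0, len(seq) - len(seq) % 4, 4):
        byte = 0
        for ch in seq[i:i + 4]:
            byte = (byte << 2) | code[ch]
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    rng = random.Random(0)
    seq = "".join(rng.choice("ACGT") for _ in range(1 << 18))
    print("ASCII bases :", round(page_compression_ratio(seq.encode()), 2))
    print("2-bit packed:", round(page_compression_ratio(pack_2bit(seq)), 2))
    print("zeroed pages:", round(page_compression_ratio(bytes(1 << 18)), 2))
```

The comparison is the point here: text-like buffers and empty pages shrink a lot, while data that is already packed tightly (for example 2 bits per base) leaves a general-purpose compressor very little to work with.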

It is trivially easy to install, at least if you are running Ubuntu: http://www.webupd8.org/2011/10/increased-performance-in-linux-with.html

There's a more in-depth article here: https://lwn.net/Articles/454795/

Thanks, ben

assembly • 4.1k views
2
12.0 years ago

I think zram is a glorified RAM DISK, where the computer pretends some RAM is actually a disk.

This is not going to remedy the problem of not having enough RAM.

The opposite approach of using an SSD as a RAM drive (where virtual memory spans the entire SSD) has been used with some success.

0

Thanks for the answer. You're right that it pretends some RAM is a disk, but I think that is beside the point. Because the zram device is compressed, and the system swaps to it before the hard disk, the assembler can fit more data into the same RAM.

That doesn't entirely remedy the problem because the graph can't be compressed too much, I guess much less than an order of magnitude. Still, zram has no big negative consequences that I know of, so why not use it?
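Whether the system really swaps to zram before the hard disk depends on swap priority: Linux uses the swap area with the highest priority first. A minimal sketch for checking this from the contents of `/proc/swaps` is below. The `SwapArea` type, the `parse_proc_swaps` helper and the sample device names and sizes are made up for illustration, and the priorities shown are just an example of a setup where zram is preferred.

```python
from typing import List, NamedTuple


class SwapArea(NamedTuple):
    name: str
    size_kib: int
    used_kib: int
    priority: int


def parse_proc_swaps(text: str) -> List[SwapArea]:
    """Parse the contents of /proc/swaps and return areas, highest priority first."""
    areas = []
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 5:
            continue  # ignore blank or malformed lines
        # Columns are: Filename, Type, Size, Used, Priority (sizes in KiB).
        size, used, priority = (int(x) for x in fields[-3:])
        areas.append(SwapArea(fields[0], size, used, priority))
    return sorted(areas, key=lambda a: a.priority, reverse=True)


# Hypothetical sample; on a real machine read the text from /proc/swaps instead.
SAMPLE = """Filename                Type        Size      Used   Priority
/dev/sda5               partition   8388604   0      -2
/dev/zram0              partition   2097148   1024   100
"""

if __name__ == "__main__":
    for area in parse_proc_swaps(SAMPLE):
        print(area.name, area.priority, f"{area.used_kib}/{area.size_kib} KiB used")
```

If the zram device does not come out first in this ordering, the on-disk swap would be filled before it, which defeats the purpose.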

1
12.0 years ago
Ryan Thompson ★ 3.6k

It may help, and I would expect that at the very least it won't hurt. I expect it would depend on what data structures the assembler uses and how it stores them in memory. I highly doubt that any de novo assembler implements its own in-memory data compression.

You should definitely try it out. I run zRAM on all the Linux devices that I control. The main benefit I see is that when a process uses up all of the system's memory, zRAM gives me a "grace period" in which the computer is still responsive enough that I can kill the process before it starts swapping to disk. Without zRAM, it would hit the on-disk swap and become completely unresponsive as soon as RAM fills up.
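If you do try it, the kernel reports how well the swapped-out pages are actually compressing. The sketch below assumes a reasonably recent kernel that exposes `/sys/block/zram0/mm_stat`, whose first three fields are the original data size, the compressed data size and the total memory used by the device, all in bytes. The device name `zram0` and the helper `zram_stats` are assumptions for illustration, and older kernels expose these counters differently.

```python
from pathlib import Path

MM_STAT = Path("/sys/block/zram0/mm_stat")  # assumed device name


def zram_stats(mm_stat_text: str) -> dict:
    """Summarise the first three fields of a zram mm_stat line."""
    fields = [int(x) for x in mm_stat_text.split()]
    if len(fields) < 3:
        raise ValueError("expected at least three fields in mm_stat")
    orig, compr, used = fields[:3]
    return {
        "orig_bytes": orig,        # uncompressed size of the stored data
        "compr_bytes": compr,      # size after compression
        "mem_used_bytes": used,    # RAM actually consumed, including overhead
        # Effective ratio: how much data fits per byte of RAM spent.
        "effective_ratio": orig / used if used else float("nan"),
    }


if __name__ == "__main__":
    if MM_STAT.exists():
        print(zram_stats(MM_STAT.read_text()))
    else:
        print(f"{MM_STAT} not found; is a zram device configured?")
```

Checking this while an assembly is running and under memory pressure would show directly whether the assembler's data is compressing well enough to be worth it.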

1

For SGA and fermi, data are sort of compressed by nature. That is partly why they use less memory. Of course you can compress the data a little further, but not much further and the performance hit would be significant.

0

You would only incur a performance hit if the system needed to swap, and nothing that zRAM can do will be slower than swapping to disk. So I don't think there would ever be a noticeable performance hit relative to not having zRAM enabled, even when it does effectively nothing.

0

Thanks guys. Maybe worth noting that by default, zRAM doesn't compress memory that isn't compressible by 50% or more, though this is tuneable.

0

For de Bruijn graphs, please check the sparse k-mer scheme used in SOAPdenovo2 (recently published in BMC GigaScience). It makes it possible to assemble a human genome with only 60 GB of memory using a DBG. The idea is that a large portion of a genome is unique and can therefore be represented by a compressed data structure.
