easyRNASeq memory limit problem
10.7 years ago
anon ▴ 50

I have a .bam file larger than 7 GB that contains human RNA-Seq data. I used the easyRNASeq package to process it (I only want the read count table), but execution gets stuck because of the memory limit. I don't know how to process such a big .bam file with the easyRNASeq function. Are there any parameter settings that might help?

library(easyRNASeq)

# Hsapiens is assumed to come from an already loaded BSgenome.Hsapiens.* package
count.table <- easyRNASeq(filesDirectory=getwd(),
    filenames=bamfiles,
    organism="Hsapiens",
    chr.sizes=seqlengths(Hsapiens),
    gapped=TRUE,
    annotationMethod="biomaRt",
    format="bam",
    count=c("genes"),
    summarization=c("geneModels"))
write.table(count.table, file="sample_readcount.txt", sep="\t", row.names=FALSE)

Thanks in advance!

bam read-count easyRNASeq bioconductor • 2.2k views
10.7 years ago
Neilfws 49k

The documentation (PDF) for easyRNAseq states that "memory will scale linearly with the number and size of read libraries (e.g. bam files)". So you need roughly as much memory as the total BAM file size, plus a little more.

I don't think there's a fix, other than to work on a machine with enough RAM.

As an aside, it's crazy that it's loading the whole BAM into memory. There's no need to do that, at least for this simple region-counting analysis. Many (hopefully most) of the other tools to do this will process the reads via streaming and have minimal memory requirements. So I'd recommend switching to another tool, like htseq-count or bedtools multicov.
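To illustrate what "process the reads via streaming" means here, the following is a minimal sketch, not how htseq-count or bedtools multicov are actually implemented. It assumes alignments arrive as SAM-formatted text lines (for example piped from `samtools view`) and that regions come from a 4-column BED-like file; the function names `parse_regions` and `count_reads` are hypothetical. Only one read is held in memory at a time, so memory use depends on the number of regions, not on the BAM size. It is deliberately simplified: spliced reads are counted over their whole reference span, and each read is checked against every region on its chromosome.

```python
import re
from collections import Counter

# CIGAR operations that consume reference bases (per the SAM format)
REF_CONSUMING = set("MDN=X")
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_regions(bed_lines):
    """Parse BED-like lines (chrom, start, end, name) into {chrom: [(start, end, name)]}."""
    regions = {}
    for line in bed_lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, start, end, name = line.rstrip("\n").split("\t")[:4]
        regions.setdefault(chrom, []).append((int(start), int(end), name))
    return regions


def count_reads(sam_lines, regions):
    """Count mapped reads overlapping each region, one SAM line at a time."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x4:  # read is unmapped
            continue
        chrom = fields[2]
        start = int(fields[3]) - 1  # SAM POS is 1-based; convert to 0-based
        ref_len = sum(int(n) for n, op in CIGAR_RE.findall(fields[5])
                      if op in REF_CONSUMING) or 1
        end = start + ref_len
        # Simplification: linear scan over the regions of this chromosome
        for r_start, r_end, name in regions.get(chrom, ()):
            if start < r_end and end > r_start:
                counts[name] += 1
    return counts
```

In a script, `count_reads(sys.stdin, regions)` could consume the output of `samtools view sample.bam`, so the full alignment file never needs to be loaded at once.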

That's more of an R issue than an easyRNAseq issue. R has the unfortunate habit of handling alignments by first reading them all into memory (likely in part due to the poor performance of loops).

I just thought that this package (or this function) could handle reading the .bam file in a more sophisticated way. Thanks!

I knew about htseq-count, but our research team wants to try all modules/programs for calculating read counts. I'll check out bedtools multicov, thanks!

Could the BAM file be split up and worked on in smaller pieces?
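Splitting should work in principle because read counts are additive: if every read ends up in exactly one piece (for example one piece per chromosome, extracted from an indexed BAM with something like `samtools view -b sample.bam chr1`), the per-gene counts from each piece can simply be summed. The sketch below shows only that merge step; `merge_counts` is a hypothetical helper, and whether easyRNASeq handles the individual pieces well is an assumption that would need checking.

```python
from collections import Counter


def merge_counts(per_piece_counts):
    """Sum per-gene read counts obtained from separate pieces of one BAM file."""
    total = Counter()
    for piece in per_piece_counts:
        total.update(piece)  # Counter.update adds counts instead of replacing them
    return total
```

Each piece then needs only as much memory as its own reads, at the cost of running the counting step once per piece.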
