How to run limma on large datasets
9 months ago
Bioinfonext ▴ 470

Hi,

I am trying to run an association analysis with the R package limma, but the job keeps failing with an out-of-memory error, even though I have allocated more memory in the Slurm job script. Please suggest how to run it with MPI or in some other way.

Slurm job script details:

#SBATCH --job-name=limma5 # Job name
#SBATCH --mail-type=END,FAIL # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --ntasks=20
#SBATCH --time=30:00:00 # Time limit hrs:min:sec
#SBATCH --partition=k2-himem
#SBATCH --mem-per-cpu=199G

module load apps/R/4.3.0/gcc-9.3.0+lapack-3.9.0+blas-3.8.0
Rscript script.r

script.r:

# Model matrix: variable of interest plus covariates
var1 <- model.matrix(~ variable_of_interest + sex + Age + CD8T + CD4T + NK + Bcell + Mono + smoking, data = targets2)

# Fit the linear model to the M-values
fit1 <- lmFit(mval, var1)

# Empirical Bayes moderation, then extract results for coefficient 2
fit1 <- eBayes(fit1, trend = TRUE, robust = TRUE)
probe1 <- topTable(fit1, adjust.method = "BH", coef = 2, number = Inf)
sig.probe1 <- probe1[which(probe1$adj.P.Val <= 0.05), ]

write.table(sig.probe1, file = "limma.model.significant.output.txt", sep = "\t", quote = TRUE)
write.table(probe1, file = "limma.model.output.txt", sep = "\t", quote = TRUE)

Memory details:

14033752 3052771 k2-himem OUT_OF_ME+ 3 20 smp[106,109,11+ 3980G billing=20,cpu=20,mem=3980G,node=3
14033752.ba+ OUT_OF_ME+ 1 8 smp106 166926320+ cpu=8,mem=1592G,node=1
R HPC biostatistics statistics • 665 views

You should show your R script.


Thanks, I have added the script now. The same script works well for smaller datasets; it only runs out of memory with larger datasets.

5 weeks ago
Gordon Smyth ★ 7.7k

See https://support.bioconductor.org/p/9157285/

After fixing OP's code, the analysis runs in a couple of minutes even on a laptop.
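
The linked thread has the details of the actual fix. Separately from that, and without claiming this is what was wrong in the original script, here is a generic sanity check that may be worth running before lmFit when memory use looks far too high. It is only a sketch in base R on made-up data: `targets_toy`, `design_bad`, `design_ok` and `matrix_gib` are hypothetical names, and the 850,000 x 1,000 matrix size is an illustrative assumption, not the poster's data. It checks how each covariate is stored, shows how a numeric covariate stored as text inflates the design matrix, and estimates the size of a numeric matrix.

```r
# Toy stand-in for a sample sheet such as `targets2`; all values are made up.
set.seed(1)
n <- 200
targets_toy <- data.frame(
  variable_of_interest = rnorm(n),
  sex = factor(sample(c("F", "M"), n, replace = TRUE)),
  # A numeric covariate stored as text by mistake (hypothetical scenario)
  Age = as.character(round(runif(n, 20, 80), 1)),
  stringsAsFactors = FALSE
)

# 1. Check how each covariate is stored. Character or factor columns are
#    expanded into one dummy column per level by model.matrix().
print(sapply(targets_toy, class))

# 2. Compare the design matrix width before and after converting Age.
design_bad <- model.matrix(~ variable_of_interest + sex + Age, data = targets_toy)
targets_toy$Age <- as.numeric(targets_toy$Age)
design_ok <- model.matrix(~ variable_of_interest + sex + Age, data = targets_toy)
cat("design columns with Age as text:   ", ncol(design_bad), "\n")
cat("design columns with Age as numeric:", ncol(design_ok), "\n")

# 3. Rough size of a numeric matrix in GiB (8 bytes per double value).
matrix_gib <- function(n_rows, n_cols) n_rows * n_cols * 8 / 2^30
# Illustrative dimensions only (assumed, not taken from the question)
cat("850,000 x 1,000 matrix:", round(matrix_gib(850000, 1000), 1), "GiB\n")
```

If the design matrix has nearly as many columns as there are samples, or the data matrix itself is only a few GiB, then a job that exhausts hundreds of GiB most likely points to a problem in the script rather than a need for MPI.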
