Question

Error in Heatmap creation with RnBeads

4.5 years ago

Assa Yeroslaviz ★ 1.9k

When running the exploratory data analysis step (rnb.run.exploratory), I get a segfault in the heatmap creation step:

2020-06-02 03:26:26    62.5  STATUS             COMPLETED Agglomerative Hierarchical Clustering
2020-06-02 03:26:29    62.5  STATUS             STARTED Clustering Section
2020-06-02 03:26:36    62.5  STATUS                 STARTED Generating Heatmaps
2020-06-02 03:26:38    62.5  STATUS                     STARTED Region type: sites
Xlib: request 18 length 24 would exceed buffer size.

 *** caught segfault ***
address 0x4, cause 'memory not mapped'

Traceback:
 1: dev.off(dev.copy(device = pdf, file = tmp, width = width, height = height,     pointsize = pointsize, paper = "special", ...))
 2: dev2bitmap(fname, type = "pngalpha", height = .Object@height,     width = .Object@width, method = "pdf", ...)
 3: doTryCatch(return(expr), name, parentenv, handler)
 4: tryCatchOne(expr, names, parentenv, handlers[[1L]])
 5: tryCatchList(expr, names[-nh], parentenv, handlers[-nh])
 6: doTryCatch(return(expr), name, parentenv, handler)
 7: tryCatchOne(tryCatchList(expr, names[-nh], parentenv, handlers[-nh]),     names[nh], parentenv, handlers[[nh]])
 8: tryCatchList(expr, classes, parentenv, handlers)
 9: tryCatch(dev2bitmap(fname, type = "pngalpha", height = .Object@height,     width = .Object@width, method = "pdf", ...), warning = function(e) {    if (grepl(" had status 1$", e$message)) {        doerror(e)    }    else if (logger.isinitialized()) {        logger.warning(e$message)    }    else {        invisible(e$message)    }}, error = doerror)
10: convert.f(fname, res = .Object@low.png)
11: .local(.Object, ...)
12: off(rplot)
13: off(rplot)
14: rnb.section.clustering.add.heatmap(report, X, fname, TRUE, clust.result,     sample.ids, locus.colors.cur, sample.colors)
15: rnb.section.clustering(report, rnb.set, clust.results, rinfo,     clust.edited)
16: rnb.step.clustering.internal(rnb.set, report, rinfos)
17: rnb.run.exploratory(rnb.set = rnb.set, dir.reports = report.dir)

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection:

I have tried several times with different numbers of cores, as I thought this might be the problem, but the crash persists.

The parameters for the run are below.

Any ideas what the problems are in this run?

I have done the same run before using the single command for the complete pipeline, although several of the parameters were not set at that time.

The command was:

rnb.run.analysis(dir.reports = report.dir, sample.sheet = sample_annotations, 
                 data.dir = bed.dir, data.type = "bs.bed.dir", 
                 build.index = TRUE,save.rdata = TRUE, initialize.reports = TRUE)

The parameters I added in the step-by-step analysis are

# filtering step
    filtering.coverage.threshold      = 5,
    filtering.low.coverage.masking    = TRUE,
    filtering.greedycut               = FALSE,
    filtering.missing.value.quantile  = 0.5,
    filtering.high.coverage.outliers  = TRUE,
# analyzed regions
    region.types = c("genes", "promoters", "tiling1kb", "ensembleRegBuildBPall", "tiling200bp", "gencode22promoters", "gencode22genes"),

Can this error be explained? Does it have anything to do with the filtering steps, or with the new regions I added?

Is there a package missing?

Thanks

Assa

Parameters for the run

num.cores <- 12
parallel.setup(num.cores)
rnb.options(
    analysis.name       = "Arno A. Bi-Sulphite Seq data analysis local step-by-step",
    email               = "yeroslaviz@biochem.mpg.de",
    identifiers.column  = "sampleID",
    replicate.id.column = "treatment",
    import.bed.style    = "bismarkCov",
    assembly            = "hg38",
    # Fine-tune color scales and plot themes.
    colors.meth         = c("#EDF8B1", "#41B6C4", "#081D58"),
    colors.category     = c("#1B9E77", "#D95F02", "#7570B3", "#E7298A", "#66A61E",
                            "#E6AB02", "#A6761D", "#666666", "#2166AC", "#B2182B",
                            "#00441B", "#40004B", "#053061"),
    qc.coverage.plots      = TRUE,
    qc.coverage.histograms = TRUE,
    qc.coverage.violins    = TRUE,
    # Filtering step
    filtering.coverage.threshold     = 5,
    filtering.low.coverage.masking   = TRUE,
    filtering.greedycut              = FALSE,
    filtering.missing.value.quantile = 0.5,
    filtering.high.coverage.outliers = TRUE,
    # Surrogate variable analysis (covariates)
    inference = TRUE,
    inference.targets.sva = "treatment", # Column names in the sample annotation table for which surrogate variable analysis (SVA) should be conducted.
    inference.sva.num.method = "be", # Function used to estimate the number of surrogate variables.
    differential.comparison.columns = c("WT_GID4", "WT_MAEA", "GID4_MAEA"), # Column names in the sample annotation table to be used for group definition in the differential methylation analysis.
    differential.comparison.columns.all.pairwise = c("WT_GID4", "WT_MAEA", "GID4_MAEA"), # Column names in the sample annotation table to be used for group definition in the differential methylation analysis in which all pairwise comparisons between groups should be conducted (the default is "one vs all" if multiple groups are specified in a column).
    region.types = c("genes", "promoters", "tiling1kb", "ensembleRegBuildBPall", "tiling200bp", "gencode22promoters", "gencode22genes"), # Region types to carry out analysis on (this would remove the analysis for sites; done only to shorten the process).
    differential.site.test.method = "limma", # Method to be used for calculating p-values on the site level.
    differential.variability = TRUE, # With this analysis, the variance inside each group is used to detect differences among them.
    differential.variability.method = "diffVar",
    differential.enrichment.go = TRUE, # Whether Gene Ontology (GO) enrichment analysis is to be conducted on the identified differentially methylated regions.
    differential.enrichment.lola = TRUE, # Whether LOLA enrichment analysis is to be conducted on the identified differentially methylated regions.
    differential.enrichment.lola.dbs = c("${LOLACore}"), # Database for LOLA enrichment analysis
    export.to.trackhub = c("bigBed", "bigWig"), # Create tracks and hub structure
    export.to.csv = TRUE,
    # Several parameters for better memory management and parallel processing
    disk.dump.big.matrices = TRUE,
    disk.dump.bigff = TRUE,
    logging.disk = TRUE,
    enforce.memory.management = TRUE
)
theme_set(theme_bw())

BS-Seq RnBeads • 1.6k views

ADD COMMENT • link updated 4.5 years ago by mscherer ▴ 50 • written 4.5 years ago by Assa Yeroslaviz ★ 1.9k

score 2 · Accepted Answer · 2020-06-02

4.5 years ago

mscherer ▴ 50

Hi!

The most likely cause of the crash is a memory issue, i.e. RnBeads does not have sufficient main memory to perform the computations. This can occur especially in the clustering/heatmap part, since the heatmaps can be quite big. You have two potential solutions:

1. Try it in a fresh R session. If this does not work, try to get access to a bigger machine with more main memory available.
2. Deactivate the clustering functionality using exploratory.clustering="none".
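
The second option could look roughly like the sketch below. It assumes that rnb.set and report.dir are the same objects used in the rnb.run.exploratory call from the question's traceback, and that the option is set before the module is started.

```r
library(RnBeads)

# Skip the clustering section (and therefore its heatmaps) in the
# exploratory report. All other options keep their current values.
rnb.options(exploratory.clustering = "none")

# rnb.set and report.dir are taken from the question; adjust as needed.
rnb.run.exploratory(rnb.set = rnb.set, dir.reports = report.dir)
```

The rest of the exploratory report is still generated; only the clustering section is left out.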

Regarding the installation question you raised, we always recommend using the installation script that we provide at https://rnbeads.org/installation.html to install RnBeads on your machine. It takes care of installing all the dependencies.

Hope that helps.

ADD COMMENT • link 4.5 years ago by mscherer ▴ 50

Thanks for the fast response.

I don't think this is a RAM problem:

$free -h
              total        used        free      shared  buff/cache   available
Mem:           1.0T         23G        325G        3.0G        659G        980G
Swap:            0B          0B          0B

I'll try with a new session, and if this does not work, I'll deactivate the heatmaps. But the problem is that it did work when running the previous command, rnb.run.analysis. Why is it different now?

ADD REPLY • link 4.5 years ago by Assa Yeroslaviz ★ 1.9k

Hi again,

I'm not sure, but it looks strange to me. When I run the separate commands for the clustering it works fine.

> clusterings.sites <- rnb.execute.clustering(rnb.set, region.type="sites") 
2020-06-02 15:24:12    12.1  STATUS Performed clustering on sites using correlation as a distance metric
2020-06-02 15:24:15    12.1  STATUS Performed clustering on sites using manhattan as a distance metric
2020-06-02 15:24:19    12.1  STATUS Performed clustering on sites using euclidean as a distance metric
> clusterings.promoters <-rnb.execute.clustering(rnb.set, region.type="promoters")
...
> clusterings.promoters <-rnb.execute.clustering(rnb.set, region.type="genes")
...
> clusterings.promoters <-rnb.execute.clustering(rnb.set, region.type="tiling1kb")
...
> clusterings.promoters <-rnb.execute.clustering(rnb.set, region.type="tiling200bp")
...
> clusterings.promoters <-rnb.execute.clustering(rnb.set, region.type="gencode22genes")
...
> clusterings.promoters <-rnb.execute.clustering(rnb.set, region.type="ensembleRegBuildBPall")
...
> clusterings.promoters <-rnb.execute.clustering(rnb.set, region.type="gencode22promoters")
...

If I understand the log files correctly, this is the step that caused the problems before, isn't it?
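
In the transcript above, every result after the first is assigned to clusterings.promoters, so each call overwrites the previous one. A minimal sketch for keeping one result per region type follows; run_clusterings is a hypothetical helper (not part of RnBeads), and the clustering function is passed in as an argument so that the sketch also runs without the package.

```r
# Region types from the run; "sites" is added for the site-level clustering.
region.types <- c("sites", "promoters", "genes", "tiling1kb", "tiling200bp")

# Hypothetical helper: calls cluster.fun once per region type and returns
# a list named by region type, so no result is overwritten.
run_clusterings <- function(rnb.set, region.types, cluster.fun) {
  results <- lapply(region.types, function(rt) {
    cluster.fun(rnb.set, region.type = rt)
  })
  names(results) <- region.types
  results
}

# With RnBeads loaded and an RnBSet at hand, the call would be:
# clusterings <- run_clusterings(rnb.set, region.types, rnb.execute.clustering)
# clusterings[["promoters"]]
```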

ADD REPLY • link 4.5 years ago by Assa Yeroslaviz ★ 1.9k

Hi. The issue is probably in the plotting itself, not in the computations. The execute commands involve no plotting, while the run commands do. Plotting heatmaps can be a challenging task on its own when there are many CpGs/regions to be plotted. I guess this is what causes the error.

ADD REPLY • link 4.5 years ago by mscherer ▴ 50

So let's say I would like to have the heatmaps. What command do I use? Or can I only do it within R?

ADD REPLY • link 4.5 years ago by Assa Yeroslaviz ★ 1.9k

Good morning. Another thing I can't understand is that the run completed without errors when running the whole pipeline with the rnb.run.analysis command, so why does it cause problems now?

ADD REPLY • link 4.5 years ago by Assa Yeroslaviz ★ 1.9k

Hi! To your questions:

1. For the heatmaps, you can use a combination of RnBeads' functions (e.g. meth(rnb.set)) and a heatmap plotting function such as pheatmap, ComplexHeatmap, or heatmap.2.
2. Why it runs for the rnb.run.analysis command but not for the exploratory module individually, I don't know. I don't even have a clue what the issue might be, except for some unexpected memory issues. I would need to have the dataset at hand to reproduce the error in order to solve it.
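
As a rough sketch of the first point: extract a methylation matrix and draw the heatmap directly into a PDF file. To keep the example runnable without RnBeads, a small random matrix stands in for the real data, and base R's heatmap() stands in for pheatmap, ComplexHeatmap, or heatmap.2. The file name and matrix dimensions are made up for illustration.

```r
# Stand-in for a real methylation matrix. With RnBeads loaded, the matrix
# would instead come from something like:
#   meth.mat <- meth(rnb.set, type = "promoters", row.names = TRUE)
set.seed(42)
meth.mat <- matrix(runif(50 * 6), nrow = 50,
                   dimnames = list(paste0("region", 1:50),
                                   paste0("sample", 1:6)))

# Write straight to a file device instead of copying from a screen device.
out.file <- file.path(tempdir(), "promoter_heatmap.pdf")
pdf(out.file, width = 7, height = 9)
heatmap(meth.mat,
        scale = "none",  # methylation values are already on a 0-1 scale
        col = colorRampPalette(c("#EDF8B1", "#41B6C4", "#081D58"))(100),
        margins = c(6, 6))
dev.off()
```

The color ramp reuses the colors.meth values from the question. Any of the heatmap functions named above can replace heatmap() here.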

ADD REPLY • link 4.5 years ago by mscherer ▴ 50

OK, thanks.

I have already managed to create some of the heatmaps via meth() and heatmap.2. To make the heatmaps, one must make sure that the matrix contains only complete rows (complete.cases()), as heatmap.2 can't handle NaN.
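
A minimal sketch of that filtering step, again on a small made-up matrix instead of the real meth() output:

```r
# Toy matrix standing in for the output of meth(rnb.set, ...).
set.seed(1)
meth.mat <- matrix(runif(20 * 4), nrow = 20,
                   dimnames = list(paste0("region", 1:20),
                                   paste0("sample", 1:4)))

# Introduce some missing values; complete.cases() treats NA and NaN alike.
meth.mat[2, 1] <- NA
meth.mat[5, 3] <- NaN
meth.mat[5, 4] <- NA

# Keep only rows without any missing value.
keep <- complete.cases(meth.mat)
meth.complete <- meth.mat[keep, , drop = FALSE]

message(sum(!keep), " of ", nrow(meth.mat), " rows dropped")
```

Dropping every incomplete row can remove a large share of the data when coverage is uneven, so it is worth checking how many rows survive before plotting.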

For the "sites" data I get the error "Error: cannot allocate vector of size 2974245.6 Gb", so I guess there are also some memory problems. It still doesn't explain why it worked before.
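
The size in that error is consistent with a full pairwise distance matrix over all sites: a dist object for n rows stores n(n-1)/2 doubles, and roughly 28 million rows would need about that much memory. This is an inference from the number, not something confirmed in the thread. The sketch below estimates the size and restricts the matrix to the most variable rows before clustering; both helper names are hypothetical, and the toy matrix stands in for the real site-level data.

```r
# Approximate memory (in GiB) of a dist object over n rows:
# n * (n - 1) / 2 doubles at 8 bytes each.
dist_size_gib <- function(n) {
  n * (n - 1) / 2 * 8 / 2^30
}

# Keep the n.top rows with the highest variance across samples.
top_variable_rows <- function(mat, n.top) {
  row.var <- apply(mat, 1, var, na.rm = TRUE)
  idx <- order(row.var, decreasing = TRUE)[seq_len(min(n.top, nrow(mat)))]
  mat[idx, , drop = FALSE]
}

# Toy example: 5000 "sites" in 6 samples, reduced to the top 1000.
set.seed(7)
sites <- matrix(runif(5000 * 6), nrow = 5000)
top <- top_variable_rows(sites, 1000)

dist_size_gib(nrow(sites))  # memory needed to cluster all rows
dist_size_gib(nrow(top))    # memory needed for the reduced matrix
```

Because the cost grows quadratically with the number of rows, the reduced matrix needs about 25 times less memory than the full toy matrix.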

I'll try to re-run the analysis in a step-by-step manner again, but with the same parameters as before (no filtering, etc.). Hopefully I can get something different.

ADD REPLY • link 4.5 years ago by Assa Yeroslaviz ★ 1.9k