4.3 years ago
Assa Yeroslaviz
Hi, I'm getting the following error when trying to run my file:
>>> doublet_scores, predicted_doublets = scrub.scrub_doublets(min_counts=2,
... min_cells=3,
... min_gene_variability_pctl=85,
... n_prin_comps=30)
Preprocessing...
/home/scrublet/helper_functions.py:321: RuntimeWarning: divide by zero encountered in true_divide
w.setdiag(float(target_total) / tots_use)
/home/scrublet/helper_functions.py:252: RuntimeWarning: invalid value encountered in sqrt
CV_input = np.sqrt(b);
Simulating doublets...
/home/scrublet/helper_functions.py:321: RuntimeWarning: divide by zero encountered in true_divide
w.setdiag(float(target_total) / tots_use)
Traceback (most recent call last):
File "<stdin>", line 4, in <module>
File "/home/scrublet/scrublet.py", line 224, in scrub_doublets
pipeline_zscore(self)
File "/home/scrublet/helper_functions.py", line 65, in pipeline_zscore
self._E_sim_norm = np.array(sparse_zscore(self._E_sim_norm, gene_means, gene_stdevs))
File "/home/scrublet/helper_functions.py", line 173, in sparse_zscore
return sparse_multiply((E - gene_mean).T, 1/gene_stdev).T
File "/home/scrublet/helper_functions.py", line 164, in sparse_multiply
return w * E
File "/home/scipy/sparse/base.py", line 518, in __mul__
result = self._mul_multivector(np.asarray(other))
File "/home/scipy/sparse/base.py", line 536, in _mul_multivector
return self.tocsr()._mul_multivector(other)
File "/home/scipy/sparse/compressed.py", line 485, in _mul_multivector
dtype=upcast_char(self.dtype.char, other.dtype.char))
MemoryError: Unable to allocate 167. GiB for an array with shape (1651, 13589760) and data type float64
>>>
The tool was run within a conda environment (if this makes any difference).
My data set contains:
Counts matrix shape: 6794880 rows, 31053 columns
Number of genes in gene list: 31053
Is there a way to deal with this problem?
thanks
Likely not. Looks like the program wants to allocate 167 GiB of memory. Does it work with smaller datasets?
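The 167 GiB figure can be reproduced from the shape in the error message. A quick check using only numbers from the traceback and the question is sketched below. Reading the 13,589,760 dimension as simulated doublets is an inference: it is exactly twice the 6,794,880 rows of the counts matrix, and the failing call operates on `self._E_sim_norm`, but the log does not state this directly.

```python
# Numbers taken from the MemoryError message and the question.
n_rows, n_cols = 1651, 13_589_760   # shape of the dense array being allocated
bytes_per_value = 8                 # float64 uses 8 bytes per value

size_gib = n_rows * n_cols * bytes_per_value / 2**30
print(f"{size_gib:.1f} GiB")

# The second dimension is exactly twice the number of rows in the counts matrix.
n_barcodes = 6_794_880
print(n_cols / n_barcodes)
```

Because the size scales linearly with the number of rows in the input, halving the number of barcodes passed in would roughly halve this allocation.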
I would assume so, but that wouldn't help me if I can't run it on a normal single-cell sparse matrix.
I do have enough memory on my server, though. This shouldn't be a problem.
Have you tried increasing the amount of allocated memory beyond 167 GiB + 10-20%?
scrublet also says:
No, not yet. I don't have any memory restrictions, so it should be able to use everything on the server. I can't understand why it is restricted to begin with.
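One way to see whether the Python process itself is running under a cap (for example a `ulimit -v` setting) is to query its resource limits from the same session. This is a minimal sketch using the standard-library `resource` module, which is available on Unix only; `describe_limit` is a helper name introduced here for illustration. It reports per-process limits only. It does not show limits imposed by a job scheduler or container, the kernel's overcommit policy, or how much RAM is actually free, any of which could also cause a single very large allocation to fail.

```python
import resource

def describe_limit(value):
    """Format an rlimit value, given in bytes or as RLIM_INFINITY."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 2**30:.1f} GiB"

# RLIMIT_AS caps the total virtual address space of this process.
soft, hard = resource.getrlimit(resource.RLIMIT_AS)
print("address-space limit: soft =", describe_limit(soft),
      "hard =", describe_limit(hard))
```

If both values print as unlimited, the restriction is coming from somewhere other than this per-process limit.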
This is only one sample
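The `divide by zero` warnings in the log are raised on a line that divides by `tots_use`, which suggests that some rows of the counts matrix have a total of zero. With several million rows for one sample, the input may be an unfiltered barcode matrix rather than called cells, though nothing in the thread confirms that. Under that assumption, a minimal sketch of checking for and dropping low-count barcodes before running the tool looks like this. The toy `counts` matrix and the `min_total` threshold are placeholders for illustration, not recommended values.

```python
import numpy as np
from scipy import sparse

# Toy cells-by-genes counts matrix standing in for the real data.
counts = sparse.csr_matrix(np.array([
    [0, 0, 0, 0],   # empty barcode
    [5, 0, 2, 1],
    [0, 1, 0, 0],   # nearly empty barcode
    [3, 4, 0, 6],
]))

# Total counts per barcode (row sums), flattened to a 1-D array.
totals = np.asarray(counts.sum(axis=1)).ravel()
print("barcodes with zero counts:", int((totals == 0).sum()))

# Keep only barcodes with at least `min_total` counts (illustrative threshold).
min_total = 2
keep = totals >= min_total
filtered = counts[keep]
print(filtered.shape)
```

Fewer rows going in also means fewer simulated doublets, which shrinks the dense array that failed to allocate.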