Super long time when running RSoptSC package
14 months ago
alwayshope ▴ 40

Dear guys,

Has anyone else run into very long run times (several days) with RSoptSC? ClusterCells() has been running for several days and still has not finished, even though I am already using the doParallel package to try to run it in parallel.

Thank you very much for your guidance!

RSoptSC single-cell • 807 views

Show as much of your code as you can. With time-intensive functions, always run the function on a subset of the data first, both to estimate how long things take and to check that your dataset meets the function's expectations. You don't want to find out a week from now that your dataset is missing a column the function needs, and that you have to re-run the entire thing because you forgot a 2-minute pre-processing step.
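A minimal sketch of that advice in R, under stated assumptions: the matrix here is synthetic, and `subsample_cells()` and `expensive_step()` are hypothetical names introduced only for illustration (`expensive_step()` is just `crossprod()`, standing in for whichever slow call you want to time). It draws random subsets of cells at increasing sizes and records the elapsed time for each.

```r
library(Matrix)

set.seed(1)
# Synthetic sparse genes x cells matrix; replace with your real input.
input_matrix_sc <- rsparsematrix(nrow = 2000, ncol = 1500, density = 0.05,
                                 rand.x = function(n) rpois(n, 3) + 1)

# Hypothetical helper: keep a random subset of cells (columns).
subsample_cells <- function(mat, n_cells) {
  keep <- sort(sample(ncol(mat), min(n_cells, ncol(mat))))
  mat[, keep, drop = FALSE]
}

# Stand-in for the slow call (assumption: swap in your own pipeline step).
expensive_step <- function(mat) {
  crossprod(mat)  # cells x cells product; cost grows with the number of cells
}

# Time the step on progressively larger subsets.
sizes <- c(100, 200, 400, 800)
timings <- sapply(sizes, function(n) {
  sub <- subsample_cells(input_matrix_sc, n)
  system.time(expensive_step(sub))[["elapsed"]]
})

print(data.frame(n_cells = sizes, seconds = timings))
```

If the time roughly doubles each time the subset doubles, the full run is probably affordable; if it grows much faster than that, extrapolating to the full dataset tells you early whether a multi-day run is to be expected.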


Thanks a lot!

The input has the same data structure as in the tutorial (a dgCMatrix) and is roughly 20,000 genes x 15,000 cells. SelectData(), which produces filtered_data, can take about 10 h; SimilarityM() and RepresentationMap() each take 1-2 h; and ClusterCells() has run for more than 3 days without finishing.
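For a sense of scale, here is a back-of-the-envelope memory estimate. It assumes the matrices are held as dense arrays of 8-byte doubles, which the thread does not confirm for RSoptSC; the variable names are illustrative only.

```r
# Dimensions quoted in the post.
n_genes <- 20000
n_cells <- 15000
bytes_per_double <- 8

# A cells x cells similarity matrix, if stored densely (assumption).
similarity_gib <- n_cells^2 * bytes_per_double / 2^30

# The genes x cells expression matrix, if it were converted to dense (assumption).
expression_gib <- n_genes * n_cells * bytes_per_double / 2^30

cat(sprintf("dense similarity matrix: %.2f GiB\n", similarity_gib))
cat(sprintf("dense expression matrix: %.2f GiB\n", expression_gib))
```

Any step that factorises or repeatedly multiplies a dense cells x cells matrix will be working on objects of this size, which is one plausible reason the later steps slow down so sharply at 15,000 cells.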

library(RSoptSC)

# Log-transform the counts with a pseudocount of 1
logdata <- log10(input_matrix_sc + 1)

# Parameters passed to SelectData()
gene_expression_threshold <- 0.03
n_features <- 3000
filtered_data <- SelectData(logdata, gene_expression_threshold, n_features)

S <- SimilarityM(lambda = 0.05, 
                 data = filtered_data$M_variable,
                 dims = 3,
                 pre_embed_method = 'tsne',
                 perplexity = 20, 
                 pca_center = TRUE, 
                 pca_scale = TRUE)


low_dim_mapping <- RepresentationMap(similarity_matrix = S$W,
                                     flat_embedding_method = 'tsne',
                                     join_components = TRUE,
                                     perplexity = 35,
                                     theta = 0.5,
                                     normalize = FALSE,
                                     pca = TRUE,
                                     pca_center = TRUE,
                                     pca_scale = TRUE,
                                     dims = 2,
                                     initial_dims = 2)

clusters <- ClusterCells(similarityMatrix = S$W, n_comp = 15, .options='p')
H <- clusters$H
labels <- clusters$labels
n_clusters <- length(unique(clusters$labels))
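One thing worth checking in code like the above is the `log10(x + 1)` step: with the Matrix package, adding 1 to a dgCMatrix gives every zero a stored value, so the result is dense. The sketch below uses a small synthetic matrix to compare that with `log1p(x) / log(10)`, which computes the same values while keeping the matrix sparse. Whether RSoptSC's functions accept or benefit from sparse input is not established in this thread, so treat this as something to test rather than a known fix.

```r
library(Matrix)

set.seed(1)
# Synthetic sparse count matrix standing in for input_matrix_sc (genes x cells).
counts <- rsparsematrix(nrow = 500, ncol = 300, density = 0.05,
                        rand.x = function(n) rpois(n, 3) + 1)

# Adding 1 makes every zero an explicit value, so the result is stored densely.
dense_log <- log10(counts + 1)

# log1p(0) is 0, so zeros stay implicit; dividing by log(10) converts ln to log10.
sparse_log <- log1p(counts) / log(10)

print(class(dense_log))
print(class(sparse_log))
print(c(dense_bytes  = as.numeric(object.size(dense_log)),
        sparse_bytes = as.numeric(object.size(sparse_log))))
```

On a 20,000 x 15,000 input the difference in memory footprint between the two forms would be far larger than in this toy example.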

You should email the authors and point them to this post. That might help you get a solution faster.


Sure, thank you very much! I appreciate it!
