Hi,
When reading this answer, keep two points in mind: (1) I have never used cacoa and (2) I don't have access to your Seurat object, so I'll make some assumptions.
I tried to install cacoa inside a container (on a server), but I didn't succeed, so I installed it locally instead.
I adapted your code to a data set that I used for a tutorial and it seems to work, except for the second function (estimateExpressionShiftMagnitudes()), which fails due to low cell numbers (see the error after the code) because I had to downsample this data set to run it locally:
# Packages
library("dplyr")
library("Seurat")
library("cacoa")
# Download data
options(timeout = 2000)
data.down <- "https://zenodo.org/record/6807707/files/GSE173303.tar.gz?download=1"
tar.file <- "GSE173303.tar.gz"
download.file(url = data.down, destfile = tar.file)
untar(tarfile = tar.file)
# Import data
seu <- readRDS(file = "GSE173303/objects/seu.rds")
# Check meta data
head(seu@meta.data)
# Downsample to at most 50 cells per cell type (roughly 1k cells in total)
Idents(seu) <- "cell.types.orig"
seu <- subset(seu, downsample = 50)
saveRDS(seu, "seu.rds")
seu <- readRDS( "seu.rds")
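# Params
# Assumption: condition.samples was not defined in this snippet; here it is taken to be
# a table with one row per sample (orig.ident) and its condition (control.orig),
# derived from the cell metadata.
condition.samples <- seu@meta.data %>% distinct(orig.ident, control.orig)
sample.groups <- condition.samples %>% .$control.orig %>% `names<-`(condition.samples$orig.ident)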
cell.groups <- seu$cell.types.orig
sample.per.cell <- seu$orig.ident
ref.level <- "Blood"
target.level <- "SF"
# Build graph
seu <- FindNeighbors(seu)
graph.name <- "RNA_snn"
# Initialize cacoa
cao <- Cacoa$new(seu, sample.groups=sample.groups, cell.groups=cell.groups, sample.per.cell=sample.per.cell,
ref.level=ref.level, target.level=target.level, graph.name=graph.name)
# Run cacoa
cao$estimateCellLoadings()
cao$estimateExpressionShiftMagnitudes() # failed
# Plot
cao$plotCellLoadings(show.pvals=FALSE)
Filtering data...
Excluding cell types Activated CD4, Activated CD8, CD16- NK and NKT, CD16+ NK and NKT, CD38- CD8 T, CD38+ CD8 T, Classical monocytes, CX3CR1hi Effector CD8, CX3CR1lo Effector CD8, Cycling T,
Effector CD8, Gamma-Delta T, mDC, Megakaryocytes, Memory B, Naive B, Naive T, Neutrophil, Non-classical monocytes, pDC, Plasmablasts, Synovial cells, Treg that don't have enough samples
Error in cbind(mtx, ext.mtx)[, col.names, drop = FALSE] :
invalid or not-yet-implemented 'Matrix' subsetting
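The "don't have enough samples" message suggests that, after downsampling, most cell types are left with too few cells per sample. A minimal sketch of how one could check this with base R is shown here; the labels are made-up stand-ins for seu$orig.ident and seu$cell.types.orig, and the two thresholds are hypothetical values chosen for illustration, not cacoa's actual defaults.

```r
# Hypothetical per-cell labels standing in for seu$orig.ident and seu$cell.types.orig.
sample.per.cell <- c("62", "62", "62", "62B", "62B", "63", "63", "63", "63B")
cell.groups     <- c("Naive T", "Naive T", "Treg", "Naive T", "Treg",
                     "Naive T", "Naive T", "Naive T", "Treg")

# Number of cells per cell type (rows) and sample (columns).
counts <- table(cell.groups, sample.per.cell)
counts

# Assumed thresholds, for illustration only; check the cacoa documentation for the real ones.
min.cells.per.sample <- 2
min.samples.per.type <- 2

# For each cell type, count the samples that reach the minimum number of cells.
n.samples.ok <- rowSums(counts >= min.cells.per.sample)

# Cell types that would be dropped under these assumed thresholds.
dropped <- names(n.samples.ok)[n.samples.ok < min.samples.per.type]
dropped
```

With the real object, the same table can be built from sample.per.cell and cell.groups as defined in the code above.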
Answering your questions below:
Has anybody tried to run Cacoa with Seurat object?
No, I didn't until today.
Can someone explain sample.per.cell argument, what it contains?
Based on the cacoa GitHub repository (see github):
sample.per.cell: vector with sample labels per cell named with cell ids
This is just a vector with one element per cell that holds the sample identity of that cell, i.e., the sample the cell belongs to. The vector should be named with the cell ids.
For the example that I gave above, it is:
> head(sample.per.cell, 5)
62_AAACCTGGTAATAGCA 62_AAGGTTCTCGTCCGTT 62_ACATACGAGGCCATAG 62_ACGGGCTGTAGGCTGA 62_AGCGGTCTCGTCGTTC
62 62 62 62 62
Levels: 62 62B 63 63B 76 76B A40 A40B A56 A56B
The samples are 62, 62B, 63, 63B, 76, 76B, A40, A40B, A56 and A56B (above I'm just printing the sample identity of the first five cells, which all belong to sample 62).
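As a minimal sketch of how such a vector can be built and checked, the toy example here uses a made-up metadata table (the `meta` data frame and the shortened cell ids are hypothetical stand-ins for seu@meta.data); it is an illustration, not something taken from the cacoa documentation.

```r
# Toy stand-in for seu@meta.data: one row per cell, row names are the cell ids.
meta <- data.frame(
  orig.ident = factor(c("62", "62", "62B", "62B", "63")),
  row.names  = c("62_AAAC", "62_AAGG", "62B_ACAT", "62B_ACGG", "63_AGCG")
)

# One sample label per cell, named with the cell ids.
# (For a Seurat object, seu$orig.ident already returns a vector named this way.)
sample.per.cell <- setNames(meta$orig.ident, rownames(meta))

# Sanity checks: one entry per cell, unique cell ids as names, no missing labels.
stopifnot(
  length(sample.per.cell) == nrow(meta),
  !is.null(names(sample.per.cell)),
  !anyDuplicated(names(sample.per.cell)),
  !anyNA(sample.per.cell)
)

# Number of cells per sample.
table(sample.per.cell)
```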
How does one extract the embedding from a Seurat object?
You can use the function Embeddings() (see docs).
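A short sketch of that call, assuming Seurat is installed. It uses pbmc_small, the small example object shipped with SeuratObject, as a stand-in for your own object; the reduction name "pca" is simply one that exists in that example object.

```r
library(Seurat)

# pbmc_small is the example object shipped with SeuratObject;
# it stands in for your own Seurat object here.
obj <- SeuratObject::pbmc_small

# Names of the dimensional reductions stored in the object.
Reductions(obj)

# Cells x dimensions matrix for one reduction; row names are the cell ids.
emb <- Embeddings(obj, reduction = "pca")
dim(emb)
head(emb[, 1:2])
```

For your own data, replace obj with your object and "pca" with whichever reduction Reductions() lists (for example a UMAP, if you computed one).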
I believe the error is related to the sample.groups and graph.name values you provided.
In sample.groups you've provided a vector of condition/group labels per cell, named with cell ids.
The GitHub repository clearly states:
sample.groups: vector with condition labels per sample named with sample ids
It expects condition/group labels per sample, not per cell. In the example I gave above this corresponds to:
> sample.groups
62 62B 63 63B 76 76B A40 A40B A56 A56B
"SF" "Blood" "SF" "Blood" "SF" "Blood" "SF" "Blood" "SF" "Blood"
That is, a vector of conditions (SF or Blood) named with the samples: the names of the vector elements are your sample ids (62, 62B, etc.).
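One way to get from per-cell metadata to such a per-sample vector can be sketched as follows. The `meta` table and its `condition` column are hypothetical stand-ins for the relevant columns of seu@meta.data, so adapt the column names to your own object.

```r
# Toy stand-in for seu@meta.data (hypothetical values): one row per cell.
meta <- data.frame(
  orig.ident = c("62", "62", "62B", "62B", "63", "63B"),
  condition  = c("SF", "SF", "Blood", "Blood", "SF", "Blood"),
  stringsAsFactors = FALSE
)

# Collapse the per-cell table to one row per sample/condition pair.
per.sample <- unique(meta[, c("orig.ident", "condition")])

# Each sample must map to exactly one condition.
stopifnot(!anyDuplicated(per.sample$orig.ident))

# Condition labels per sample, named with the sample ids.
sample.groups <- setNames(per.sample$condition, per.sample$orig.ident)
sample.groups

# Every sample seen in the per-cell labels should have a condition.
stopifnot(all(unique(meta$orig.ident) %in% names(sample.groups)))
```

The uniqueness check is there because a sample that appears under two conditions would silently produce duplicated names otherwise.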
As for graph.name, I'm not sure the value you passed is correct. You can check which graphs exist in the object with the following command:
>Graphs(seu)
"RNA_nn" "RNA_snn"
In my case this returns two graph names, and I believe the second one, "RNA_snn", is the one to pass as graph.name.
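A small sketch of that check, again assuming Seurat is installed and using the pbmc_small example object from SeuratObject in place of your own object. The value of graph.name is whatever you intend to pass to Cacoa$new().

```r
library(Seurat)

# Example object standing in for your own Seurat object.
obj <- SeuratObject::pbmc_small

# The graph name you intend to pass to cacoa.
graph.name <- "RNA_snn"

# Graphs() lists the names of the graphs stored in the object.
available <- Graphs(obj)
graph.ok <- graph.name %in% available

if (!graph.ok) {
  message("Graph '", graph.name, "' not found; available graphs: ",
          paste(available, collapse = ", "))
}
```

If the name is missing, running FindNeighbors() on the object and calling Graphs() again shows the names that were created.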
I hope this helps,
António
Hi,
Please add the command(s) that you've tried by editing your post. That might help to give more context with the issue you're facing.
Best,
António
I did create a graph name while running FindNeighbors().
The cao object is successfully created, but I am unable to run these commands.
thank you for all of your help. This worked beautifully!
You're welcome.
I would still recommend that you check the answer to your post in the cacoa GitHub repository. Be aware that I had never used the software before and, as such, I made assumptions.
Whenever you want to add a comment, please:
I tried to relocate your reply and, unfortunately, it relocated it to your question post.
You can accept or upvote the answer if it helped to solve your issue.
Best,
António
sure, noted.