Question

Cacoa with Seurat analysis

13 months ago

Aditya • 0

Hi everyone,

Has anybody tried to run Cacoa with a Seurat object? Can someone explain the sample.per.cell argument and what it contains?

How does one extract the embedding from a Seurat object? I get an error from Cacoa saying dimensions should be positive.

I have also posted the GitHub issue here https://github.com/kharchenkolab/cacoa/issues/48

Thank you in advance,
Adi

cacoa differential-abundance Seurat • 1.6k views

ADD COMMENT • link updated 11 weeks ago by GenoMax 147k • written 13 months ago by Aditya • 0

Hi,

Please add the command(s) that you've tried by editing your post. That might help to give more context with the issue you're facing.

Best,

António

ADD REPLY • link 13 months ago by antonioggsousa 3.2k

seu = seurat.object
sample.groups = seu$group
cell.groups = seu$celltype
sample.per.cell = seu$orig.ident
ref.level = "Untreated"
target.level = "Treated"

cao <- Cacoa$new(seu,
                 sample.groups=sample.groups,
                 cell.groups=cell.groups,
                 sample.per.cell=sample.per.cell,
                 ref.level=ref.level,
                 target.level=target.level,
                 graph.name="SNN1")

I did create a graph name while running FindNeighbors()

The cao object is created successfully, but I am unable to run these commands:

cao$estimateCellLoadings()

**Error**- dimensions should  to be positive 



cao$estimateExpressionShiftMagnitudes()

**Error**- 2 factor level is needed

ADD REPLY • link updated 13 months ago by GenoMax 147k • written 13 months ago by Aditya • 0

Thank you for all of your help. This worked beautifully!

ADD REPLY • link 13 months ago by Aditya • 0

You're welcome.

I would still recommend that you check the answer to your post in the cacoa GitHub repository. Be aware that I had never used the software before and, as such, I made assumptions.

Whenever you want to add a comment, please:

Use the ADD COMMENT or ADD REPLY buttons embedded in each post to comment, to ask for clarifications, to request more details, or respond to a previous answer or comment.

I tried to relocate your reply and, unfortunately, it relocated it to your question post.

You can accept or upvote the answer if it helped to solve your issue.

Best,

António

ADD REPLY • link 13 months ago by antonioggsousa 3.2k

sure, noted.

ADD REPLY • link 13 months ago by Aditya • 0

score 0 · Answer 1 · 2023-10-09

Hi,

When reading this answer, take two points into account: (1) I have never used cacoa and (2) I don't have access to your Seurat object, so I'll make some assumptions.

I tried to install cacoa inside a container (in a server), but I didn't succeed. I installed it locally instead.

I adapted your code to a data set that I used for a tutorial and it seems to work. The exception is the second function, estimateExpressionShiftMagnitudes(), which fails because of low cell numbers (see the error below): I had to downsample the data set to run it locally.

# Packages
library("dplyr")
library("Seurat")
library("cacoa")

# Download data
options(timeout = 2000)
data.down <- "https://zenodo.org/record/6807707/files/GSE173303.tar.gz?download=1"
tar.file <- "GSE173303.tar.gz"
download.file(url = data.down, destfile = tar.file)
untar(tarfile = tar.file)

# Import data 
seu <- readRDS(file = "GSE173303/objects/seu.rds")

# Check meta data
head(seu@meta.data)

# Downsample data set 1k
Idents(seu) <- "cell.types.orig"
seu <- subset(seu, downsample = 50)
saveRDS(seu, "seu.rds")    
seu <- readRDS( "seu.rds")

# Params
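# Assumption: condition.samples was not defined in the original post; it is rebuilt
# here as one row per sample (orig.ident) with its condition (control.orig)
condition.samples <- seu@meta.data %>% distinct(orig.ident, control.orig)
sample.groups <- setNames(as.character(condition.samples$control.orig), as.character(condition.samples$orig.ident))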
cell.groups <- seu$cell.types.orig
sample.per.cell <- seu$orig.ident
ref.level <- "Blood"
target.level <- "SF"

# Build graph
seu <- FindNeighbors(seu)
graph.name <- "RNA_snn"

# Initialize cacoa
cao <- Cacoa$new(seu, sample.groups=sample.groups, cell.groups=cell.groups, sample.per.cell=sample.per.cell, 
             ref.level=ref.level, target.level=target.level, graph.name=graph.name)

# Run cacoa
cao$estimateCellLoadings()
cao$estimateExpressionShiftMagnitudes() # failed

# Plot
cao$plotCellLoadings(show.pvals=FALSE)

Filtering data...

Excluding cell types Activated CD4, Activated CD8, CD16- NK and NKT, CD16+ NK and NKT, CD38- CD8 T, CD38+ CD8 T, Classical monocytes, CX3CR1hi Effector CD8, CX3CR1lo Effector CD8, Cycling T, Effector CD8, Gamma-Delta T, mDC, Megakaryocytes, Memory B, Naive B, Naive T, Neutrophil, Non-classical monocytes, pDC, Plasmablasts, Synovial cells, Treg that don't have enough samples

Error in cbind(mtx, ext.mtx)[, col.names, drop = FALSE] : invalid or not-yet-implemented 'Matrix' subsetting

Answering your questions below:

Has anybody tried to run Cacoa with a Seurat object?

No, I hadn't until today.

Can someone explain the sample.per.cell argument and what it contains?

According to the cacoa GitHub repository (see GitHub):

sample.per.cell: vector with sample labels per cell named with cell ids

This is just a vector (character or factor) with one entry per cell, giving the sample identity of every cell, i.e., the sample each cell belongs to. The vector should be named with cell ids.

For the example that I gave above, it is:

> head(sample.per.cell, 5)
62_AAACCTGGTAATAGCA 62_AAGGTTCTCGTCCGTT 62_ACATACGAGGCCATAG 62_ACGGGCTGTAGGCTGA 62_AGCGGTCTCGTCGTTC 
                 62                  62                  62                  62                  62 
Levels: 62 62B 63 63B 76 76B A40 A40B A56 A56B

The samples are 62, 62B, 63, 63B, 76, 76B, A40, A40B, A56 and A56B. Above I'm only printing the sample identity of the first five cells, all of which belong to sample 62.
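As a minimal sketch of the shape described above, the base R snippet below builds such a vector from a mock per-cell metadata table. The data frame `meta` and its cell ids are made up for illustration; with a real Seurat object, `seu$orig.ident` should already come back named with cell ids, as in the output above.

```r
# Mock per-cell metadata (hypothetical cell ids used as row names)
meta <- data.frame(
  orig.ident = c("62", "62", "62B", "63"),
  row.names = c("62_AAAC", "62_AAGG", "62B_ACAT", "63_ACGG"),
  stringsAsFactors = FALSE
)

# One sample label per cell, named with the cell ids
sample.per.cell <- setNames(meta$orig.ident, rownames(meta))

# Sanity checks: one entry per cell, names are the cell ids
stopifnot(length(sample.per.cell) == nrow(meta))
stopifnot(identical(names(sample.per.cell), rownames(meta)))

print(head(sample.per.cell, 2))
```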

How does one extract the embedding from a Seurat object?

You can use the function Embeddings() (see docs).

I believe the error is related to the sample.groups and graph.name you provided.

In sample.groups you've provided a vector of condition/group per cell named with cell ids.

The github repository clearly states:

sample.groups: vector with condition labels per sample named with sample ids

These are condition/group labels per sample, not per cell. In the example I gave above this corresponds to:

> sample.groups
 62     62B      63     63B      76     76B     A40    A40B     A56    A56B 
"SF" "Blood"    "SF" "Blood"    "SF" "Blood"    "SF" "Blood"    "SF" "Blood"

This is a vector of conditions (SF or Blood) named with samples, i.e., the names of the vector elements are the sample ids (62, 62B, etc.).
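If the condition is only stored per cell, as with `seu$group` in the question, one way to collapse it to one label per sample is sketched below in base R. The helper name `per_sample_groups` and the mock vectors are illustrative assumptions and not part of cacoa or Seurat.

```r
# Hypothetical helper: collapse per-cell condition labels to one label per sample
per_sample_groups <- function(group.per.cell, sample.per.cell) {
  stopifnot(length(group.per.cell) == length(sample.per.cell))
  # Unique (sample, condition) pairs across all cells
  pairs <- unique(data.frame(
    sample = as.character(sample.per.cell),
    group = as.character(group.per.cell),
    stringsAsFactors = FALSE
  ))
  # Every sample must map to exactly one condition
  if (anyDuplicated(pairs$sample) > 0) {
    stop("some samples have more than one condition label")
  }
  setNames(pairs$group, pairs$sample)
}

# Mock per-cell vectors (made-up cell ids c1..c5)
sample.per.cell <- c(c1 = "62", c2 = "62", c3 = "62B", c4 = "63", c5 = "63B")
group.per.cell  <- c(c1 = "SF", c2 = "SF", c3 = "Blood", c4 = "SF", c5 = "Blood")

sample.groups <- per_sample_groups(group.per.cell, sample.per.cell)
print(sample.groups)

# Checks worth running before Cacoa$new(): every sample has a condition
# and both comparison levels are present
stopifnot(all(unique(sample.per.cell) %in% names(sample.groups)))
stopifnot(all(c("Blood", "SF") %in% sample.groups))
```

With a Seurat object this could be called as `per_sample_groups(seu$group, seu$orig.ident)`, assuming those metadata columns hold the condition and the sample of each cell.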

I'm also not sure whether the graph.name is correct. You can check with the following command:

> Graphs(seu)
"RNA_nn"  "RNA_snn"

This returns two graph names in my case, and I believe the second one, "RNA_snn", is the graph to use.

I hope this helps,

António