Question

Seurat to Dataframe with ONLY counts, cells, and cluster labels?

0

3.9 years ago

Kind Katydid ▴ 30

I'm trying to run MDSeq, which appears to require a plain data frame of counts per cell, plus labels for the conditions (in my case, the clusters found by Seurat's clustering).

I'm completely lost as to which function, if any, could strip all data and metadata except those three things and then convert the result into a data frame.

Alternatively, I am using the script below to split the object into one data frame per cluster. Could or should I combine these and retain or add the cluster labels? That seems less elegant than the other approach.

cluster_list <- levels(data_transcripts_seurat@active.ident)

for(cluster in cluster_list) {
  name <- paste("data_cluster_", cluster, sep="")
  clustersubset <- subset(data_transcripts_seurat, idents=cluster, slot="counts")
  assign(name, as.data.frame(clustersubset@assays$SCT@counts))
}

R RNA-Seq Seurat • 9.6k views

ADD COMMENT • link 3.5 years ago by Kind Katydid ▴ 30

2

Why not start with data_transcripts_seurat@assays$SCT@counts?

Something like:

counts.df <- as.data.frame(data_transcripts_seurat@assays$SCT@counts)

If you need the cells to be rows, you can use the t() function:

# %>% is provided by magrittr (it is also attached by library(dplyr))
counts.df <- data_transcripts_seurat@assays$SCT@counts %>% as.matrix %>% t %>% as.data.frame

ADD REPLY • link 3.9 years ago by Friederike 9.0k

0

Thanks! That's a much better way to do it. Could you please also provide guidance on how the cluster assignment can also be transferred (correctly) to this dataframe?

ADD REPLY • link 3.9 years ago by Kind Katydid ▴ 30

0

What does data_transcripts_seurat@active.ident get you? Shouldn't those be the cluster assignments per cell? [Disclaimer: I do not use Seurat, so I'm basing this off your initial code]

ADD REPLY • link 3.9 years ago by Friederike 9.0k

0

Hi, yes I think it is! I did not realise this before (also very new to Seurat, and transcriptomics in general). How can I add this to counts.df? Is it with the attributes function?

Also, should I be concerned about the order of cells being different in counts.df and data_transcripts_seurat@active.ident? Are there ways to make sure they are matched?
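On the ordering question, a minimal sketch on toy data: it assumes the identity vector is a factor named by cell (as data_transcripts_seurat@active.ident appears to be) and that counts.df has cells as row names. The objects below are hypothetical stand-ins, not output from a real Seurat object. Indexing the factor by the row names makes the labels follow the row order of the data frame, whatever order they started in.

```r
# Toy stand-in for counts.df (hypothetical data): cells as rows, genes as columns.
counts.df <- data.frame(
  GeneA = c(5, 0, 2),
  GeneB = c(1, 3, 0),
  row.names = c("cell1", "cell2", "cell3")
)

# Toy stand-in for the identity factor: named by cell,
# deliberately in a different order from the rows of counts.df.
idents <- factor(c(cell3 = "1", cell1 = "0", cell2 = "1"))

# Check that both objects describe the same set of cells.
stopifnot(setequal(rownames(counts.df), names(idents)))

# Look the labels up by cell name so they follow the row order of counts.df.
aligned <- idents[rownames(counts.df)]

# Confirm the alignment before attaching the labels.
stopifnot(identical(names(aligned), rownames(counts.df)))
counts.df$cluster <- aligned
```

With a real object, idents would be replaced by data_transcripts_seurat@active.ident; the two stopifnot() checks fail loudly if the cell names do not match.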

ADD REPLY • link 3.9 years ago by Kind Katydid ▴ 30

score 1 · Answer 1 · 2021-05-28

Four months late, I know, but I was trying to do something similar. With some help from Google I pieced this together:

library(dplyr)  # attaches %>%

# Counts with cells as rows and genes as columns
counts.df <- data_transcripts_seurat@assays$RNA@counts %>% as.matrix %>% t %>% as.data.frame
counts.df <- tibble::rownames_to_column(counts.df, "cellnames")

# Cluster assignment per cell, with the cell names as a column
clusterassignments <- data.frame(data_transcripts_seurat@active.ident)
clusterassignments <- tibble::rownames_to_column(clusterassignments, "cellnames")

# Join on cell name, so the cell order in the two objects does not matter
counts.df <- merge(clusterassignments, counts.df, by = "cellnames")
rownames(counts.df) <- counts.df$cellnames
counts.df$cellnames <- NULL
colnames(counts.df)[1] <- "clusters"

Note: I used the counts from the RNA assay rather than SCT; swap RNA for SCT in the first line if you prefer the SCT counts.

There is probably a simpler way; this is just what I found.
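One possibly simpler variant, sketched on toy data rather than a real Seurat object: skip the merge and look the cluster labels up by cell name. Here counts and idents are hypothetical stand-ins for data_transcripts_seurat@assays$RNA@counts and data_transcripts_seurat@active.ident, and the sketch assumes the identity factor is named by cell.

```r
library(Matrix)  # sparse matrix class, used here only to build the toy data

# Toy genes-by-cells sparse matrix standing in for the counts slot.
counts <- Matrix(
  c(5, 1, 0, 3, 2, 0),
  nrow = 2,
  sparse = TRUE,
  dimnames = list(c("GeneA", "GeneB"), c("cell1", "cell2", "cell3"))
)

# Toy identity factor, named by cell and deliberately in a different order.
idents <- factor(c(cell3 = "1", cell1 = "0", cell2 = "1"))

# Densify and transpose so cells are rows and genes are columns.
expr <- as.data.frame(t(as.matrix(counts)))

# Prepend the cluster column, matched to the rows by cell name.
# check.names = FALSE leaves gene names untouched (e.g. names containing "-").
counts.df <- data.frame(
  clusters = idents[rownames(expr)],
  expr,
  check.names = FALSE
)
```

Because the labels are indexed by rownames(expr), the row order of the counts is preserved, whereas merge() sorts the rows by the key column by default. as.matrix() produces a dense matrix, so on a large dataset this can use a lot of memory.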

Hope this helps!