Seurat to Dataframe with ONLY counts, cells, and cluster labels?
3.9 years ago
Kind Katydid ▴ 30

I'm trying to run MDSeq, which appears to require a plain data frame of counts per cell, with labels for conditions (in my case, the clusters found by Seurat's clustering).

I'm completely lost as to which function, if any, could strip all data and metadata except for those three and then convert the result into a data frame.

Alternatively, I am using the script below to split it into one data frame per cluster. Could/should I combine these and retain/add cluster labels? It seems less elegant than the other approach.

cluster_list <- levels(data_transcripts_seurat@active.ident)

for(cluster in cluster_list) {
  name <- paste("data_cluster_", cluster, sep="")
  clustersubset <- subset(data_transcripts_seurat, idents=cluster, slot="counts")
  assign(name, as.data.frame(clustersubset@assays$SCT@counts))
}
R RNA-Seq Seurat • 9.6k views

Why not start with data_transcripts_seurat@assays$SCT@counts?

Something like:

counts.df <- as.data.frame(data_transcripts_seurat@assays$SCT@counts)

If you need the cells to be rows, you can use the t() function:

library(magrittr)  # provides the %>% pipe
counts.df <- data_transcripts_seurat@assays$SCT@counts %>% as.matrix %>% t %>% as.data.frame

Thanks! That's a much better way to do it. Could you please also provide guidance on how the cluster assignment can also be transferred (correctly) to this dataframe?


What does data_transcripts_seurat@active.ident get you? Shouldn't those be the cluster assignments per cell? [Disclaimer: I do not use Seurat, so I'm basing this off your initial code]


Hi, yes I think it is! I did not realise this before (also very new to Seurat, and transcriptomics in general). How can I add this to counts.df? Is it with the attributes function?

Also, should I be concerned about the order of cells being different in counts.df and data_transcripts_seurat@active.ident? Are there ways to make sure they are matched?
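
On the ordering question, one way to check is to compare cell names directly, since both objects carry them. The sketch below uses toy stand-ins rather than a real Seurat object: `counts.df` and `idents` are made up here, and it assumes that `active.ident` is a factor named by cell and that `counts.df` has cells as row names.

```r
# Toy stand-ins (assumptions, not real Seurat output): counts with cells as
# rows, and a named factor of cluster labels stored in a different cell order.
counts.df <- data.frame(
  GeneA = c(5, 0, 2),
  GeneB = c(1, 3, 0),
  row.names = c("cell1", "cell2", "cell3")
)
idents <- factor(c(cell3 = "1", cell1 = "0", cell2 = "1"))

# Are the two already in the same order?
same_order <- identical(rownames(counts.df), names(idents))

# Reorder the labels to follow the rows of counts.df, matching by cell name.
idx <- match(rownames(counts.df), names(idents))
stopifnot(!anyNA(idx))  # every cell in counts.df must have a label
counts.df$cluster <- unname(idents[idx])
```

Matching by name means the result does not depend on the two objects sharing an order, and the `stopifnot()` fails early if any cell has no label.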

3.5 years ago
Pratik ★ 1.1k

4 months late, I know, but I was trying to do something similar. With some help from Google, I pieced together this:

library(dplyr)  # provides the %>% pipe
# Counts with cells as rows and genes as columns
counts.df <- data_transcripts_seurat@assays$RNA@counts %>% as.matrix %>% t %>% as.data.frame
counts.df <- tibble::rownames_to_column(counts.df, "cellnames")
# Cluster identity of each cell, keyed by cell name
clusterassignments <- data.frame(data_transcripts_seurat@active.ident)
clusterassignments <- tibble::rownames_to_column(clusterassignments, "cellnames")
# Merge on cell name, so the row order of the two tables does not matter
counts.df <- merge(clusterassignments, counts.df, by = "cellnames")
rownames(counts.df) <- counts.df$cellnames
counts.df$cellnames <- NULL
colnames(counts.df)[1] <- "clusters"

Note: I used the RNA assay counts rather than SCT; swap in whichever assay you need.

There is probably a simpler way, but this is what I found.
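
One possible shorter variant in base R, as a sketch only: `counts_with_clusters` is a hypothetical helper, not a Seurat function. It assumes `counts` is a genes x cells matrix with dimnames and `idents` is a factor named by cell, which is the shape the `@counts` and `@active.ident` slots used above are assumed to have. The toy inputs stand in for those slots.

```r
# Hypothetical helper (not part of Seurat): build a cells x genes data frame
# with a leading `clusters` column.
counts_with_clusters <- function(counts, idents) {
  cells <- colnames(counts)
  stopifnot(!is.null(cells), all(cells %in% names(idents)))
  df <- as.data.frame(t(as.matrix(counts)))  # cells become rows
  # Indexing the named factor by cell name aligns labels to the rows of df.
  data.frame(clusters = unname(idents[cells]), df,
             row.names = cells, check.names = FALSE)
}

# Toy inputs standing in for the Seurat slots; matrix() fills column by column.
counts <- matrix(c(5, 1, 0, 3, 2, 0), nrow = 2,
                 dimnames = list(c("GeneA", "GeneB"),
                                 c("cell1", "cell2", "cell3")))
idents <- factor(c(cell2 = "1", cell3 = "1", cell1 = "0"))
result <- counts_with_clusters(counts, idents)
```

With a real object, the call would presumably be `counts_with_clusters(data_transcripts_seurat@assays$RNA@counts, data_transcripts_seurat@active.ident)`, with the Matrix package loaded so that `as.matrix()` can densify the sparse counts.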

Hope this helps!


Thanks! That's a really good way to do what I intended.

In the end, however, I gave up and decided to just put the cluster into the object name rather than a column:

cluster_index <- levels(counts_all@active.ident) # counts_all is my main Seurat object

for(i in cluster_index) {
  name <- paste0("counts_", i)
  clustersubset <- subset(counts_all, idents=i, slot="counts")
  assign(name, as.data.frame(clustersubset@assays$RNA@counts))
  rm(name, clustersubset)
}
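
If the per-cluster data frames ever need to be recombined with labels, as asked in the original question, a rough sketch follows. It assumes the pieces are kept in a named list (the hypothetical `per_cluster`, keyed by cluster) instead of separate objects created with `assign()`. The toy data frames stand in for `counts_0`, `counts_1`, and so on, with genes as rows and cells as columns.

```r
# Toy per-cluster data frames (genes x cells); names of the list are the clusters.
per_cluster <- list(
  "0" = data.frame(cell1 = c(5, 1), row.names = c("GeneA", "GeneB")),
  "1" = data.frame(cell2 = c(0, 3), cell3 = c(2, 0),
                   row.names = c("GeneA", "GeneB"))
)

# Transpose each piece so cells are rows, tag it with its cluster, then stack.
pieces <- lapply(names(per_cluster), function(cl) {
  cells_by_genes <- as.data.frame(t(per_cluster[[cl]]))
  cells_by_genes$cluster <- cl
  cells_by_genes
})
combined <- do.call(rbind, pieces)
```

Inside the loop above, filling the list would amount to something like `per_cluster[[i]] <- as.data.frame(...)` in place of the `assign()` call.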