How to use scanpy for integration of a single multi-dataset aggregated file
2.1 years ago

I've used Seurat extensively for my analysis, but I'm looking to switch everything to Python for website-hosting purposes (the dataset is far too large for RShiny). In Seurat, I'm able to take my single features, barcodes and matrix file set and split it into its constituent datasets via metadata tags

e.g.

# add barcode metadata - extract the sample ID from each barcode by strsplit on '-' and taking the second element of the resulting list (here the identifier is appended to the barcode after '-')
datasets <- sapply(strsplit(rownames(sc_obj@meta.data), split = '-'), "[[",2)
# add barcode metadata - supply dataset ID as additional metadata
sc_obj <- AddMetaData(object = sc_obj, metadata = data.frame(datasets = datasets, row.names = rownames(sc_obj@meta.data)))
## it's important that the datasets are listed in the order that they come from the cellranger output
sc_obj@meta.data$datasets = dplyr::recode(sc_obj@meta.data$datasets,
                                       "1"="Wildtype",
                                       "2"="Mutant", "3" = "Wildtype2", "4" = "Mutant2")

Once I have the metadata tag, I'm able to do quality control, then run NormalizeData and FindVariableFeatures on each dataset individually before using a list of the objects for standard integration.

sc_objQ.list <- SplitObject(sc_objQ, split.by = 'datasets')
for (i in 1:length(sc_objQ.list)) {
  sc_objQ.list[[i]] <- NormalizeData(sc_objQ.list[[i]], verbose = FALSE)
  sc_objQ.list[[i]] <- FindVariableFeatures(sc_objQ.list[[i]], selection.method = "vst",
                                       nfeatures = 2000, verbose = FALSE)
}

Every tutorial I've seen for scanpy requires that you start with individual objects which you then integrate together, i.e. two non-aggregated datasets.

Is there a way I can achieve the same result with scanpy as I have with Seurat? I'm used to R and not familiar enough with Python/scanpy to figure out the same metadata tagging and splitting.

Any help or direction towards a resource would be very helpful, thank you!

scanpy integration scRNAseq • 1.4k views
2.1 years ago

In case anyone is interested, here is a very simple tagging technique that makes the data a lot easier to handle. The issue I was having was just down to my coding illiteracy:

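sample = []
for barcode in adata.obs_names:
    # the dataset index is the last character of each barcode (e.g. "...-1")
    if barcode[-1] == "1":
        sample.append("WT1")
    elif barcode[-1] == "2":
        sample.append("MUT1")
    elif barcode[-1] == "3":
        sample.append("WT2")
    else:
        sample.append("MUT2")

# store the labels as a per-cell metadata column
adata.obs["Sample"] = sample
print(adata.obs)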
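
For anyone who also wants the splitting step, here is a rough sketch of how the tagging loop above could be written with pandas and then followed by a per-sample split, roughly mirroring SplitObject, NormalizeData and FindVariableFeatures. The helper names (tag_samples, split_by_sample, normalize_and_select_hvgs) and the SAMPLE_LABELS dictionary are my own and not part of scanpy. It assumes the barcodes end in "-<number>" as in the cellranger aggr output, and that adata holds raw counts. The default scanpy HVG flavor is not the same as Seurat's "vst" method, so the selected genes may differ.

```python
import pandas as pd

# Hypothetical mapping from the cellranger aggr suffix to a sample label;
# the order must match the order of the datasets in the aggregation.
SAMPLE_LABELS = {"1": "WT1", "2": "MUT1", "3": "WT2", "4": "MUT2"}


def tag_samples(obs_names, labels):
    """Return one sample label per barcode, based on the text after the last '-'."""
    names = pd.Index(obs_names)
    # rsplit on the last '-' so multi-digit suffixes such as '-12' also work
    suffix = names.to_series().str.rsplit("-", n=1).str[-1]
    sample = suffix.map(labels)
    missing = sample.isna()
    if missing.any():
        raise ValueError(f"unrecognised barcode suffixes: {sorted(set(suffix[missing]))}")
    return sample.astype("category")


def split_by_sample(adata, key="Sample"):
    """Rough equivalent of Seurat's SplitObject: one independent copy per sample."""
    return {name: adata[adata.obs[key] == name].copy() for name in adata.obs[key].unique()}


def normalize_and_select_hvgs(parts, n_top_genes=2000):
    """Normalise, log-transform and pick variable genes in each object separately."""
    import scanpy as sc  # imported here so the helpers above work without scanpy

    for part in parts.values():
        sc.pp.normalize_total(part, target_sum=1e4)  # counts per 10,000
        sc.pp.log1p(part)
        sc.pp.highly_variable_genes(part, n_top_genes=n_top_genes)
    return parts


# Small demonstration of the tagging step on made-up barcodes
barcodes = ["AAAC-1", "AAAG-2", "AACT-3", "AAGT-4", "ACGT-1"]
print(tag_samples(barcodes, SAMPLE_LABELS).tolist())
# → ['WT1', 'MUT1', 'WT2', 'MUT2', 'WT1']
```

On a real object this would be used roughly as adata.obs["Sample"] = tag_samples(adata.obs_names, SAMPLE_LABELS).values, followed by parts = normalize_and_select_hvgs(split_by_sample(adata)). It may not be necessary to split at all: sc.pp.highly_variable_genes accepts a batch_key argument (e.g. batch_key="Sample") that selects variable genes per batch on the single aggregated object.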