Remove contamination across cell clusters
3.2 years ago

I've got an scRNA-seq dataset filtered for immune cells. After applying the Seurat workflow I've discovered that some non-immune markers are expressed by cell subsets in almost all the clusters. See heat map:

[Image: heat map of marker expression across clusters]

I'm fairly certain that those cells are contamination. How would you filter those out based on expression values?

Thanks in advance!

clustering single-cell • 3.4k views

Why do you think those are contaminating cells?


How can you tell that these cells are contamination? Which cell types were selected for in the scRNA-seq experiment, and which non-immune cell types are you seeing?


rpolicastro @EagleEye that's a good question! The cells were sorted for CD45+ before sequencing. Some epithelial cells were detected nonetheless, based on EPCAM+. As for the cells expressing salivary markers, I'm not 100% sure they're contaminants: it may well be that the immune cells in the tissue express some tissue-specific genes. But my original question is more about the technical process of removing these cells if they are indeed contamination.

3.2 years ago
firestar ★ 1.6k

There are many ways to do this. Here is a working example.

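library(Seurat)
library(magrittr)  # provides the %>% pipe used below

# pbmc_small is a small example dataset bundled with Seurat
pbmc_small <- pbmc_small %>%
  NormalizeData() %>%
  FindVariableFeatures() %>%
  ScaleData()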

VlnPlot(pbmc_small,features="PPBP")

[Image: violin plot of PPBP expression]

Let's say we want to remove cells expressing PPBP above 5.5. Here is one way:

# get the cells (columns) whose normalised PPBP expression is below the threshold
x <- which(GetAssayData(pbmc_small)["PPBP", ] < 5.5)
# subset to a new object, keeping only those cells (selected by name)
pbmc_small1 <- subset(pbmc_small, cells = names(x))

VlnPlot(pbmc_small1,features="PPBP")

[Image: violin plot of PPBP expression after filtering]
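The original question involves several non-immune markers rather than one. The same idea can be extended to a set of markers, flagging a cell if any marker reaches its cut-off. This is only a sketch: the marker names and thresholds below are hypothetical placeholders chosen so the code runs on `pbmc_small`, and `thresholds`, `flagged` and `pbmc_clean` are illustrative names, not part of Seurat.

```r
library(Seurat)

# Hypothetical markers and cut-offs on normalised expression.
# Replace with your own contaminant markers (e.g. EPCAM) and thresholds.
thresholds <- c(PPBP = 5.5, MS4A1 = 4)

# Keep only markers that are actually present in the object
thresholds <- thresholds[names(thresholds) %in% rownames(pbmc_small)]

# Normalised expression for the chosen markers (genes x cells)
expr <- as.matrix(GetAssayData(pbmc_small)[names(thresholds), , drop = FALSE])

# thresholds has one entry per row, so it is recycled down each column:
# every cell's value for marker i is compared with thresholds[i]
flagged <- colSums(expr >= thresholds) > 0

# Keep the cells that were not flagged by any marker
pbmc_clean <- subset(pbmc_small, cells = colnames(expr)[!flagged])
```

Requiring two or more markers instead of any single one (`colSums(...) >= 2`) would be a more conservative variant, at the cost of missing cells that express only one marker.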

Here is another way. With this approach, you can plot the good and bad cells on a UMAP, for example, before deciding whether to remove them.

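# save good (TRUE) and bad (FALSE) cells as a new metadata variable
pbmc_small$good <- GetAssayData(pbmc_small)["PPBP", ] < 5.5
# optional: inspect the flagged cells first (needs a UMAP, i.e. RunUMAP() run beforehand)
# UMAPPlot(pbmc_small, group.by = "good")
# keep only the good cells
pbmc_small1 <- subset(pbmc_small, subset = good == TRUE)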
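Besides plotting, it may help to count how many cells would be dropped from each cluster before subsetting, since the question describes marker-positive cells spread across almost all clusters. A minimal sketch, assuming the same illustrative PPBP cut-off of 5.5 and that the active identities (`Idents()`) hold the cluster labels; `flag_by_cluster` and `frac_bad` are made-up names.

```r
library(Seurat)

# Flag cells with the same PPBP cut-off as above (threshold is illustrative)
pbmc_small$good <- GetAssayData(pbmc_small)["PPBP", ] < 5.5

# Cross-tabulate cluster identity against the flag
flag_by_cluster <- table(cluster = Idents(pbmc_small), good = pbmc_small$good)
print(flag_by_cluster)

# Fraction of flagged ("bad") cells in each cluster
frac_bad <- tapply(!pbmc_small$good, Idents(pbmc_small), mean)
print(round(frac_bad, 2))
```

If the flagged cells are concentrated in one cluster, that cluster may represent a genuine cell population rather than scattered contamination, so the table is worth checking before removing anything.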

Thanks, this makes sense! I was also wondering if there's some semi-automated threshold detection method that is used in practice.
