Subsetting Seurat object based on list of cell barcodes
8 months ago
bgbs • 0

Hi everyone,

I have a Seurat object that I want to subset using a csv file of "good" cell barcodes (56,020 cells) that have already passed QC. The barcodes in the list carry the following sample prefixes: 1_MI1_S3, 2_MI1_S4, 3_C_MI2_S1, 4_D_MI2_S2, 4_MI1_S5, 5_MI1_S6, 7_MI1_S7

I read in the csv file of good cells as follows

all_good_cells_rhemac10 <- read.csv("master_list_good_cells_rhemac10.csv", header = FALSE)
head(all_good_cells_rhemac10)
                           V1
1 1_MI1_S3_AAACCCAAGAGACAAG-1
2 1_MI1_S3_AAACCCAAGAGTATAC-1
3 1_MI1_S3_AAACCCAAGTCCCGAC-1
4 1_MI1_S3_AAACCCAAGTGAGGCT-1
5 1_MI1_S3_AAACCCACAGAAGCTG-1
6 1_MI1_S3_AAACCCACATCTATCT-1

The Seurat object I made from a raw counts matrix (69,801 cells) has cell barcodes as the row names of its meta.data, with the same sample prefixes. The original identity of each cell is taken from the sample prefix (side question: is there a way to change the orig.ident column so that it distinguishes between the 4_D_MI2_S2 and 4_MI1_S5 samples?).

Sample name, with its orig.ident value in the Seurat object's meta.data in parentheses: 1_MI1_S3 (1), 2_MI1_S4 (2), 3_C_MI2_S1 (3), 4_D_MI2_S2 (4), 4_MI1_S5 (4), 5_MI1_S6 (5), 7_MI1_S7 (7)

head(seurat.obj_combined@meta.data)
                              orig.ident nCount_RNA nFeature_RNA
3_C_MI2_S1_AAACCCAAGCTAGATA-1          3       2133         1348
3_C_MI2_S1_AAACCCAAGCTCACTA-1          3       1443          999
3_C_MI2_S1_AAACCCACAACAAGAT-1          3       3058         1630
3_C_MI2_S1_AAACCCACACTGCGAC-1          3       1214          877
3_C_MI2_S1_AAACCCAGTCGGTAAG-1          3       1513          998
3_C_MI2_S1_AAACCCAGTTTGTGGT-1          3       2238         1397
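
On the side question, a minimal sketch, assuming the object was built with CreateSeuratObject() and its default names.field = 1 and names.delim = "_", which would explain why orig.ident holds only the text before the first underscore. The barcodes below are toy values that reuse the sample prefixes from this post, and the regular expression assumes every cell name ends in "_<barcode>-<number>".

```r
# Toy cell names (hypothetical); in practice these would come from
# colnames(seurat.obj_combined).
barcodes <- c("4_D_MI2_S2_AAACCCAAGCTAGATA-1",
              "4_MI1_S5_AAACCCAAGCTCACTA-1",
              "3_C_MI2_S1_AAACCCACAACAAGAT-1")

# What a first-field split on "_" yields: the two "4" samples collide.
first_field <- sub("_.*$", "", barcodes)

# Strip the trailing "_<barcode>-<n>" to keep the full sample prefix.
sample_id <- sub("_[ACGT]+-[0-9]+$", "", barcodes)

print(first_field)
print(sample_id)
```

If sample_id is computed from colnames(seurat.obj_combined), so that it is in the same order as the cells, it could then be stored with something like seurat.obj_combined$orig.ident <- sample_id (or in a new meta.data column, to leave the original untouched).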

I have tried several ways of filtering the Seurat object with the subset() function, but I keep running into errors.

I tried adding the V1 column to the Seurat object's meta.data before calling subset(), but that didn't work:

seurat.obj_combined[["V1"]] <- all_good_cells_rhemac10[["V1"]]
seurat.obj_combined<-subset(seurat.obj_combined, subset = all_good_cells_rhemac10$V1)

Do the cell barcodes in the seurat object and the csv file need to be in the same order? Is there a parameter I need to use in the subset function in order to subset the seurat object? There doesn't seem to be a cell barcodes parameter to use, but there is when I use WhichCells.

When I use that function, it does return a list of subsetted barcodes, but not a subsetted Seurat object which is what I want.

WhichCells(seurat.obj_combined, cells = all_good_cells_rhemac10)

1    1_MI1_S3_AAACCCAAGAGACAAG-1
2    1_MI1_S3_AAACCCAAGAGTATAC-1
3    1_MI1_S3_AAACCCAAGTCCCGAC-1
4    1_MI1_S3_AAACCCAAGTGAGGCT-1
5    1_MI1_S3_AAACCCACAGAAGCTG-1
6    1_MI1_S3_AAACCCACATCTATCT-1
7    1_MI1_S3_AAACCCACATGGAGAC-1
8    1_MI1_S3_AAACCCAGTCAAGCGA-1
9    1_MI1_S3_AAACCCAGTCCTATAG-1
10   1_MI1_S3_AAACCCAGTCTGGTTA-1
[ reached 'max' / getOption("max.print") -- omitted 55020 rows ]
seurat • 2.8k views
8 months ago
Ming Tommy Tang ★ 4.5k
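# cells are the columns of a Seurat object (rows are features),
# so match the barcodes against colnames() and index the columns
index <- colnames(seurat.obj_combined) %in% all_good_cells_rhemac10$V1
seurat.obj_combined[, index]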

should do the trick
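
On the ordering question from the post: %in% tests membership, so the barcode list does not have to be in the same order as the object. A small base-R sketch with toy vectors (hypothetical values; object_cells stands in for colnames(seurat.obj_combined) and good_cells for all_good_cells_rhemac10$V1):

```r
# Toy stand-ins, not the poster's data.
object_cells <- c("3_C_MI2_S1_AAACCCAAGCTAGATA-1",
                  "1_MI1_S3_AAACCCAAGAGACAAG-1",
                  "1_MI1_S3_AAACCCAAGAGTATAC-1")
good_cells <- c("1_MI1_S3_AAACCCAAGAGTATAC-1",   # different order than the object
                "1_MI1_S3_AAACCCAAGAGACAAG-1",
                "7_MI1_S7_TTTGTTGTCTTTGCTA-1")   # not present in the object

# Logical mask over the object's cells; order of good_cells is irrelevant.
keep <- object_cells %in% good_cells
kept <- object_cells[keep]

# Barcodes in the list that the object does not contain; worth checking
# before subsetting so that mismatched prefixes are not silently dropped.
missing_from_object <- setdiff(good_cells, object_cells)

print(kept)
print(missing_from_object)
```

The kept cells come back in the object's order, not the list's order.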

Hi, thanks so much!

8 months ago
fracarb8 ★ 1.7k

subset(seurat.obj_combined, cells = all_good_cells_rhemac10$V1)

You need to pass the barcodes to the cells argument, not to subset. From the documentation:

subset    Logical expression indicating features/variables to keep
cells, j    A vector of cell names or indices to keep
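
Note that cells expects a vector, while read.csv() returns a data frame, which is why the $V1 column is extracted. A minimal base-R sketch of that step, using inline text in place of the csv file (the three barcodes are copied from the head() output in the question; stringsAsFactors = FALSE is set explicitly because older R versions would otherwise return a factor):

```r
# Inline stand-in for master_list_good_cells_rhemac10.csv (one barcode per line).
csv_text <- "1_MI1_S3_AAACCCAAGAGACAAG-1
1_MI1_S3_AAACCCAAGAGTATAC-1
1_MI1_S3_AAACCCAAGTCCCGAC-1"

good_df <- read.csv(text = csv_text, header = FALSE, stringsAsFactors = FALSE)

# The data frame itself is not a vector of cell names; its V1 column is.
good_cells <- good_df$V1

print(class(good_df))
print(class(good_cells))
print(length(good_cells))
```

With the real file, good_cells would then be passed as subset(seurat.obj_combined, cells = good_cells), and ncol() of the result can be compared against length(good_cells) as a sanity check.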

Thank you!
