Hi everyone,
I have a Seurat object that I want to subset using a csv file of "good" cell barcodes (56,020 cells) that have already been through QC. The cell barcodes in the list have the following sample prefixes: 1_MI1_S3, 2_MI1_S4, 3_C_MI2_S1, 4_D_MI2_S2, 4_MI1_S5, 5_MI1_S6, 7_MI1_S7.
I read in the csv file of good cells as follows:
all_good_cells_rhemac10 <- read.csv("master_list_good_cells_rhemac10.csv", header = FALSE)
head(all_good_cells_rhemac10)
V1
1 1_MI1_S3_AAACCCAAGAGACAAG-1
2 1_MI1_S3_AAACCCAAGAGTATAC-1
3 1_MI1_S3_AAACCCAAGTCCCGAC-1
4 1_MI1_S3_AAACCCAAGTGAGGCT-1
5 1_MI1_S3_AAACCCACAGAAGCTG-1
6 1_MI1_S3_AAACCCACATCTATCT-1
The Seurat object I made from a raw counts matrix (69,801 cells) has cell barcodes with the same sample prefixes as the row names of its meta.data. The original identity of each cell corresponds to the sample prefix, but only to its leading number, so two samples share the value 4. (Side question: is there a way to change the orig.ident column to distinguish between the 4_D_MI2_S2 and 4_MI1_S5 samples?)
Sample names, with the corresponding orig.ident value in the Seurat object meta.data in parentheses: 1_MI1_S3 (1), 2_MI1_S4 (2), 3_C_MI2_S1 (3), 4_D_MI2_S2 (4), 4_MI1_S5 (4), 5_MI1_S6 (5), 7_MI1_S7 (7)
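For the side question, here is a minimal sketch of how a full sample label could be recovered from the barcodes themselves. It assumes every cell name follows the "<sample prefix>_<nucleotide barcode>-<number>" layout shown in this post; the toy `barcodes` vector and the names `sample_id` and `first_field` are made up for illustration. The single-number orig.ident values are most likely a result of CreateSeuratObject() taking only the first "_"-delimited field of each cell name (its defaults are names.field = 1 and names.delim = "_").

```r
# Toy barcodes in the same "<sample prefix>_<barcode>-1" layout as in the post.
barcodes <- c(
  "4_D_MI2_S2_AAACCCAAGCTAGATA-1",
  "4_MI1_S5_AAACCCAAGAGACAAG-1",
  "3_C_MI2_S1_AAACCCACAACAAGAT-1"
)

# Strip the trailing "_<barcode>-<number>" so only the sample prefix remains.
sample_id <- sub("_[ACGTN]+-[0-9]+$", "", barcodes)
print(sample_id)

# For comparison: the first "_"-delimited field alone cannot tell the
# two "4" samples apart.
first_field <- sub("_.*$", "", barcodes)
print(first_field)
```

On the real object, the same `sub()` call could be applied to `colnames(seurat.obj_combined)` and the result stored with `seurat.obj_combined$orig.ident <- sample_id` (or under a new column name, to keep the original values).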
head(seurat.obj_combined@meta.data)
orig.ident nCount_RNA nFeature_RNA
3_C_MI2_S1_AAACCCAAGCTAGATA-1 3 2133 1348
3_C_MI2_S1_AAACCCAAGCTCACTA-1 3 1443 999
3_C_MI2_S1_AAACCCACAACAAGAT-1 3 3058 1630
3_C_MI2_S1_AAACCCACACTGCGAC-1 3 1214 877
3_C_MI2_S1_AAACCCAGTCGGTAAG-1 3 1513 998
3_C_MI2_S1_AAACCCAGTTTGTGGT-1 3 2238 1397
I have tried several methods to filter the Seurat object using the subset() function, but I've been running into errors.
I tried adding the V1 column to the Seurat object meta.data before using the subset() function, but that didn't work:
seurat.obj_combined[["V1"]] <- all_good_cells_rhemac10[["V1"]]
seurat.obj_combined<-subset(seurat.obj_combined, subset = all_good_cells_rhemac10$V1)
Do the cell barcodes in the Seurat object and the csv file need to be in the same order? Is there a parameter I need to use in the subset() function in order to subset the Seurat object? There doesn't seem to be a cell barcodes parameter to use, but there is when I use WhichCells().
When I use that function, it does return a list of subsetted barcodes, but not a subsetted Seurat object, which is what I want.
WhichCells(seurat.obj_combined, cells = all_good_cells_rhemac10)
1 1_MI1_S3_AAACCCAAGAGACAAG-1
2 1_MI1_S3_AAACCCAAGAGTATAC-1
3 1_MI1_S3_AAACCCAAGTCCCGAC-1
4 1_MI1_S3_AAACCCAAGTGAGGCT-1
5 1_MI1_S3_AAACCCACAGAAGCTG-1
6 1_MI1_S3_AAACCCACATCTATCT-1
7 1_MI1_S3_AAACCCACATGGAGAC-1
8 1_MI1_S3_AAACCCAGTCAAGCGA-1
9 1_MI1_S3_AAACCCAGTCCTATAG-1
10 1_MI1_S3_AAACCCAGTCTGGTTA-1
[ reached 'max' / getOption("max.print") -- omitted 55020 rows ]
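A minimal sketch of the name-based filtering described above, on toy data rather than the real object. A plain matrix with barcodes as column names stands in for the Seurat object, and `good` stands in for the one-column data frame returned by read.csv(); the names `counts`, `good`, `good_cells`, `missing`, `keep` and `filtered`, and the "9_XX_S9" barcode, are hypothetical. It illustrates two points: the barcodes should be passed as a character vector (the V1 column) rather than as the whole data frame, and matching by name means the two sources do not need to be in the same order.

```r
# Toy counts matrix: genes in rows, cells in columns (barcodes are column names).
counts <- matrix(
  1:12, nrow = 2,
  dimnames = list(
    c("geneA", "geneB"),
    c("3_C_MI2_S1_AAACCCAAGCTAGATA-1", "1_MI1_S3_AAACCCAAGAGACAAG-1",
      "3_C_MI2_S1_AAACCCAAGCTCACTA-1", "1_MI1_S3_AAACCCAAGAGTATAC-1",
      "4_MI1_S5_AAACCCAAGTCCCGAC-1", "7_MI1_S7_AAACCCAAGTGAGGCT-1")
  )
)

# Stand-in for read.csv(..., header = FALSE): a one-column data frame (V1),
# listed in a different order from the matrix columns.
good <- data.frame(V1 = c(
  "1_MI1_S3_AAACCCAAGAGACAAG-1",
  "1_MI1_S3_AAACCCAAGAGTATAC-1",
  "3_C_MI2_S1_AAACCCAAGCTAGATA-1",
  "9_XX_S9_AAACCCACAGAAGCTG-1"  # hypothetical barcode absent from the matrix
))

# Use the column as a character vector, not the data frame itself.
good_cells <- as.character(good$V1)

# Barcodes on the good list that are not present in the data (worth checking).
missing <- setdiff(good_cells, colnames(counts))

# Match by name, so the order of the two sources does not matter.
keep <- colnames(counts)[colnames(counts) %in% good_cells]
filtered <- counts[, keep, drop = FALSE]

print(missing)
print(dim(filtered))
```

For the Seurat object itself, the equivalent last step would presumably be `subset(seurat.obj_combined, cells = keep)`, with `keep` built from `colnames(seurat.obj_combined)`, since subset() on a Seurat object accepts a `cells` argument alongside the `subset` expression.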
Thanks so much!