2.9 years ago
Aaron
I'm processing data from a PDX/barnyard experiment in Seurat. I'm trying to remove cell-cycle effects from the data by running CellCycleScoring, but I'm not sure how to pass both the mouse and the human G2/M and S phase genes into the function. This is what I've done:
CTRL_seurat_phase <- NormalizeData(CTRL)
CTRL_seurat_phase <- CellCycleScoring(CTRL, s.features=c("mouse_s_genes", "human_s_genes"), g2m.features=c("mouse_g2m_genes", "human_g2m_genes"), set.ident=TRUE)
CTRL_seurat_phase <- FindVariableFeatures(CTRL_seurat_phase, selection.method="vst", nfeatures=3000, verbose=FALSE)
CTRL_seurat_phase <- ScaleData(CTRL_seurat_phase)
CTRL_seurat_phase <- RunPCA(CTRL_seurat_phase)
DimPlot(CTRL_seurat_phase, reduction="pca", group.by="Phase", split.by="Phase")
The CellCycleScoring line returns an error. How would I go about controlling for the effects of both the mouse and human cell cycle genes?
For reference, this is how I got the cell cycle genes for human (essentially the same for mouse):
# Packages used below
library(RCurl)          # getURL()
library(AnnotationHub)  # AnnotationHub(), query()
library(ensembldb)      # genes()
library(dplyr)          # %>%, select(), left_join(), filter(), pull()
# Download the cell cycle marker table (Ensembl gene IDs with a phase label)
cc_file <- getURL("https://raw.githubusercontent.com/hbc/tinyatlas/master/cell_cycle/Homo_sapiens.csv")
human_cell_cycle_genes <- read.csv(text=cc_file)
# Connect to AnnotationHub
ah <- AnnotationHub()
# Access the Ensembl database for the organism
ahDb <- query(ah,
              pattern = c("Homo Sapiens", "EnsDb"),
              ignore.case=TRUE)
# Acquire the latest annotation files
id <- ahDb %>%
  mcols() %>%
  rownames() %>%
  tail(n=1)
# Download the appropriate Ensembl database
edb <- ah[[id]]
# Extract gene-level information from the database
annotations <- genes(edb,
                     return.type="data.frame")
# Select annotations of interest
annotations <- annotations %>%
  dplyr::select(gene_id, gene_name, seq_name, gene_biotype, description)
# Get gene names for the Ensembl IDs of each cell cycle gene
human_cell_cycle_markers <- dplyr::left_join(human_cell_cycle_genes, annotations, by = c("geneID" = "gene_id"))
# Acquire the S phase genes
human_s_genes <- human_cell_cycle_markers %>%
  dplyr::filter(phase=="S") %>%
  pull("gene_name")
# Acquire the G2/M phase genes
human_g2m_genes <- human_cell_cycle_markers %>%
  dplyr::filter(phase=="G2/M") %>%
  pull("gene_name")
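A likely cause of the error in the CellCycleScoring call is that the gene vectors are passed as quoted names: c("mouse_s_genes", "human_s_genes") is a vector of two literal strings, not the gene symbols, so nothing matches the features in the object. The call is also made on CTRL rather than on the normalized object. The following is a minimal sketch of one way to combine the lists, assuming mouse_s_genes, mouse_g2m_genes, human_s_genes and human_g2m_genes are character vectors built as above and that their names match the object's feature names. The gene symbols in the sketch are toy stand-ins, not the full marker lists.

```r
# Toy stand-ins for the gene vectors built from the tinyatlas tables
# (illustrative symbols only; use the real vectors in practice).
human_s_genes   <- c("MCM5", "PCNA")
mouse_s_genes   <- c("Mcm5", "Pcna")
human_g2m_genes <- c("TOP2A", "MKI67")
mouse_g2m_genes <- c("Top2a", "Mki67")

# Quoted names give two literal strings, not gene symbols.
quoted_s <- c("mouse_s_genes", "human_s_genes")

# Unquoted names concatenate the contents of both vectors.
s_genes   <- unique(c(mouse_s_genes, human_s_genes))
g2m_genes <- unique(c(mouse_g2m_genes, human_g2m_genes))

# Only runs when Seurat is installed and a Seurat object named CTRL exists.
if (requireNamespace("Seurat", quietly = TRUE) && exists("CTRL")) {
  CTRL_seurat_phase <- Seurat::NormalizeData(CTRL)
  CTRL_seurat_phase <- Seurat::CellCycleScoring(
    CTRL_seurat_phase,          # score the normalized object, not CTRL
    s.features   = s_genes,
    g2m.features = g2m_genes,
    set.ident    = TRUE
  )
}
```

If the scoring succeeds, the resulting S.Score and G2M.Score metadata columns can be passed to ScaleData through its vars.to.regress argument to regress out the cell-cycle signal. Whether one combined list is appropriate for a mixed-species sample is a judgment call; scoring human and mouse cells separately may be cleaner.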
Any reason you aren't just removing the mouse cells? Are they part of the experiment? How are you getting the counts? If using cellranger and their human/mouse reference, your gene names will have the genome identifier attached ("hg38" or "mm10" or whatever) which will have to be taken into account.
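To illustrate the point about genome identifiers: if the counts come from a combined human/mouse reference, the gene lists need the same prefix as the feature names before they will match. Below is a minimal sketch in base R. The helper add_genome_prefix is hypothetical, and the prefixes "GRCh38-" and "mm10-" are assumptions; the actual prefix and separator depend on the reference and on how the object was created, so check head(rownames(CTRL)) first.

```r
# Hypothetical helper: prepend a genome prefix to each gene name and keep
# only the prefixed names that are present among the object's features.
add_genome_prefix <- function(genes, prefix, features) {
  prefixed <- paste0(prefix, genes)
  intersect(prefixed, features)
}

# Toy feature names standing in for rownames(CTRL); the prefixes are assumed.
features <- c("GRCh38-TOP2A", "GRCh38-MCM5", "mm10-Top2a", "mm10-Mcm5", "mm10-Actb")

# Prefix each species' list with its own genome identifier, then combine.
human_s <- add_genome_prefix(c("MCM5", "PCNA"), "GRCh38-", features)
mouse_s <- add_genome_prefix(c("Mcm5", "Pcna"), "mm10-", features)
s_genes <- c(human_s, mouse_s)
```

The same helper can be applied to the G2/M lists. Filtering against the feature names also drops markers that are absent from the data, which avoids passing unknown genes to the scoring step.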