Apologies for the many posts this week. In Seurat, should I scale all genes for downstream analysis, or is it okay to scale only some features? I am a bit unclear when it comes to scaling. I have attached my code below.
Also, how do I know which genes are noise or confounding genes? Could you point me to an R script for identifying such genes and filtering them out? I really appreciate your help so far!
data.filt<- NormalizeData(data.filt)
#cell cycle
data.filt<- CellCycleScoring(data.filt, s.features = cc.genes$s.genes, g2m.features = cc.genes$g2m.genes, set.ident = TRUE)
VlnPlot(data.filt, features = c("S.Score", "G2M.Score"), group.by = "orig.ident",
ncol = 4, pt.size = 0.1)
data.filt<- FindVariableFeatures(data.filt, selection.method = "vst", verbose = FALSE, nfeatures = 2000)
# Scale data
all.genes<- rownames(data.filt)
# Option 1: scale only the variable features (the default when features is not given)
data.filt<- ScaleData(data.filt,
vars.to.regress = c(
"nCount_RNA","nFeature_RNA", "percent_mito",
"percent_ribo", "S.Score", "G2M.Score"
),
verbose= FALSE)
# Option 2: scale all genes
data.filt<- ScaleData(data.filt,
vars.to.regress = c(
"nCount_RNA","nFeature_RNA", "percent_mito",
"percent_ribo", "S.Score", "G2M.Score"
),
features = all.genes,
verbose= FALSE)
# PCA
data.filt<- RunPCA(data.filt, verbose = FALSE)
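To make the comparison between the two options concrete, here is a minimal base-R sketch. It does not call Seurat. It assumes that scaling is a per-gene z-score, so each gene is centred and scaled independently of the others. The toy matrix, the gene names, and the helper `zscore_rows` are made up for illustration.

```r
# Toy expression matrix: genes in rows, cells in columns (made-up values).
set.seed(1)
expr <- matrix(rnorm(5 * 8), nrow = 5,
               dimnames = list(paste0("gene", 1:5), paste0("cell", 1:8)))

# Hypothetical helper: z-score each gene (row) independently.
zscore_rows <- function(m) t(scale(t(m)))

# Hypothetical stand-in for the variable features.
variable.genes <- c("gene2", "gene4")

scaled.all <- zscore_rows(expr)                    # analogous to scaling all genes
scaled.sub <- zscore_rows(expr[variable.genes, ])  # analogous to scaling a subset

# Compare the scaled values of the variable genes under both approaches.
same <- isTRUE(all.equal(scaled.all[variable.genes, ], scaled.sub,
                         check.attributes = FALSE))
print(same)
```

If the per-gene assumption holds, the scaled values of the variable genes should match under both options. RunPCA uses the variable features by default, so the PCA input should then be the same either way; scaling all genes mainly matters for steps that need scaled values of non-variable genes, such as heatmaps.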
Thanks, António, for your kind and detailed responses. You helped clear up my doubts about scaling!
Kind Regards,
Synat