6.4 years ago
Za
Hi, I have a single-cell RNA-seq matrix with 1500 cells. I followed the procedure below, but after computing a library size I don't know how to continue with normalisation.
# filter out low-gene cells (often empty wells)
cd <- cd[, colSums(cd>0)>1.8e3]
# remove genes that don't have many reads
cd <- cd[rowSums(cd)>10, ]
# remove genes that are not seen in a sufficient number of cells
cd <- cd[rowSums(cd>0)>5, ]
# log-transform to make the data closer to normal
mat <- log10(as.matrix(cd)+1)
# library size proxy, shifted and rescaled so the largest value is 100
libSize <- colSums(cd > 0) # number of genes detected per cell as proxy
libSize <- libSize - min(libSize) + 1
libSize <- libSize / max(libSize)
libSize <- round(libSize*100)
Now I have mat as my matrix and libSize. I divided mat by libSize, but I think that is nonsense.
You can use the scale() function in R. To center the values it subtracts the column mean from each column, and to get z-scores it then divides by the column standard deviation.
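Since scale() works on columns, z-scoring each gene in a genes x cells matrix needs a transpose. A minimal sketch, assuming genes are in rows and cells in columns as in the question; the toy counts and the name z are made up for illustration:

```r
# Toy count matrix: 4 genes (rows) x 5 cells (columns), made-up values
counts <- matrix(c(0, 3, 10, 7,
                   5, 1,  0, 2,
                   8, 8,  4, 1,
                   2, 0,  6, 9,
                   1, 4,  3, 0),
                 nrow = 4,
                 dimnames = list(paste0("gene", 1:4), paste0("cell", 1:5)))
mat <- log10(counts + 1)

# scale() centers and scales each column, so transpose first to put genes
# in columns, then transpose back to genes x cells
z <- t(scale(t(mat)))
```

Each row of z then has mean 0 and standard deviation 1. A gene with identical values in every cell would give NaN, so such genes should be filtered out first.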
Sorry, that would be done when plotting the heat map, but it does not solve the problem. My problem is that I have 9 time points of single-cell RNA-seq: 8 time points are from ICELL8 and the 0-hour time point is from Fluidigm. When I plot a heat map, the 0-hour time point is very yellow and prevents the other time points from being seen. I thought maybe this sort of library-size scaling would solve the problem.
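For reference, library-size normalisation is more commonly applied to the raw counts before the log transform, using total counts per cell rather than the number of detected genes. A rough sketch under those assumptions; the toy matrix and the 10,000 scaling factor are illustrative choices, and this alone may not remove differences between the two platforms:

```r
# Toy count matrix: 4 genes (rows) x 3 cells (columns), made-up values
cd <- matrix(c( 10,  0,  5,  20,
                 2,  1,  0,   4,
               100, 40, 60, 200),
             nrow = 4,
             dimnames = list(paste0("gene", 1:4), paste0("cell", 1:3)))

# Library size here = total counts per cell (an assumption; the question
# used the number of detected genes as a proxy instead)
libSize <- colSums(cd)

# Divide each cell (column) by its library size, then rescale to
# counts per 10,000 so the values stay in a readable range
norm <- sweep(cd, 2, libSize, "/") * 1e4

# Log-transform after normalising, not before
mat <- log10(norm + 1)
```

After this step every cell has the same total in norm, so a cell sequenced more deeply no longer looks brighter across all genes.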