Hi,
I am using cancer cell line scRNA-seq data to find rare cell populations in a homogeneous cell culture. I am following the Seurat vignette for clustering. I also found the Seurat tutorial on regressing out the effect of cell cycle genes. However, I have a basic question. My current workflow is as follows:
1) Find variable genes
2) Scale data
3) Perform cell cycle scoring
4) Re-scale the data, regressing out the cell cycle effect
5) Run PCA using the variable features found in step 1
But I wonder: shouldn't I find the variable genes again after the cell cycle correction, and only then run PCA and clustering? Some of the variable genes are cell cycle genes, such as TOP2A.
Any comment will be appreciated!
edited: I think the cell cycle correction just filters some genes out of your original HVGs. You should still have enough HVGs left for clustering after filtering. If not, you could adjust the threshold in step 1 to get more initial HVGs.
OK, thank you for your explanation.
Sorry, I made a mistake in my previous statement. Actually, cell cycle scoring calculates a score for each cell based on the difference between the mean expression of the given gene list and the mean expression of reference genes (randomly selected genes matching the expression distribution of the given list). The scores are then regressed out for downstream analysis. Therefore, if you want to remove (hopefully) the cell cycle effect, you should do the HVG analysis on the corrected data.
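To make the idea concrete, here is a minimal toy sketch in plain Python. It is not Seurat's implementation. It assumes a single score instead of separate S and G2M scores, hand-picked control genes instead of expression-matched random ones, and simulated data. The genes CTRL1, CTRL2 and RARE1 and all helper function names are hypothetical.

```python
import random
from statistics import mean, pvariance


def module_score(expr, gene_set, control_set):
    """Per-cell score: mean expression of gene_set minus mean of control_set."""
    n_cells = len(next(iter(expr.values())))
    return [
        mean(expr[g][c] for g in gene_set) - mean(expr[g][c] for g in control_set)
        for c in range(n_cells)
    ]


def regress_out(values, covariate):
    """Residuals of a simple least-squares fit of values on one covariate."""
    mx, my = mean(covariate), mean(values)
    sxx = sum((x - mx) ** 2 for x in covariate)
    if sxx == 0:
        return [y - my for y in values]
    slope = sum((x - mx) * (y - my) for x, y in zip(covariate, values)) / sxx
    return [y - (my + slope * (x - mx)) for x, y in zip(covariate, values)]


def top_variable(expr, n):
    """Rank genes by variance across cells (a crude stand-in for HVG selection)."""
    return sorted(expr, key=lambda g: pvariance(expr[g]), reverse=True)[:n]


# Simulated data (assumption): 200 cells with a hidden cell cycle activity.
random.seed(0)
n_cells = 200
phase = [random.random() for _ in range(n_cells)]


def noise():
    return random.gauss(0, 0.1)


expr = {
    "TOP2A": [5 * p + noise() for p in phase],  # driven by the cell cycle
    "MKI67": [4 * p + noise() for p in phase],  # driven by the cell cycle
    "CTRL1": [1 + noise() for _ in range(n_cells)],  # hypothetical control gene
    "CTRL2": [1 + noise() for _ in range(n_cells)],  # hypothetical control gene
    # Hypothetical marker expressed only in a rare population of 10 cells.
    "RARE1": [(3 if c < 10 else 0) + noise() for c in range(n_cells)],
}

# Score each cell, then regress the score out of every gene.
score = module_score(expr, ["TOP2A", "MKI67"], ["CTRL1", "CTRL2"])
corrected = {g: regress_out(v, score) for g, v in expr.items()}

# Select variable genes before and after the correction.
hvg_before = top_variable(expr, 2)
hvg_after = top_variable(corrected, 2)
print("HVGs before correction:", hvg_before)
print("HVGs after correction: ", hvg_after)
```

In this toy data, the cell cycle genes dominate the variance ranking before the correction, while the rare-population marker rises to the top afterward. That is the reason to repeat the HVG selection on the corrected data rather than reuse the list from step 1.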