Asking for anyone's insight on this please:
I think I need to hear this explained from several angles before it settles in my mind.
Background: I am analyzing a publicly available dataset of human fetal pancreas development covering weeks 12 through 22. I am currently looking at one time point (week 12). I used Seurat to cluster the 644 cells from week 12 into 25 clusters (potentially different cell types). I am now interested in what mechanistic insights can be gleaned from a single cluster (cell type) out of the 25.
What I did: I subset the data to the cluster of interest (say, the 35 cells that make up cluster 3). I then log-normalized the UMI counts from cluster 3 with a scale factor of 10,000 and extracted the top 1000 variable features (genes). Next, I transposed the resulting 1000-gene by 35-cell matrix and computed a Pearson correlation matrix on it, giving a 1000-gene by 1000-gene correlation matrix. Finally, I generated a heatmap-dendrogram of that matrix.
I also generated the same kind of heatmap-dendrogram from all 644 cells of the week 12 time point (rather than just the 35 cells of cluster 3).
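To make the steps above concrete, here is a minimal sketch of the normalize-then-correlate part on a tiny made-up count table (4 genes by 5 cells, not the real data). It is written in plain Python rather than Seurat/R; the function names are hypothetical, the variable-feature selection step is left out, and the normalization assumes the usual log1p(count / cell total * 10,000) form.

```python
import math

def log_normalize(counts, scale_factor=10_000):
    """Per-cell library-size normalization followed by log1p.

    counts: list of gene rows, each a list of per-cell UMI counts.
    """
    n_cells = len(counts[0])
    totals = [sum(row[c] for row in counts) for c in range(n_cells)]
    return [
        [math.log1p(row[c] / totals[c] * scale_factor) for c in range(n_cells)]
        for row in counts
    ]

def pearson(x, y):
    """Pearson correlation of two equal-length vectors (NaN if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def gene_gene_correlation(expr):
    """expr is genes x cells; each entry [i][j] correlates gene i with gene j
    across the cells, so the result is genes x genes."""
    return [[pearson(gi, gj) for gj in expr] for gi in expr]

# Made-up UMI counts: rows are genes, columns are the cells of one cluster.
counts = [
    [10, 20, 30, 40, 50],  # geneA
    [12, 18, 33, 41, 48],  # geneB, rises with geneA
    [50, 40, 30, 20, 10],  # geneC, falls as geneA rises
    [5, 5, 5, 5, 5],       # geneD, flat raw counts
]

expr = log_normalize(counts)
corr = gene_gene_correlation(expr)  # 4 x 4 gene-gene matrix
```

Each entry of `corr` is computed over the cells only, so with 35 cells every one of the 1000 x 1000 coefficients rests on 35 paired observations.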
I'm planning to loop through the dataset and generate this heatmap-dendrogram for each cluster, in the hope of gleaning cluster-specific (cell-type-specific) gene coregulation for gene regulatory network (GRN) analysis. What do you think about this approach?
My question: What is the difference between a heatmap-dendrogram of a gene-gene correlation matrix generated from a single cell type (one cluster, i.e. the 35 cells of cluster 3) and one generated from multiple cell types all at once (all clusters together, i.e. all 644 cells of week 12)?
More specifically: what information does each one provide?
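To illustrate the comparison I am asking about, here is a toy sketch with made-up numbers (not from the dataset; the gene values and the `pearson` helper are hypothetical). Two genes are perfectly anti-correlated within each of two clusters, but both are higher in the second cluster, so the correlation computed on the pooled cells comes out strongly positive.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up expression of two genes in two clusters of four cells each.
gene_x_a, gene_y_a = [1, 2, 3, 4], [4, 3, 2, 1]              # cluster A: low level
gene_x_b, gene_y_b = [11, 12, 13, 14], [14, 13, 12, 11]      # cluster B: high level

r_a = pearson(gene_x_a, gene_y_a)  # within cluster A: negative
r_b = pearson(gene_x_b, gene_y_b)  # within cluster B: negative
r_pooled = pearson(gene_x_a + gene_x_b, gene_y_a + gene_y_b)  # all cells: positive

print(r_a, r_b, r_pooled)
```

In this toy case the pooled coefficient mostly reflects the difference in mean expression between the two clusters, while the per-cluster coefficients reflect cell-to-cell variation inside one cluster. Whether the real data behave this way is exactly what I am unsure about.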
I have included low resolution images of both heatmap-dendrograms:
Heatmap-dendrogram of top 1000 variable features by top 1000 variable features from cluster 3
Heatmap-dendrogram of top 1000 variable features by top 1000 variable features from all clusters in week 12 time point
I may be misunderstanding something here, such as what actually happens when I convert the genes-by-cells table into a gene-gene Pearson correlation matrix.
Very Respectfully, Pratik