I am not sure I understand the difference between these two functions or, more precisely, the idea behind their procedures.
When running FindVariableFeatures, I try to identify the genes that show high cell-to-cell variation. Later in the analysis I also run the FindAllMarkers function, which identifies marker genes for each cluster by calculating the differential expression between the clusters.
I had assumed that the candidate marker genes for each cluster would overlap with the genes identified in the first step as variable features, since that should be the idea behind it: genes with high variation between cell groups should also be specific to a certain cluster and therefore appear in both lists.
But in my data, when I compare the two lists, some markers identified for certain clusters are not in the list of highly variable genes (HVGs).
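Roughly, the comparison looks like this in R. This is only a sketch with toy gene names as placeholders; in the actual analysis the two vectors would come from VariableFeatures() on the Seurat object and from the gene column of the FindAllMarkers() output.

```r
# Toy stand-ins (hypothetical gene names). In a real analysis these would be
# VariableFeatures(seurat_obj) and the `gene` column of the FindAllMarkers() result.
hvg_genes    <- c("GeneA", "GeneB", "GeneD")
marker_genes <- c("GeneA", "GeneC", "GeneD")

# Marker genes that are not among the highly variable genes
missing_from_hvg <- setdiff(marker_genes, hvg_genes)
print(missing_from_hvg)
```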
I found this out when running DoHeatmap on the top features from my list of marker genes. This throws an error telling me that some of the genes in my top marker list were not found in the scale.data slot of my Seurat object. This slot is calculated on the genes identified as HVGs (2000 by default, if not stated otherwise).
What am I missing here? Why do these two lists not overlap completely? How can it be that I have significant marker genes for a specific cluster that were not identified as highly variable genes?
Thanks for clarifying this.
Assa