Hi.
I followed the standard Seurat workflow on my data and noticed a cluster with low UMI counts. When I ran differential expression (DE) analysis comparing this cluster to all other clusters, many of the top genes turned out to be expressed in almost every cell in the dataset. The cluster expresses those genes at a higher level, but the pct.1 and pct.2 values (the fraction of cells expressing the gene inside and outside the cluster) are nearly the same, so none of them is a specific marker for this cluster.
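To make the pct.1 / pct.2 comparison concrete, here is a minimal toy sketch in plain Python (not Seurat code). The gene names, counts, cell indices, and function names are all made up for illustration. It only shows how a gene can be detected in every cell of both groups (pct.1 ≈ pct.2) while still having higher counts in the cluster.

```python
# Toy illustration only: plain Python, not Seurat code.
# Gene names, counts, and function names are hypothetical.

def pct_expressing(counts, cells):
    """Fraction of the given cells with a nonzero count for one gene."""
    return sum(1 for c in cells if counts[c] > 0) / len(cells)

def pct_table(expression, group1, group2):
    """Return {gene: (pct1, pct2, pct1 - pct2)} for two groups of cells."""
    table = {}
    for gene, counts in expression.items():
        pct1 = pct_expressing(counts, group1)
        pct2 = pct_expressing(counts, group2)
        table[gene] = (pct1, pct2, pct1 - pct2)
    return table

# One list of per-cell counts per gene.
# Cells 0-3 play the role of the cluster, cells 4-9 are everything else.
expression = {
    "ubiquitous_gene": [5, 6, 4, 7, 1, 2, 1, 1, 2, 1],  # detected everywhere
    "specific_gene":   [3, 4, 2, 5, 0, 0, 0, 1, 0, 0],  # mostly in the cluster
}
cluster_cells = [0, 1, 2, 3]
other_cells = [4, 5, 6, 7, 8, 9]

table = pct_table(expression, cluster_cells, other_cells)
for gene, (pct1, pct2, diff) in table.items():
    print(f"{gene}: pct.1={pct1:.2f} pct.2={pct2:.2f} diff={diff:.2f}")
```

In this toy data the "ubiquitous" gene has identical detection rates in both groups even though its counts are higher in the cluster, which is the pattern I see for the top genes in my DE result.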
To address this, I set a minimum threshold on nCount_RNA (UMI count per cell) and removed the low-count cells from the data. Then I repeated the Seurat workflow from the start.
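The filtering step amounts to something like the following toy sketch (plain Python rather than Seurat; the cell names, counts, the `MIN_UMI` value, and the function name are placeholders, not my actual data or threshold):

```python
# Toy illustration only: plain Python, not Seurat code.
# Cell names, counts, and the threshold are hypothetical placeholders.

def filter_cells_by_umi(cell_counts, min_umi):
    """Keep only cells whose total UMI count is at least min_umi."""
    return {
        cell: genes
        for cell, genes in cell_counts.items()
        if sum(genes.values()) >= min_umi
    }

# Per-cell gene counts; the sum over genes stands in for nCount_RNA.
cell_counts = {
    "cell_a": {"g1": 300, "g2": 450},  # total 750
    "cell_b": {"g1": 20, "g2": 35},    # total 55, low-UMI cell
    "cell_c": {"g1": 500, "g2": 120},  # total 620
}

MIN_UMI = 500  # placeholder threshold
kept = filter_cells_by_umi(cell_counts, MIN_UMI)
print(sorted(kept))
```

Everything downstream (normalization, variable features, PCA, clustering) was then rerun on the remaining cells only.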
However, a similar cluster formed again, expressing almost the same genes, with pct.1 and pct.2 still very close.
I am confident that those cells were removed before the second round of clustering, yet a cluster expressing the same genes, most of which are lncRNA genes, still forms.
This cluster keeps reappearing, and removing cells does not seem to prevent it.
Should I regress out the top genes from the DE result?