Suppose I have data from a pilot gene expression study, i.e. normalized gene counts from an RNA-seq experiment. There are multiple treatment groups (control, knockout1, knockout2, etc.) but no replicates, so each treatment is a single sample. I can easily cluster the treatments to see which are similar and which are different, but what's the best way to find biomarker genes that separate certain treatments from each other?
I've tried PCA, but the loadings are not sparse enough: nearly every gene gets a nonzero weight. I could mine the loadings by thresholding them, but there is some indication that this is not effective (see Zou et al. 2006). Sparse PCA seems like it could work, but I can't find much on how it behaves with very small sample sizes (no replicates). Would sparse PCA as implemented in the elasticnet R package work?
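To make the question concrete, here is a minimal sketch of the kind of sparse loading vector I am after. It is not the algorithm from the elasticnet package; it is a simple rank-one power iteration with soft-thresholding, written in plain Python. The toy matrix, the gene names, and the `frac` threshold parameter are all made up for illustration.

```python
import math

def center_columns(X):
    """Subtract each gene's mean across samples (rows = samples, columns = genes)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]

def soft_threshold(z, lam):
    """Shrink every entry toward zero by lam; entries smaller than lam become zero."""
    return [math.copysign(max(abs(a) - lam, 0.0), a) for a in z]

def normalize(v):
    """Scale v to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v] if norm > 0 else v

def sparse_pc1(X, frac=0.5, n_iter=50):
    """First sparse loading vector via soft-thresholded power iteration.

    frac is a hypothetical tuning knob: loadings whose score is below
    frac * (largest absolute score) are set to zero at every iteration.
    """
    Xc = center_columns(X)
    n, p = len(Xc), len(Xc[0])
    # Start from the single gene with the largest spread across samples.
    col_ss = [sum(Xc[i][j] ** 2 for i in range(n)) for j in range(p)]
    start = col_ss.index(max(col_ss))
    v = [1.0 if j == start else 0.0 for j in range(p)]
    for _ in range(n_iter):
        # Sample scores for the current loadings, scaled to unit length.
        u = normalize([sum(Xc[i][j] * v[j] for j in range(p)) for i in range(n)])
        # Per-gene scores, then shrink small ones to exactly zero.
        z = [sum(Xc[i][j] * u[i] for i in range(n)) for j in range(p)]
        lam = frac * max(abs(a) for a in z)
        v = normalize(soft_threshold(z, lam))
    return v

# Toy data: one sample per treatment, five hypothetical genes.
genes = ["geneA", "geneB", "geneC", "geneD", "geneE"]
X = [
    [10.0, 10.0, 5.0, 5.1, 5.0],  # control
    [20.0,  0.0, 5.1, 5.0, 5.0],  # knockout1
    [10.0, 10.0, 5.0, 5.0, 5.1],  # knockout2
    [10.0, 10.0, 5.1, 5.1, 5.0],  # knockout3
]

loadings = sparse_pc1(X)
selected = [g for g, w in zip(genes, loadings) if w != 0]
print(selected)
```

On this toy matrix only the two genes that change in knockout1 keep nonzero loadings. What I can't tell is whether that behaviour can be trusted on real data with one sample per treatment, where the threshold has to be picked without any estimate of within-group variance.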
All of these results are purely hypothesis-generating. No p-values or formal inference are needed. The primary goal is to figure out which genes' expression distinguishes the treatments from each other.
-Ted