I am working with RNA-seq data from three groups: controls, patients who responded successfully to treatment (paired samples taken before and after treatment), and patients who did not respond to that treatment (samples taken only once it was known that they had not responded). I would like to propose genes that could predict whether or not a patient responds to treatment.
I do not know which workflow I should follow to achieve this, or which R packages could help me. I have identified the differentially expressed genes between responders (after treatment) and non-responders, but I do not find it appropriate to simply pick the most significantly deregulated genes, which may have little biological meaning or be poorly interconnected. I have also performed GSEA, so I know which pathways are altered, but I would rather not choose pathways arbitrarily just because I find them interesting. I have read a bit about prediction analysis for microarrays (PAM) and found the nearest shrunken centroid method, which identifies the genes that best characterize each response group (Tibshirani et al., PNAS 2002). Do you think something like this could be useful in my case? Which R package could I use to run this analysis and also test its sensitivity, specificity, and accuracy? Are there any alternative approaches?
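To make the shrunken-centroid idea concrete, here is a minimal sketch of what I understand the method to do, written in plain Python purely for illustration (what I am really after is the R equivalent). Everything in it is an assumption on my part: the data are simulated rather than my samples, the names `fit_shrunken_centroids`, `predict`, and `loocv_metrics` and the fixed threshold `delta=2.0` are made up, and it is a simplified variant that may not match the published PAM procedure exactly.

```python
import math
import random
from statistics import mean, median


def fit_shrunken_centroids(X, y, delta):
    """Simplified nearest-shrunken-centroid fit (illustrative, not the reference PAM code).

    X: list of samples, each a list of per-gene expression values.
    y: list of class labels. delta: soft-threshold applied to standardized centroids.
    """
    n, p = len(X), len(X[0])
    classes = sorted(set(y))
    members = {k: [X[i] for i in range(n) if y[i] == k] for k in classes}
    overall = [mean(row[g] for row in X) for g in range(p)]
    centroids = {k: [mean(r[g] for r in rows) for g in range(p)]
                 for k, rows in members.items()}
    # Pooled within-class standard deviation for each gene.
    s = []
    for g in range(p):
        ss = sum((r[g] - centroids[k][g]) ** 2
                 for k, rows in members.items() for r in rows)
        s.append(math.sqrt(ss / (n - len(classes))))
    s0 = median(s)  # offset that damps genes with very small variance
    shrunken, selected = {}, set()
    for k, rows in members.items():
        m_k = math.sqrt(1 / len(rows) - 1 / n)
        shrunken[k] = []
        for g in range(p):
            scale = m_k * (s[g] + s0)
            d = (centroids[k][g] - overall[g]) / scale
            d_shrunk = math.copysign(max(abs(d) - delta, 0.0), d)  # soft threshold
            if d_shrunk != 0.0:
                selected.add(g)  # gene still separates this class from the rest
            shrunken[k].append(overall[g] + scale * d_shrunk)
    return {"shrunken": shrunken,
            "scale": [v + s0 for v in s],
            "priors": {k: len(rows) / n for k, rows in members.items()},
            "selected": sorted(selected)}


def predict(model, x):
    """Assign x to the class with the nearest shrunken centroid (prior-corrected)."""
    def score(k):
        dist = sum(((x[g] - c) / model["scale"][g]) ** 2
                   for g, c in enumerate(model["shrunken"][k]))
        return dist - 2 * math.log(model["priors"][k])
    return min(model["shrunken"], key=score)


def loocv_metrics(X, y, delta, positive):
    """Leave-one-out estimates of sensitivity, specificity, and accuracy."""
    tp = tn = fp = fn = 0
    for i in range(len(X)):
        model = fit_shrunken_centroids(X[:i] + X[i + 1:], y[:i] + y[i + 1:], delta)
        pred = predict(model, X[i])
        if y[i] == positive:
            tp += pred == positive
            fn += pred != positive
        else:
            tn += pred != positive
            fp += pred == positive
    return {"sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "accuracy": (tp + tn) / len(X)}


# Simulated toy data (NOT real samples): 10 per group, 20 genes, genes 0-2 carry signal.
rng = random.Random(0)
X, y = [], []
for label in ("responder", "non_responder"):
    for _ in range(10):
        row = [rng.gauss(0, 1) for _ in range(20)]
        if label == "responder":
            for g in range(3):
                row[g] += 4.0
        X.append(row)
        y.append(label)

model = fit_shrunken_centroids(X, y, delta=2.0)
metrics = loocv_metrics(X, y, delta=2.0, positive="responder")
print("genes kept:", model["selected"])
print(metrics)
```

In this sketch the threshold is fixed by hand; as far as I understand, it would normally be chosen by cross-validation, and the input would be normalized, log-transformed expression values rather than raw counts. The genes that survive the shrinkage are the ones I would consider as candidate predictors.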
Thank you so much in advance!