Hello all and merry Christmas!
I have expression profiles for 27 patients (10 Relapse and 17 non-Relapse). Based on GSEA, I found that two pathways (tgfb and mtor) are overrepresented in Relapse cases.
I used mouse models for each of these two pathways to build a classifier per pathway, then applied the classifiers back to my patients to see which model/pathway stratifies them better.
My question is: which statistical approach is best for showing that, for example, a combination of both classifiers works better? Do you think using positive predictive value (PPV) and negative predictive value (NPV) is good? Or Fisher's exact test?
Many thanks for any help!
Thank you @Kevin for your time and response.
Here, I don't have any training or test set. In fact, I ran PAMR to find the best gene subset in my mouse model data that could classify them. Then I used NTP in GenePattern to see whether this gene list from mouse can classify my patients (27 patients: 10 Relapse, 17 non-Relapse).
I have 3 different mouse models. For each model, I followed the process described above. Each mouse model predicted the patient data differently.
For example, the mouse model with an activating mutation in tgfb correctly predicted 8 of the Relapse cases and 11 of the non-Relapse cases. The other mouse model correctly predicted 7 and 12, respectively, and the combination of both models correctly predicted 7 and 14, respectively.
Now I want to show statistically that, for example, the combined model is better balanced in terms of sensitivity and specificity. What is the best way to show this?
Many thanks for your help!!
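As a starting point, the counts quoted above can be turned into sensitivity, specificity, PPV and NPV for each model. The sketch below is only an illustration: it assumes that "correctly predicted" means true positives (Relapse) and true negatives (non-Relapse), and that all 27 patients received a call from every model. The labels `tgfb`, `other` and `combined` and the helper `metrics` are names made up for this example.

```python
N_RELAPSE, N_NON_RELAPSE = 10, 17

# (correct Relapse calls, correct non-Relapse calls) as quoted in the post.
# Assumption: every patient was assigned to one of the two classes.
models = {
    "tgfb": (8, 11),
    "other": (7, 12),
    "combined": (7, 14),
}

def metrics(tp, tn, n_pos=N_RELAPSE, n_neg=N_NON_RELAPSE):
    """Derive the usual 2x2-table metrics from the correct-call counts."""
    fn = n_pos - tp  # Relapse cases called non-Relapse
    fp = n_neg - tn  # non-Relapse cases called Relapse
    sens = tp / n_pos
    spec = tn / n_neg
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "balanced_accuracy": (sens + spec) / 2,
    }

results = {name: metrics(tp, tn) for name, (tp, tn) in models.items()}
for name, m in results.items():
    print(name, {k: round(v, 3) for k, v in m.items()})
```

Balanced accuracy (the mean of sensitivity and specificity) is one simple way to summarise "balance" in a single number, but with only 27 patients the differences between models rest on very few cases.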
Generally you will want the model with the lowest error and the highest 'predictive' potential, which can be gauged by looking at the various metrics that I have listed.
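For the Fisher's exact test raised in the original question, a minimal sketch using only the Python standard library is shown below. It computes a one-sided p-value for each model's 2x2 table (predicted class vs. true class) from the hypergeometric distribution. The counts are the ones quoted in the thread; the function name `fisher_one_sided` and the model labels are hypothetical, and the sketch assumes every patient received a call. Note that this tests whether each classifier's calls are associated with relapse status; it does not by itself compare one classifier against another.

```python
from math import comb

N_RELAPSE, N_NON_RELAPSE = 10, 17

def fisher_one_sided(tp, tn, n_pos=N_RELAPSE, n_neg=N_NON_RELAPSE):
    """One-sided Fisher's exact p-value: probability of seeing at least
    `tp` true Relapse cases among the patients called Relapse, if calls
    were unrelated to the true class (hypergeometric upper tail)."""
    fp = n_neg - tn
    n_called_pos = tp + fp  # patients the model called Relapse
    total = n_pos + n_neg
    upper = min(n_pos, n_called_pos)
    tail = sum(
        comb(n_pos, k) * comb(n_neg, n_called_pos - k)
        for k in range(tp, upper + 1)
    )
    return tail / comb(total, n_called_pos)

# (correct Relapse calls, correct non-Relapse calls) as quoted in the thread
models = {"tgfb": (8, 11), "other": (7, 12), "combined": (7, 14)}
p_values = {name: fisher_one_sided(tp, tn) for name, (tp, tn) in models.items()}
for name, p in p_values.items():
    print(f"{name}: one-sided p = {p:.4f}")
```

A formal comparison between two classifiers applied to the same patients would need the per-patient calls, which are not given in the thread.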
Thanks Kevin. Can I use ROC!? How?
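On the ROC question: a ROC curve needs a continuous score per patient (for example, a distance or probability produced by the classifier), not just the final Relapse/non-Relapse call. The sketch below shows the mechanics with a threshold sweep and a trapezoidal AUC, written with the standard library only. The labels and scores are toy values invented purely for illustration; they are not the patient data, and the function names are hypothetical.

```python
def roc_curve_points(labels, scores):
    """Sweep a threshold over the scores (high score = predicted Relapse)
    and return the (false positive rate, true positive rate) points."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= thr)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= thr)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the ROC points by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Toy example only (1 = Relapse, 0 = non-Relapse); NOT the patient data.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
toy_points = roc_curve_points(toy_labels, toy_scores)
toy_auc = auc(toy_points)
print(toy_points)
print(round(toy_auc, 3))
```

If only the binary calls are available, each model contributes a single point (1 - specificity, sensitivity) to the ROC plane, and the trapezoidal area through that one point equals the mean of sensitivity and specificity.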