Hello there, I'm using randomForest, and I understand that the Mean Decrease Gini output ranks the variables by importance. However, my values are tiny, ranging from 0.043 down to 0.003.
Does it make sense to set the Gini cutoff at > 0 to identify the top genes or metabolites, for example?
My other question: does it even make sense to use RF for a very simple classification (e.g. gender only) with very few samples (e.g. n = 16)?
Thanks!
Great answer! Thank you! In my case, the data do indeed have few samples but lots of observations, and the task is a simple check for class differences (gender). I've compared RF to linear regression and some other supervised methods and found it to be very conservative if I go by the Gini scores (perhaps 200 variables just barely above 0). I'm looking to identify the top differentially expressed metabolites.
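To make the "cutoff at > 0" question concrete, here is a minimal toy sketch in plain Python. It is not what randomForest computes internally: it only looks at the best single-threshold split of one variable, on an assumed balanced 8 vs. 8 design with n = 16. The function names (`gini`, `best_split_decrease`, `null_decreases`) are hypothetical helpers made up for this illustration. The idea is to see how large a Gini decrease pure noise can reach, so that a cutoff can be compared against a null distribution rather than against 0.

```python
import random


def gini(labels):
    """Gini impurity of a list of binary (0/1) class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split_decrease(values, labels):
    """Largest Gini impurity decrease over all single-threshold splits
    of one variable (toy stand-in, not randomForest's MeanDecreaseGini)."""
    n = len(labels)
    order = sorted(range(n), key=lambda i: values[i])
    sorted_values = [values[i] for i in order]
    sorted_labels = [labels[i] for i in order]
    parent = gini(sorted_labels)
    best = 0.0
    for k in range(1, n):
        # A threshold can only sit between two distinct values.
        if sorted_values[k] == sorted_values[k - 1]:
            continue
        left, right = sorted_labels[:k], sorted_labels[k:]
        child = (k / n) * gini(left) + ((n - k) / n) * gini(right)
        best = max(best, parent - child)
    return best


def null_decreases(labels, n_draws=1000, seed=0):
    """Best-split decreases for pure-noise variables unrelated to the labels."""
    rng = random.Random(seed)
    decreases = []
    for _ in range(n_draws):
        noise = [rng.gauss(0.0, 1.0) for _ in labels]
        decreases.append(best_split_decrease(noise, labels))
    return decreases


# Assumed design: 16 samples, 8 per class (e.g. gender).
labels = [0] * 8 + [1] * 8

null = null_decreases(labels)
frac_positive = sum(d > 0 for d in null) / len(null)
threshold = sorted(null)[int(0.95 * len(null))]  # empirical 95th percentile

print("fraction of noise variables with decrease > 0:", frac_positive)
print("95th percentile of the noise decreases:", threshold)
```

In this toy setting a noise variable always finds some split with a positive decrease, so "> 0" on its own does not separate signal from noise. Comparing each variable's score against a noise or permuted-label distribution (or using permutation-based importance) seems a safer way to pick top metabolites, with the caveat that the real forest-level scores are aggregated differently from this single-split example.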