6.3 years ago
AQ7
Good morning everyone,
I'm pretty new to decision tree models and I am trying to use them with metagenomic 16S data. My data look like this:
> str(data)
'data.frame': 156 obs. of 71 variables:
$ Class : Factor w/ 2 levels "cluster1","cluster2": 1 2 1 2 1 1 1 2 1 1 ...
$ Roseburia : num 0.738 1.228 5.414 0.468 0.232 ...
$ Anaerostipes : num 0.5978 0.0101 0.6828 0.3118 0 ...
$ g__.Ruminococcus. : num 0.330751 0 0.000615 0.114379 0.091286 ...
$ g__.Ruminococcus.torques : num 3.399 0.438 0.08 1.324 0.037 ...
$ Ruminococcaceae : num 48.16 4.31 16.84 37.66 9.07 ...
$ Faecalibacteriumprausnitzii: num 0.907 46.443 2.885 17.729 3.378 ...
$ Coprococcus : num 0.8083 0.0705 0.0258 0.5133 0 ...
$ Dehalobacterium : num 0 0 0 0 0 ...
$ Dialister : num 7.077 0.601 14.224 9.346 2.869 ...
$ Acidaminococcus : num 0.29891 0 0.00123 0 0 ...
$ Coriobacteriaceae : num 0.1132 0.00252 0.02214 0.60866 0.07402 ...
$ Streptococcus : num 0.7499 0 0.0418 0.3622 0.8536 ...
$ Eggerthellalenta : num 0.09197 0 0.00677 0.07625 2.22293 ...
$ Adlercreutzia : num 0 0 0.00123 0.06808 0 ...
$ Collinsellaaerofaciens : num 0.01061 0.01258 0.05782 0.33905 0.00493 ...
$ Actinomyces : num 0.11674 0 0.00246 0.00817 0.00987 ...
$ Bifidobacteriumlongum : num 0.3219 0.0654 0.1107 0.4385 0.2566 ...
$ Atopobium : num 0.373 0 0 0 0 ...
$ Turicibacter : num 0.40327 0.00755 0.00123 0.33905 0.27632 ...
and so on. I'm running the following commands:
library("RWeka")
data <- read.table("Provadecision.txt", header = TRUE)
DecisionTree <- J48(Class ~., data = data)
DecisionTree
summary(DecisionTree)
if(require("party", quietly = TRUE)) plot(DecisionTree)
or
library("rpart")
library("rpart.plot")
binary.model<-rpart(Class ~., data = data)
rpart.plot(binary.model)
But I get completely different results from these two methods. I am pretty confused about it: which one is the right one for my situation? Could anyone please suggest why I am getting two different results?
Thanks a lot,
Andrea
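You're getting different results because you're using different algorithms. Weka's J48 is an implementation of the C4.5 decision tree algorithm, while the rpart package implements the older CART algorithm. The two differ in how they choose splits (C4.5 uses the information gain ratio, whereas rpart defaults to the Gini index for classification) and in how they prune, so they can grow different trees from the same data. Check the docs of the package you're using. As for which one you should use: you need some way of evaluating and comparing the results, such as cross-validated accuracy. Otherwise, there's no way to decide.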
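One way to make that comparison concrete is k-fold cross-validation, with both models scored on identical folds. The following is a minimal sketch, not a definitive recipe. `cv_accuracy` is a hypothetical helper written for this post, and two iris species stand in for the real table so the block runs on its own (that substitution is an assumption, swap in your own `data` with its `Class` column). The J48 part only runs if RWeka and its Java dependency are installed.

```r
library(rpart)

# Stand-in data (assumption): two iris species with the label column renamed
# to Class. Replace with the real table, e.g.
#   data <- read.table("Provadecision.txt", header = TRUE)
data <- droplevels(iris[iris$Species != "setosa", ])
names(data)[names(data) == "Species"] <- "Class"

# Hypothetical helper: k-fold cross-validated accuracy for any model.
#   fit_fun(train)        must return a fitted model
#   pred_fun(model, test) must return predicted class labels
cv_accuracy <- function(fit_fun, pred_fun, data, k = 10, seed = 1) {
  set.seed(seed)  # same seed => same folds for every model being compared
  folds <- sample(rep(seq_len(k), length.out = nrow(data)))
  correct <- 0
  for (i in seq_len(k)) {
    train <- data[folds != i, , drop = FALSE]
    test  <- data[folds == i, , drop = FALSE]
    model <- fit_fun(train)
    pred  <- pred_fun(model, test)
    correct <- correct + sum(as.character(pred) == as.character(test$Class))
  }
  correct / nrow(data)  # fraction of held-out samples classified correctly
}

# CART via rpart
acc_rpart <- cv_accuracy(
  fit_fun  = function(d) rpart(Class ~ ., data = d, method = "class"),
  pred_fun = function(m, d) predict(m, newdata = d, type = "class"),
  data     = data
)
cat("rpart (CART) CV accuracy:", round(acc_rpart, 3), "\n")

# C4.5 via Weka's J48, only if RWeka (and Java) is available
if (requireNamespace("RWeka", quietly = TRUE)) {
  acc_j48 <- cv_accuracy(
    fit_fun  = function(d) RWeka::J48(Class ~ ., data = d),
    pred_fun = function(m, d) predict(m, newdata = d),
    data     = data
  )
  cat("J48 (C4.5) CV accuracy:", round(acc_j48, 3), "\n")
}
```

With only 156 samples, a single accuracy number is noisy, so it may be worth repeating this with several seeds before preferring one tree over the other.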