Hi everyone, I'm trying to use GOSemSim, but I'm struggling with its results. I started with the mgeneSim function and passed it a vector of 11,000+ Entrez gene IDs. It gave me a similarity matrix in which some rows and columns contain only the value "1". I think this happens because, after mapping the Entrez IDs to GO, I noticed that the GO ID sets of some pairs of different genes have a GO ID in common.
To solve this problem, I tried to build the similarity matrix without mgeneSim, filling each entry with the output of mgoSim for the corresponding pair of genes. I estimate that building the matrix this way would take about 30 days, while mgeneSim needs only a couple of hours.
When I give mgeneSim and mgoSim the same parameters (measure = "Wang", combine = "BMA"), the results are different. Do you know why?
How can I get consistent results from mgeneSim? Is it possible to ignore the GO IDs that two genes share?
A little example:
mgeneSim(c("3613", "83541", "5651", "23492", "157310"), semData = hsGO, measure = "Wang", combine = "BMA", verbose = TRUE)
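For anyone who wants to reproduce the comparison described in the question, here is a minimal sketch for a single pair of genes. It rests on several assumptions that are not stated in the thread: that hsGO was built with godata() from org.Hs.eg.db, that the ontology is "MF", and that the GO terms were mapped with AnnotationDbi::select(). One possible source of differences (a guess, not something confirmed in this thread) is that geneSim and mgeneSim have a drop argument that defaults to "IEA", so they may work on a smaller set of GO terms than a manual mapping gives; the sketch passes drop = NULL to keep all evidence codes.

```r
library(GOSemSim)
library(org.Hs.eg.db)

# Assumption: the "MF" ontology; the question does not say which one was used.
# computeIC = FALSE is enough here because the Wang measure does not use IC.
hsGO <- godata("org.Hs.eg.db", ont = "MF", computeIC = FALSE)

g1 <- "3613"
g2 <- "83541"

# Map both genes to GO terms and keep only the terms from the chosen ontology.
anno <- AnnotationDbi::select(org.Hs.eg.db, keys = c(g1, g2),
                              columns = "GO", keytype = "ENTREZID")
anno <- anno[!is.na(anno$ONTOLOGY) & anno$ONTOLOGY == "MF", ]
go1 <- unique(anno$GO[anno$ENTREZID == g1])
go2 <- unique(anno$GO[anno$ENTREZID == g2])

# Similarity computed from the manually mapped term sets.
sim_terms <- mgoSim(go1, go2, semData = hsGO, measure = "Wang", combine = "BMA")

# Similarity computed by the gene-level function, keeping all evidence codes.
sim_genes <- geneSim(g1, g2, semData = hsGO, measure = "Wang",
                     drop = NULL, combine = "BMA")

print(sim_terms)
# geneSim returns a list when both genes are annotated, NA otherwise.
if (is.list(sim_genes)) print(sim_genes$geneSim)
```

If the two printed values agree with drop = NULL but not with the default, the evidence-code filter would be the likely explanation for the mismatch.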
Now it works! I used ontology "MF", and that was the cause of my problems. I don't know why I didn't try it before asking here. Now I'll try to understand the differences between the ontologies and why they give different values. Thank you so much!
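To see how much the ontology matters, the same call can be repeated once per ontology. This is only a sketch: it assumes the semantic data comes from godata() with org.Hs.eg.db, and the name sim_by_ont is made up for illustration.

```r
library(GOSemSim)
library(org.Hs.eg.db)

genes <- c("3613", "83541", "5651", "23492", "157310")

# Build one semantic-data object per ontology and compute the gene similarity
# matrix for each. computeIC = FALSE is sufficient for the Wang measure.
sim_by_ont <- lapply(c(BP = "BP", MF = "MF", CC = "CC"), function(ont) {
  sem_data <- godata("org.Hs.eg.db", ont = ont, computeIC = FALSE)
  mgeneSim(genes, semData = sem_data, measure = "Wang",
           combine = "BMA", verbose = FALSE)
})

# Print the three matrices one after another to compare them.
print(sim_by_ont)
```

Each gene is annotated with different terms in each ontology, so the three matrices are not expected to match, and a gene with no annotation in a given ontology may be missing from that matrix.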
Even if you use
, it will work. To run with gene symbols, you need to use keytype = "SYMBOL"
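A minimal sketch of the gene-symbol variant, assuming that keytype is passed to godata() when the semantic data is built. The "MF" ontology and the three symbols below are picked only for illustration; they are not from this thread.

```r
library(GOSemSim)
library(org.Hs.eg.db)

# keytype tells godata() which identifier type the gene annotation is keyed by.
hsGO_symbol <- godata("org.Hs.eg.db", keytype = "SYMBOL",
                      ont = "MF", computeIC = FALSE)

# Hypothetical input: gene symbols instead of Entrez IDs.
sim_symbol <- mgeneSim(c("TP53", "BRCA1", "EGFR"), semData = hsGO_symbol,
                       measure = "Wang", combine = "BMA", verbose = FALSE)
print(sim_symbol)
```

The identifiers passed to mgeneSim have to match the keytype used in godata(); mixing Entrez IDs with a SYMBOL-keyed object would leave the genes without annotation.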