Hello,
I have the following two text files containing gene symbols:
Text file one: Cd5l Mcm6 Wdhd1 Serpina4-ps1 Nop58 Ugt2b38 Prim1 Rrm1 Mcm2 Fgl1
Text file two: Serpina4-ps1 Trib3 Alas1 Tsku Tnfaip2 Fgl1 Nop58 Socs2 Ppargc1b Per1 Inhba Nrep Irf1 Map3k5 Osgin1 Ugt2b37 Yod1
I want to compute the Jaccard similarity between them in R. For this purpose I used the sets package:
library(sets)
md1 <- read.csv("T1.csv", sep = ",", header = FALSE)
M1 <- set(md1)
md2 <- read.csv("T2.csv", sep = ",", header = FALSE)
M2 <- set(md2)
Sim1 <- set_similarity(M1, M2, method = "Jaccard")
But it gives a Jaccard coefficient of 0 (meaning no similarity), even though I know there is some overlap between the two text files. I am not able to figure out what the problem is. Can anybody suggest a solution, or is there another way to compute the Jaccard coefficient between two text files of gene symbols?
Thanks,
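A likely cause, offered as a reading of the code above rather than a confirmed diagnosis: set(md1) builds a set whose single element is the whole data frame, so M1 and M2 each hold one distinct object and share nothing. Below is a minimal base R sketch that compares the gene symbols themselves. The names genes1, genes2 and jaccard are made up for illustration, and the symbols are typed in directly instead of being read from T1.csv and T2.csv.

```r
# Gene symbols copied from the two files in the question so the example is
# self-contained. With the real files, unlist(read.csv("T1.csv", header = FALSE))
# should give a comparable vector, assuming each file is a single
# comma-separated line of symbols (an assumption, not verified here).
genes1 <- c("Cd5l", "Mcm6", "Wdhd1", "Serpina4-ps1", "Nop58",
            "Ugt2b38", "Prim1", "Rrm1", "Mcm2", "Fgl1")
genes2 <- c("Serpina4-ps1", "Trib3", "Alas1", "Tsku", "Tnfaip2", "Fgl1",
            "Nop58", "Socs2", "Ppargc1b", "Per1", "Inhba", "Nrep", "Irf1",
            "Map3k5", "Osgin1", "Ugt2b37", "Yod1")

# Jaccard coefficient: size of the intersection divided by size of the union.
jaccard <- function(a, b) {
  # Coerce to character, strip stray whitespace, and drop duplicates so that
  # factors or padded CSV fields do not hide real matches.
  a <- unique(trimws(as.character(a)))
  b <- unique(trimws(as.character(b)))
  length(intersect(a, b)) / length(union(a, b))
}

shared <- intersect(genes1, genes2)  # symbols present in both lists
sim <- jaccard(genes1, genes2)

print(shared)
print(sim)
```

For these two lists the shared symbols are Serpina4-ps1, Nop58 and Fgl1, and the union has 24 symbols, so the coefficient is 3/24 = 0.125. If you prefer to stay with the sets package, converting the plain character vectors with as.set() before calling set_similarity() may be the closer fit, but I have not tested that here.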
This looks good, can we promote this to an answer?
Sure, no problem.
Hi Jean,
Thanks for the nice solution, it worked :)
Nitin