Hi there, in the ConsensusClusterPlus tutorial, the clusterAlg parameter is described as accepting a user-defined function:
clusterAlg: This option specifies the type of clustering algorithm to use: "hc" for hierarchical clustering, "pam" for partitioning around medoids, "km" for kmeans. Alternatively, one can supply their own clustering function, which should accept a distance matrix and a cluster number as its arguments and return a vector of cluster assignments having the same order as the distance matrix columns. For example, this simple function executes divisive clustering using the diana function from the cluster package and returns the expected object. The last line shows an example of how this could be used.
#library(cluster)
#dianaHook = function(this_dist,k){
#tmp = diana(this_dist,diss=TRUE)
#assignment = cutree(tmp,k)
#return(assignment)
#}
#ConsensusClusterPlus(d,clusterAlg="dianaHook",distance="pearson",...)
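The contract described in the tutorial text (take a distance object and k, return a vector of assignments in the same order as the samples) can be checked on its own before handing a function to ConsensusClusterPlus. The following is only a minimal sketch using base R: checkHook, toy and toy_dist are hypothetical names made up for illustration, and average-linkage hclust stands in for whatever algorithm the real hook would use.

```r
# Hypothetical hook for illustration: average-linkage hierarchical clustering.
# It follows the contract quoted above: (distance object, k) -> vector of assignments.
checkHook <- function(this_dist, k) {
  tree <- hclust(this_dist, method = "average")
  cutree(tree, k)  # named integer vector, one entry per sample
}

# Toy data (assumption): 10 genes in rows, 6 samples in columns.
set.seed(1)
toy <- matrix(rnorm(60), nrow = 10, ncol = 6,
              dimnames = list(paste0("gene", 1:10), paste0("sample", 1:6)))

# Pearson-based distance between samples (columns).
toy_dist <- as.dist(1 - cor(toy, method = "pearson"))

assignment <- checkHook(toy_dist, k = 2)

# Sanity checks a hook should pass before being used as clusterAlg.
stopifnot(is.vector(assignment),
          length(assignment) == ncol(toy),
          identical(names(assignment), colnames(toy)))
```

If these checks fail for a custom function, the problem is in the function itself rather than in the ConsensusClusterPlus call.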
So I am using ConsensusClusterPlus to get consensus clustering based on NMF (non-negative matrix factorization). In theory, I just need to produce the group number for each sample (a vector of cluster assignments in the same order as the distance matrix columns) and return it from the function passed to clusterAlg. Run on its own with k=3, the function body produces output in the expected form (see the commented output in the code), but when I run ConsensusClusterPlus() I get an error. By the way, it seems I cannot upload a demo dataset. The data (dat) used here is a matrix with genes as rows and samples as columns, and the entries are gene counts.
The R code is shown below:
library(NMF)
library(ConsensusClusterPlus)
NMF=function(affinity, k){
#affinity=dat;k=3
dimnames=list(rownames(affinity),colnames(affinity))
affinity=as.matrix(affinity,nrow=nrow(affinity),dimnames=dimnames)
lables=as.numeric(predict(nmf(affinity,rank=k)))
names(lables)=colnames(affinity)
return(labels)
#> lables
#TCGA-N5-A4RA TCGA-N5-A4RD TCGA-N5-A4RM TCGA-N5-A4RO TCGA-N5-A4RT TCGA-N5-A4RV TCGA-N5-A59E
#1 3 3 3 3 3 3
#TCGA-N7-A4Y5 TCGA-N7-A4Y8 TCGA-N8-A4PN TCGA-N8-A4PQ TCGA-N9-A4Q1 TCGA-N9-A4Q4 TCGA-NA-A4QW
#1 3 1 1 2 1 1
#TCGA-NA-A4QX TCGA-NA-A4R1 TCGA-ND-A4WF TCGA-NG-A4VU TCGA-N5-A4RF TCGA-N5-A4RU TCGA-N6-A4VD
#1 3 3 1 3 3 3
#TCGA-N6-A4VF TCGA-N7-A4Y0
#3 3
}
originalResult=ConsensusClusterPlus(
  as.matrix(dat), maxK=5, clusterAlg="NMF",
  distance="pearson",
  reps=500, pItem=0.8, pFeature=1,
  finalLinkage="average", corUse="everything",
  writeTable=FALSE, weightsItem=NULL,
  weightsFeature=NULL,
  verbose=FALSE)
And this is the error I got:
Error in names(clusterAssignments) <- sampleKey :
names() applied to a non-vector
So which step is wrong, and how can I get consensus NMF?
I would greatly appreciate it if anyone could give me a hint.
What about lables vs labels? Your function assigns the result to lables but returns labels.
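To see why that typo could lead to the reported error, here is a small sketch in base R. The function buggyHook is a hypothetical stand-in for the NMF function above, with the clustering step replaced by a dummy vector. Because no local variable called labels exists, R finds base::labels, which is a function, and returns that instead of the assignments.

```r
# Hypothetical stand-in for the posted function (dummy assignments, no NMF call).
buggyHook <- function(this_dist, k) {
  lables <- rep(1L, 3)  # result stored under the misspelled name
  return(labels)        # no local 'labels', so base::labels (a function) is returned
}

res <- buggyHook(NULL, 3)
is.function(res)  # the hook returned a function, not a vector

# Assigning names to a function fails; capture the message instead of stopping.
err <- tryCatch({
  names(res) <- c("s1", "s2", "s3")
  NULL
}, error = function(e) conditionMessage(e))
print(err)
```

Returning lables (or renaming the variable to labels consistently) makes the function return the vector that ConsensusClusterPlus expects.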
How careless of me. Thanks for pointing that out; it is running now. I will wait and see whether it gives a reasonable result.
Dear sugus,
Just out of curiosity: I saw your very interesting post and your implementation with NMF. Did you get any "biologically" interesting or robust results with it? I'm mainly asking because I also frequently use ConsensusClusterPlus for unsupervised class discovery in cancer, and your approach seems interesting.
Best,
Efstathios