Hi, I got the following R code from a previously published paper and produced a graph with it. How do I interpret the graph to determine the number of clusters?
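a <- read.table(file="Single_TPM.txt", header=TRUE)
all <- a
# Pearson correlation between samples (columns)
c <- cor(all, method="pearson")
# `hr` is not defined in the code as posted; it is assumed here to be a
# hierarchical clustering on 1 - correlation (the linkage method is a guess)
hr <- hclust(as.dist(1 - c), method="complete")
# To determine number of groups
distance_sum <- c()
for (k in 1:11){
  # cut the tree into k groups and list the samples in each group
  branch <- cutree(hr, k=k)
  group_ids <- split(names(branch), branch)
  all_avg_matrix <- all
  for (group.n in 1:length(group_ids)){
    group.idx <- which(colnames(all) %in% group_ids[[group.n]])
    # drop=FALSE keeps a single-sample group two-dimensional so rowMeans() works
    avg_exp <- rowMeans(all[, group.idx, drop=FALSE])
    all_avg_matrix[, group.idx] <- matrix(rep(avg_exp, length(group.idx)), ncol=length(group.idx), byrow=FALSE)
  }
  # total squared deviation of every sample from its group average, for this k
  distance_sum <- c(distance_sum, sum((all - all_avg_matrix)^2))
}
plot(1:length(distance_sum), distance_sum, type="l")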
If there is any other method that would suit this better, please let me know. I am using TPM values from RSEM output for clustering!
Thanks for reminding me! - ConsensusCluster is yet another method.
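For the plot produced by the code in the question, one common reading is the "elbow" heuristic: the curve is the total within-group sum of squares for each k, it can only go down as k grows, and the k where it stops dropping steeply is a candidate for the number of clusters. Here is a minimal sketch of that idea in R. It runs on simulated data rather than the real TPM file, the names expr, wss_for_k and elbow_k are made up for illustration, and the complete-linkage clustering on 1 - Pearson correlation is an assumption about how hr was built. Picking the largest second difference is just one simple way to locate the bend, not the method from the paper.

```r
# Toy data standing in for a genes x samples TPM matrix (hypothetical values):
# three groups of four samples each, with different mean expression per gene.
set.seed(1)
n_genes <- 200
centers <- cbind(rnorm(n_genes, 0, 3), rnorm(n_genes, 0, 3), rnorm(n_genes, 0, 3))
expr <- do.call(cbind, lapply(1:3, function(g) {
  matrix(rnorm(n_genes * 4, mean = centers[, g], sd = 1), nrow = n_genes)
}))
colnames(expr) <- paste0("cell", 1:12)

# Assumed clustering: complete linkage on 1 - Pearson correlation between samples
hr <- hclust(as.dist(1 - cor(expr, method = "pearson")), method = "complete")

# Within-group sum of squares when the tree is cut into k groups
wss_for_k <- function(k) {
  branch <- cutree(hr, k = k)
  sum(sapply(split(seq_along(branch), branch), function(idx) {
    sub <- expr[, idx, drop = FALSE]   # keep single-sample groups as a matrix
    sum((sub - rowMeans(sub))^2)       # squared deviation from the group average
  }))
}
distance_sum <- sapply(1:8, wss_for_k)

# Elbow heuristic: the k where the curve bends most sharply, taken here as
# the largest second difference. Entry i of second_diff belongs to k = i + 1.
second_diff <- diff(distance_sum, differences = 2)
elbow_k <- which.max(second_diff) + 1

plot(seq_along(distance_sum), distance_sum, type = "b",
     xlab = "k", ylab = "within-group sum of squares")
abline(v = elbow_k, lty = 2)
```

On real single-cell data the bend is often much less sharp than on toy data, so it is worth looking at the curve by eye and comparing the result with another approach (silhouette width, the gap statistic, or consensus clustering) before settling on a k.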