Calculate The Weighted Mean Of Matrix In R: Fastest Way
2
1
Entering edit mode
14.2 years ago
Sirus ▴ 820

Hello everybody, I want to create a simple R script that calculates the center of a cluster, I X={X1,X2,...,X2} is my data matrix and U={U_ik} is the partnership of element Xi in cluster K, I want to do the following in a fastest way:

alt text

so here each Xi is multiplied by a weight and summed up and then divided by the sum of weights , so it is weighted mean, but here our X is a vector. There in R the function weighted.mean(), but it needs only numerical values. I have written a script well it works for one but for a big number of X it is slow the code is

U : membership matrix.
X : Our data.
m : parameter
K : cluster number K

function(U,X,m,k)
{
  Nominator <- matrix(0,length(X[,1]),1)
  Denominator <-0

  for(i in 1:length(U[1,])) #We go trought the elements of the cluster k
  {
    if(U[k,i]!=0)
    {
      Nominator <- Nominator+ (U[k,i]^m)* X[,i]
      Denominator <- Denominator + (U[k,i]^m)
    }
  }
  Nominator/Denominator
}
r matrix statistics • 10k views
ADD COMMENT
0
Entering edit mode

btw, if all your U matrix elements were only 0 or 1, why do you take to the power of m then?

ADD REPLY
0
Entering edit mode

You are doing it wrong: just figured that your code doesn't do what your formula says. you have to take X^m not U^m !

ADD REPLY
0
Entering edit mode

Here X values may be 0 or 1 but U values are real values. In fact they represent the membership degree of element Xj in cluster Ci

ADD REPLY
5
Entering edit mode
14.2 years ago
Michael 55k

How about this (it's a 1-liner):

foo2 <- function(U,X,m,k)  X %*% U[k,]^m / sum(U[k,]^m)

plus make some sample data:

U <- matrix(sample(c(0,1), size=10*10, replace=TRUE), nrow=10, ncol=10)
X <- matrix(rnorm(100), nrow=10, ncol=10)

let's try it out:

foo1 <- your function
>all (foo2(U,X,2,1) == foo1(U,X,2,1))
[1] TRUE

It will be about x-times faster for large data

ADD COMMENT
0
Entering edit mode

Nice work converting C into R. I didn't know about all(), that's a sweet shortcut for confirming your solution.

ADD REPLY
0
Entering edit mode

Waw, it seems fast, I will try it now , thank you for you precious help. my data is big

ADD REPLY
0
Entering edit mode

I have tried your methode with this sample if I do all (foo2(U,X,2,1) == foo1(U,X,2,1)) it is TRUE but all (foo2(X,U,2,1) == foo1(X,U,2,1)) it is is FALSE I think it needs only a small change, I will try to figure it out. And really thank you for this help, I am happy to see the equivalent code in one line.

ADD REPLY
0
Entering edit mode

Of course the result will be different if you exchange the parameters ;) The vector to matrix multiplication %% is sensitive to the order of arguments such that : x %% A = A %% t(x) (with x a row vector and t() transposition function, if I my remember linear algebra classes rule of thumb row times column* ;)

ADD REPLY
0
Entering edit mode

Yeah, that's true but I mean that normally the results of foo2 and foo1 must the same what ever is the order of the parameters. foo2(X,U,m,k) should be equal to foo1(X,U,m,k) because normally we are applying the same formula so we should have the same results.

ADD REPLY
0
Entering edit mode

no, because R is not strongly typed. for example you can matrix multiply a vector x with a matrix M: x %% M if you exchange order M %% x then the result is different. So I can define a totally valid function given x is a vector and M a matrix, if that is inverted the result can be different

ADD REPLY
0
Entering edit mode
14.2 years ago
Sirus ▴ 820

Thank you very much Michael Dondrup,

I have figured out the solution, inspired from your code bellow it is

foo1<-function(U,X,m,k)(t(t(X) * U[k,]^m) %*% c(rep(1,length(U[k,]))))/sum(U[k,]^m)

For the time of execution, the first one that I have wrote is fast in case the U matrix has a lot of 0's.

ADD COMMENT
0
Entering edit mode

So you changed the formula you wish to compute? My impression was that your function as well as foo2 are consistent with the formula you show for center_k

ADD REPLY

Login before adding your answer.

Traffic: 3744 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6