Question

Test Whether The Variance In A Group Is Lower Than In Another

4

14.7 years ago

Giovanni M Dall'Olio 28k

I have two groups of data (not normally distributed), and I would like to test the hypothesis that the first group has a lower (or narrower) standard deviation than the other.

Put another way, I would like to tell whether the first group is less 'variable' or 'heterogeneous' than the second.

A Kruskal-Wallis test won't do, because it compares the medians of two or more groups, and I am not interested in that.

A Levene or Brown-Forsythe test compares the variances of the groups and tells whether they are equal. This is better, but I would also like to tell whether the variance in the first group is lower than in the other group(s).

A simple Chi-Square test would tell me whether the standard deviation of a group is equal to a certain value, and the one-tailed version can tell me whether it is higher/lower.
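For reference, the one-tailed chi-square test mentioned above could be sketched in R as follows. Base R has no built-in function for it as far as I know, so `chisq_var_test` is a made-up helper name, the data are simulated, and the test itself assumes normal data.

```r
# One-sample chi-square test for a variance (assumes normal data).
# H0: sigma^2 = sigma0^2   vs   H1: sigma^2 < sigma0^2 (or > sigma0^2)
# chisq_var_test is a hypothetical helper, not a base R function.
chisq_var_test <- function(x, sigma0, alternative = c("less", "greater")) {
  alternative <- match.arg(alternative)
  n <- length(x)
  stat <- (n - 1) * var(x) / sigma0^2   # chi-square with n - 1 df under H0
  p <- pchisq(stat, df = n - 1, lower.tail = (alternative == "less"))
  list(statistic = stat, df = n - 1, p.value = p)
}

set.seed(42)
x <- rnorm(50, mean = 0, sd = 1)   # simulated data, true sd = 1
res <- chisq_var_test(x, sigma0 = 2, alternative = "less")
res$p.value   # small values support "sd is lower than 2"
```

This only compares one group against a fixed value, which is why it does not answer the two-group question directly.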

An additional difficulty is that I would have to do this as a two-way test, because I have two grouping variables. Can you point me in some direction or give me a hint? I don't have many ideas on where to search :-)

statistics r • 15k views

ADD COMMENT • link updated 12 months ago by Ram 44k • written 14.7 years ago by Giovanni M Dall'Olio 28k

0

What is your non-normality assumption based on? Have you thought about transforming the data (with log transformation, for example) to be more normal?

ADD REPLY • link 14.7 years ago by Yuri ★ 1.7k

0

You might also want to ask that question on stats.stackexchange.com. It's populated by lots of true-blooded statisticians who eat this stuff for breakfast.

ADD REPLY • link 14.2 years ago by David Quigley 11k

0

Hi Giovanni, how did you end up solving this? I ran into a very similar problem.

ADD REPLY • link updated 12 months ago by Ram 44k • written 5.8 years ago by A. Domingues ★ 2.7k

Ram · Answer 1 · 2010-04-02

Is bootstrapping a possibility? Resample from your data, calculate the variance, repeat. This should leave you with a vector of bootstrapped variance estimates for each of your desired groups. Perform the appropriate test on those estimates (e.g., a t-test if you're comparing two groups and the estimates turn out to be normally distributed).

I think the boot package is the norm for resampling in R, but here's some untested code to clarify the idea:

set.seed(1)  # make the simulated example reproducible

# Simulated data: y has a slightly larger standard deviation than x
n <- 1000
x <- rnorm(n = n, mean = 0, sd = 1)
y <- rnorm(n = n, mean = 0, sd = 1.1)

# Bootstrap: resample each group with replacement and record the variance
nboots <- 10000
bootvar.x <- numeric(nboots)
bootvar.y <- numeric(nboots)

for (i in seq_len(nboots)) {
  bootvar.x[i] <- var(sample(x, size = n, replace = TRUE))
  bootvar.y[i] <- var(sample(y, size = n, replace = TRUE))
}

# Collect the estimates in one long data frame for plotting
library(ggplot2)
bootvars <- data.frame(
  var   = c(bootvar.x, bootvar.y),
  group = rep(c("x", "y"), each = nboots)
)

ggplot(bootvars, aes(x = var, colour = group)) + geom_density()

# Compare the two sets of bootstrapped variance estimates
t.test(bootvar.x, bootvar.y)

Disclaimer: I've read a bit about bootstrapping. Please don't assume I actually know anything. This is just a suggestion for something to check out.
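One caveat with a t-test on the bootstrap replicates is that its p-value depends on `nboots`, which is arbitrary. A possible variant of the same resampling idea, sketched here on simulated data and not meant as a definitive recipe, is to bootstrap the ratio of the two variances and read off a one-sided percentile bound:

```r
set.seed(1)
n <- 1000
x <- rnorm(n, mean = 0, sd = 1)     # simulated data, as in the example above
y <- rnorm(n, mean = 0, sd = 1.1)

nboots <- 2000
# Each replicate resamples both groups and records var(x*) / var(y*)
ratio <- replicate(nboots, {
  var(sample(x, replace = TRUE)) / var(sample(y, replace = TRUE))
})

# One-sided 95% percentile bound on the variance ratio:
# if it is below 1, the data support var(x) < var(y)
upper <- quantile(ratio, probs = 0.95, names = FALSE)
upper
```

The percentile interval is the simplest choice; the boot package offers other interval types through `boot.ci`.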

score 3 · Answer 2 · 2010-09-30

3

14.2 years ago

hurfdurf ▴ 490

If this data is really non-normal, should you be using variance or standard deviation at all?

You might want to use a more robust metric, such as the median absolute deviation.
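As a rough sketch of how that could look in R: `mad()` is in the stats package, and a one-sided permutation test on the difference in MAD is one way (my assumption, not something proposed in the thread) to turn it into a test. The data below are simulated.

```r
set.seed(7)
x <- rt(200, df = 3)          # simulated heavy-tailed data
y <- 2 * rt(200, df = 3)      # same shape, twice the scale

mad(x)                        # stats::mad, scaled to match the sd under normality
mad(y)

# Permutation test of H1: MAD(x) < MAD(y). Center each group on its own
# median first, so that shuffling the labels only mixes up the spread.
xc <- x - median(x)
yc <- y - median(y)
obs <- mad(xc) - mad(yc)
pooled <- c(xc, yc)
perm <- replicate(2000, {
  idx <- sample(length(pooled), length(xc))
  mad(pooled[idx]) - mad(pooled[-idx])
})

# Share of permutations at least as extreme as the observed difference
p.value <- (1 + sum(perm <= obs)) / (1 + length(perm))
p.value
```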

ADD COMMENT • link 14.2 years ago by hurfdurf ▴ 490

Ram · Answer 3 · 2010-03-25

2

14.7 years ago

Michael 55k

Look into the F-test or Bartlett's test. As your data is non-normal, you need something more robust against deviations from normality; Levene's test, for example, is mentioned as an alternative.
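For completeness, both tests are in base R (`var.test` and `bartlett.test`), and only the F-test has a one-sided form. A minimal sketch on simulated normal data, which is the setting both tests assume:

```r
set.seed(3)
x <- rnorm(100, sd = 1)     # simulated normal data
y <- rnorm(100, sd = 2)

# F-test, one-sided: H1 is var(x) / var(y) < 1
f <- var.test(x, y, alternative = "less")
f$p.value

# Bartlett's test is two-sided only: H0 is "all variances are equal"
b <- bartlett.test(list(x, y))
b$p.value
```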

ADD COMMENT • link updated 12 months ago by Ram 44k • written 14.7 years ago by Michael 55k

0

Thanks. I forgot to say that I also looked at Bartlett's test, but discarded it because it is sensitive to departures from normality and my data is not normal. Thanks anyway.

ADD REPLY • link 14.7 years ago by Giovanni M Dall'Olio 28k

0

Then the Brown-Forsythe test, maybe? Look at the section: "Comparison with Levene's test"

ADD REPLY • link updated 5.2 years ago by Ram 44k • written 14.7 years ago by Michael 55k

score 2 · Answer 4 · 2010-10-01

2

14.2 years ago

Alastair Kerr 5.3k

Another alternative:

Transform the data by subtracting the mean (or median) from each data point and take the absolute values.

Now check the normality of each sample again and use a t-test or KS test as appropriate.
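The two steps above might look like this in R. This is only a sketch on simulated skewed data, using the median as the center and one-sided versions of both tests:

```r
set.seed(11)
x <- rexp(150, rate = 2)      # simulated skewed data, sd = 0.5
y <- rexp(150, rate = 1)      # same shape, sd = 1

# Absolute deviations from each group's own median
dx <- abs(x - median(x))
dy <- abs(y - median(y))

# One-sided Welch t-test: H1 is "the mean deviation in x is less than in y"
tt <- t.test(dx, dy, alternative = "less")
tt$p.value

# One-sided KS test. ks.test uses the opposite convention: "greater" means
# the CDF of dx lies above that of dy, i.e. dx tends to be smaller.
ks <- ks.test(dx, dy, alternative = "greater")
ks$p.value
```

A t-test on these deviations is essentially a two-group Brown-Forsythe test, with the advantage that it can be made one-sided.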

ADD COMMENT • link 14.2 years ago by Alastair Kerr 5.3k

score 1 · Answer 5 · 2010-03-25

You can first try a Friedman test for each factor (assuming they're independent) and, if there really is some difference, proceed to appropriate multiple hypothesis testing, using the Bonferroni method for example. This is not sequential hypothesis testing like we usually do with microarray data. You'll need to specify all the concurrent hypotheses (variance =, <, >) and the significance/power levels.

I don't know much about your experimental/test design. Could you provide additional details?
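To make the Bonferroni step concrete, here is a rough sketch on simulated data. It does not use the Friedman test: it swaps in one-sided t-tests on absolute deviations from the cell medians, run once per level of the second factor. The column names `group` and `batch` are made up for illustration.

```r
set.seed(5)
# Simulated two-factor layout: 'group' is the comparison of interest,
# 'batch' is the second grouping variable (both names are hypothetical).
d <- expand.grid(rep = 1:100, group = c("a", "b"), batch = c("b1", "b2", "b3"))
d$value <- rnorm(nrow(d), sd = ifelse(d$group == "a", 1, 2))

# Absolute deviation from the median of each group x batch cell
d$absdev <- abs(d$value - ave(d$value, d$group, d$batch, FUN = median))

# One one-sided test per batch: H1 is "the spread in a is less than in b"
p.raw <- sapply(split(d, d$batch), function(s) {
  t.test(s$absdev[s$group == "a"], s$absdev[s$group == "b"],
         alternative = "less")$p.value
})

# Bonferroni correction over the three tests
p.adj <- p.adjust(p.raw, method = "bonferroni")
p.adj
```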