Z-score difference between Z group means
21 months ago
schulpen_91 ▴ 30

Hello all, a question.

In a tutorial on differential expression analysis, the following was done: normalized data was converted to Z-scores and split into two biological groups. A t-test was performed on the two groups (which is fine). Now the point of confusion: the means of the two groups were calculated separately, followed by "Z-difference = mean1 - mean2".

It was said this Z-difference signified biological difference and made the overall statistics more robust.

I understand that a Z-difference in the range -1 < Z-diff < 1 could indicate that the two groups are closely related (i.e., low variation between groups). However, I cannot find any online references about its specific use.

What is (in your opinion) the value of this Z-difference and how would it increase robustness?

This is the tutorial video in question. The Z-thingy is performed around the beginning. https://www.youtube.com/watch?v=JwiFoUWQUIg&list=PLDN1R5gNkbQw4JM3DOn9TzKAcpQz-Z9ym&index=4

Statistics differential-expression • 1.1k views

Because I don't have access to the raw data files, DESeq2 and similar tools did not seem very useful. I know, I know, working in Excel is frowned upon here, and it's not my preference either, but hey :]

21 months ago
LChart 4.6k

The approach you describe does nothing except apply a scale factor. Given

u = mean(rawdata)
s = sd(rawdata) 
Z = (rawdata - u)/s

Then, clearly

Z1 := mean(Z[group1]) = mean((rawdata[group1] - u)/s) = (mean(rawdata[group1]) - u)/s = R1/s - u/s

where R1 := mean(rawdata[group1]), and likewise Z2 and R2 for group 2. So

Z1 - Z2 = (R1/s - u/s) - (R2/s - u/s) = (R1 - R2)/s

Therefore, all you've done is scale the between-group difference by the standard deviation of the whole sample (both groups combined). You haven't made anything "more robust", as group test statistics, including the t-test, are invariant under scaling and shifting the underlying data.*
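If it helps to see this numerically, here is a minimal sketch in Python using only the standard library. The data are simulated (made-up means, spread, and group sizes, not the tutorial's data), and `t_statistic` is a hypothetical helper implementing the ordinary pooled-variance two-sample t statistic. It checks that the Z-difference equals the raw mean difference divided by s, and that the t statistic is identical before and after the Z-transform.

```python
import random
import statistics as st


def t_statistic(a, b):
    """Two-sample t statistic with pooled variance (Student's t-test)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * st.variance(a) + (nb - 1) * st.variance(b)) / (na + nb - 2)
    return (st.mean(a) - st.mean(b)) / (pooled_var * (1 / na + 1 / nb)) ** 0.5


# Simulated data: the means, spread and group sizes are arbitrary assumptions.
random.seed(0)
group1 = [random.gauss(10.0, 2.0) for _ in range(8)]
group2 = [random.gauss(12.0, 2.0) for _ in range(8)]
rawdata = group1 + group2

# Z-transform using the mean and sample SD of the whole data set.
u = st.mean(rawdata)
s = st.stdev(rawdata)
z1 = [(x - u) / s for x in group1]
z2 = [(x - u) / s for x in group2]

# The Z-difference versus the raw mean difference rescaled by s.
z_diff = st.mean(z1) - st.mean(z2)
scaled_raw_diff = (st.mean(group1) - st.mean(group2)) / s

# The t statistic on the raw data versus on the Z-scores.
t_raw = t_statistic(group1, group2)
t_z = t_statistic(z1, z2)

print("Z-difference:      ", z_diff)
print("(R1 - R2) / s:     ", scaled_raw_diff)
print("t on raw data:     ", t_raw)
print("t on Z-scores:     ", t_z)
```

The two pairs of numbers agree up to floating-point rounding, so the Z-transform adds no information to the test itself.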

However, differences scaled in this way are expressed in units of the sample standard deviation, which can be useful for comparing differences between features that are originally on disparate scales (such as height, weight, gene expression, pH levels, etc.) and are otherwise incommensurate.
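As a rough illustration of that last point, the sketch below computes the same scaled difference for two features on very different scales. All values are invented for illustration, and `standardized_difference` is a hypothetical helper name. It also shows that the result does not depend on the unit of measurement (height in cm versus m).

```python
import statistics as st


def standardized_difference(group1, group2):
    """(mean(group1) - mean(group2)) / sd(all samples), i.e. the Z-difference."""
    s = st.stdev(group1 + group2)
    return (st.mean(group1) - st.mean(group2)) / s


# Invented example values for two features on unrelated scales.
height_cm_a = [170.0, 175.0, 168.0, 172.0]
height_cm_b = [180.0, 178.0, 183.0, 179.0]
ph_a = [7.1, 7.3, 7.2, 7.0]
ph_b = [7.2, 7.4, 7.1, 7.3]

# The same heights expressed in metres instead of centimetres.
height_m_a = [x / 100 for x in height_cm_a]
height_m_b = [x / 100 for x in height_cm_b]

d_cm = standardized_difference(height_cm_a, height_cm_b)
d_m = standardized_difference(height_m_a, height_m_b)
d_ph = standardized_difference(ph_a, ph_b)

print("height (cm):", d_cm)
print("height (m): ", d_m)   # same as in cm: the unit cancels out
print("pH:         ", d_ph)  # directly comparable with the height value
```

The raw differences (several centimetres versus a tenth of a pH unit) cannot be compared directly, whereas the scaled values can be read side by side as "how many standard deviations apart are the groups".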

* With -regularized- statistics as a caveat
