I'm interested in checking the association of a gene with some clinical parameters. For that, I'm classifying the samples into high and low groups based on the z-score values of the gene GABRD. I have the FPKM data and calculated z-scores from it.
I took a cutoff of Z = 1 (a very relaxed threshold), so samples with z-score >= 1 are classified as GABRD high. But I don't see any samples with z-score <= -1 to classify as GABRD low.
Is it OK if I take z-score >= 1 as high and z-score <= 1 as low?
Thank you.
I think something went wrong with the analysis. Could you post a plot of your data? In R you can make one with plot(density(data)); you may remove the names and all the IDs. Having no samples with z-score < -1 is very suspicious. Most probably you should not use z-scores at all: a z-score cutoff is only meaningful if your variable has a roughly bell-shaped distribution (see the answer linked below). Your distribution is likely right-skewed (https://www.statisticshowto.datasciencecentral.com/probability-and-statistics/skewed-distribution/ ).
Here is the short answer (or rather, the lack of a definitive one) on why you should not use z-scores with this kind of data:
https://stats.stackexchange.com/questions/32357/can-i-use-a-z-score-with-skewed-and-non-normal-data
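
To illustrate why a right-skewed variable can have no z-scores below -1, here is a minimal R sketch. It does not use your data: the vector fpkm is a hypothetical stand-in built from log-normal quantiles, and the names fpkm, z, n_high, n_low and z_floor are assumptions made for this example only. The point is that a non-negative variable cannot have a z-score below -mean/sd, so whenever the standard deviation exceeds the mean, no sample can reach -1.

```r
# Hypothetical stand-in for one gene's FPKM values (NOT real GABRD data):
# 200 evenly spaced quantiles of a log-normal distribution, which is
# strongly right-skewed and strictly positive.
fpkm <- qlnorm(ppoints(200), meanlog = 0, sdlog = 1.5)

# Z-score: centre on the mean and scale by the standard deviation.
z <- (fpkm - mean(fpkm)) / sd(fpkm)

# Number of samples passing each cutoff.
n_high <- sum(z >= 1)
n_low  <- sum(z <= -1)

# Because fpkm cannot go below 0, no z-score can go below -mean/sd.
z_floor <- -mean(fpkm) / sd(fpkm)

cat("samples with z >= 1: ", n_high, "\n")
cat("samples with z <= -1:", n_low, "\n")
cat("lowest possible z:   ", z_floor, "\n")

# The density plot suggested above; a long right tail is the warning sign.
plot(density(fpkm), main = "Simulated right-skewed expression values")
```

In this simulated vector the standard deviation is larger than the mean, so the lowest possible z-score is above -1 and the "low" group stays empty while the "high" group does not. If your own density plot has a similar shape, that would explain what you are seeing.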