5.2 years ago
star
I have a big table whose rows are genomic coordinates and whose columns are genomic features (example below). I would like to separate rows and columns based on their variability. I have tried some basic statistics (code below), but I would like to know whether this is the right approach or whether there is an alternative statistical method that would be more accurate.
DF:
        Feature_A  Feature_B  Feature_C  Feature_D
cord_1        0.9          1        0.8          1
cord_2        0.6        0.1        0.9        0.5
cord_3          0          0          0          0
cord_4        0.1          0          0        0.2
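Code:
library(matrixStats)  # provides rowVars, rowSds, rowIQRs
# rowSkewness is not in matrixStats; it is assumed here to come from fBasics
M <- as.matrix(DF)    # compute every statistic on the original feature columns only
DF$skew <- rowSkewness(M)
DF$var <- rowVars(M)
DF$sd <- rowSds(M)
DF$IQR <- rowIQRs(M)
DF$mean <- rowMeans(M)
DF$coef.var <- DF$sd / DF$mean  # NaN for all-zero rows such as cord_3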
I would like to keep cord_2 (the more variable row) and ignore cord_1, cord_3 and cord_4 in my output. Given that, which statistic is the best choice?
Use IQR!
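To illustrate the suggestion, here is a minimal sketch that computes the per-row IQR on the example table from the question and keeps the rows above a cutoff. It uses base R only; the names `M`, `row_iqr`, `iqr_cutoff` and `variable_rows` and the cutoff value of 0.2 are assumptions made for this illustration, not something given in the thread.

```r
# Example data copied from the question
M <- matrix(
  c(0.9, 1.0, 0.8, 1.0,
    0.6, 0.1, 0.9, 0.5,
    0.0, 0.0, 0.0, 0.0,
    0.1, 0.0, 0.0, 0.2),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("cord_", 1:4),
                  paste0("Feature_", c("A", "B", "C", "D")))
)

# Per-row IQR with base R; matrixStats::rowIQRs(M) should give the same values faster
row_iqr <- apply(M, 1, IQR)

# Hypothetical cutoff, chosen by eye for this toy table; tune it on real data
iqr_cutoff <- 0.2
variable_rows <- names(row_iqr)[row_iqr > iqr_cutoff]
print(variable_rows)
```

On this toy table only cord_2 passes the cutoff (IQR 0.275, against 0.125 for cord_1 and cord_4 and 0 for cord_3). The coefficient of variation would instead rank cord_4 above cord_2, because cord_4 has a mean close to zero, which is one reason to prefer IQR here. The same call with `apply(M, 2, IQR)` gives the per-column values.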