How to preserve the order of dataframe variables in R?
2.2 years ago
KABILAN ▴ 130

I have a dataset like below,

structure(list(PCV = c(0.0178219194071478, 0.0167224679086922, 0.0313796054695457, 0.0272633405874291, 0.00992979365423812, 0.0163545593623028, 0.0125615766079409, 0.0438832908556275, 0.0260965005930162, 0.034959332834335, 0.00651124339985815, 0.00773667420172548, 0.00460174240773309, 0.00940417833578374, 0.00763277410224326, 0.0569674690437892, 0.00554001729154236, 0.0102426634114334, 0.0191710901533892, 0.0127379038986653, 0.00859900586552533, 0.00630188507834846, 0.000184250143156493, 0.00651494443035729, 0.00477417309479366, 0.0298096494477779, 0.0235443699348768, 0.00846982190170002, 0.0197493082323879, 0.00885420900157687, 0.00771739026182587, 0.0227915291110601, 0.000326021119179784, 0.00347808426299245, 0.00244844394159794, 0.0221243684669031, 0.00853034943193308, 0.0117734523728633, 0.00438879865028313, 0.00162737834039006, 0.00102263562640706, 0.00256966419093599, 0.00819905987547494, 0.00356380381933028, 0.00459378907571579, 0.0123769394422116, 0.0162725362822941, 0.00770364870061668, 0.0184835516883016, 0.00798092837759707, 0.00574272817857334, 0.00483107847770393, 0.0017089616030636, 0.00334660568350707, 0.0114543838108249, 0.00288212452973156, 0.00448938651825993, 0.00593444755414696, 0.0103782620446864, 0.00424463992722479, 0.0161764747677885, 0.0105032486560586, 0.061974812175287, 0.00528277075107687, 0.000766055202087631, 0.0198394482053174, 0.00734319673771724, 0.00571223067545781, 0.0061683142070276, 0.00170204019314863, 0.00484076438978875, 0.00222693661639841, 0.0204057550556842, 0.00494096746578935, 0.00642331357982557, 0.000845046692055484, 0.0234690091797697, 0.00520249711980663, 0.0141779818674367, 0.0946105742913523, 0.00496222530713291, 0.066585835547389, 0.000763194722436555, 0.0588866152937399, 0.00300507357098326, 0.0662912715588685, 0.00358567303889042, 0.0017549310798091, 0.0222871772118731, 0.00708496651557248), Type = c("knn_vsn", "knn_vsn", "knn_vsn", "knn_vsn", "knn_vsn", "knn_vsn", "knn_vsn", "knn_vsn", "knn_vsn", "knn_vsn", 
"knn_loess", "knn_loess", "knn_loess", "knn_loess", "knn_loess", "knn_loess", "knn_loess", "knn_loess", "knn_loess", "knn_loess", "knn_rlr", "knn_rlr", "knn_rlr", "knn_rlr", "knn_rlr", "knn_rlr", "knn_rlr", "knn_rlr", "knn_rlr", "knn_rlr", "lls_vsn", "lls_vsn", "lls_vsn", "lls_vsn", "lls_vsn", "lls_vsn", "lls_vsn", "lls_vsn", "lls_vsn", "lls_vsn", "lls_loess", "lls_loess", "lls_loess", "lls_loess", "lls_loess", "lls_loess", "lls_loess", "lls_loess", "lls_loess", "lls_loess", "lls_rlr", "lls_rlr", "lls_rlr", "lls_rlr", "lls_rlr", "lls_rlr", "lls_rlr", "lls_rlr", "lls_rlr", "lls_rlr", "svd_vsn", "svd_vsn", "svd_vsn", "svd_vsn", "svd_vsn", "svd_vsn", "svd_vsn", "svd_vsn", "svd_vsn", "svd_vsn", "svd_loess", "svd_loess", "svd_loess", "svd_loess", "svd_loess", "svd_loess", "svd_loess", "svd_loess", "svd_loess", "svd_loess", "svd_rlr", "svd_rlr", "svd_rlr", "svd_rlr", "svd_rlr", "svd_rlr", "svd_rlr", "svd_rlr", "svd_rlr", "svd_rlr")), class = c("tbl_df", "tbl", "data.frame" ), row.names = c(NA, -90L))

I need to compute the mean of PCV for each group in Type. I tried two different approaches:

data %>% group_by(Type) %>% summarise(mean_run = mean(PCV))

and

result <- stats::aggregate(data$PCV, list(data$Type), mean)

Both give the same result, but the groups come back sorted alphabetically instead of in their original order:

structure(list(Type = c("knn_loess", "knn_rlr", "knn_vsn", "lls_loess", "lls_rlr", "lls_vsn", "svd_loess", "svd_rlr", "svd_vsn"), mean_run = c(0.0140545756246163, 0.0116801617130501, 0.0236972387280275, 0.00827665570788852, 0.00550126183277225, 0.00852058159590288, 0.0177142846257907, 0.0235206963846695, 0.0135468591570967)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -9L))

I need the groups in the order in which they appear in the data, like this:

structure(list(Type = c("knn_vsn", "knn_loess", "knn_rlr", "lls_vsn", "lls_loess", "lls_rlr", "svd_vsn", "svd_loess", "svd_rlr"), mean_run = c(0.0236972387280275, 0.0140545756246163, 0.0116801617130501, 0.00852058159590288, 0.00827665570788852, 0.00550126183277225, 0.0135468591570967, 0.0177142846257907, 0.0235206963846695)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -9L))

Kindly suggest some useful code for correcting this issue. Thank you in advance.

R data_variables_order data-frame • 649 views
2.2 years ago
Basti ★ 2.0k

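You could convert your Type variable to a factor and set the levels in the order you need before summarising. Both group_by() and aggregate() order their output by factor levels rather than alphabetically: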

data$Type <- factor(data$Type, levels = c("knn_vsn", "knn_loess", "knn_rlr", "lls_vsn", "lls_loess", "lls_rlr", "svd_vsn", "svd_loess", "svd_rlr"))
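
If typing the levels by hand is inconvenient, one possible shortcut is to take them from unique(), which returns values in order of first appearance. This only helps when the rows already appear in the order you want, as they do in the question's data. The sketch below uses a small made-up data frame (not the real PCV values) and base R only:

```r
# Toy data standing in for the question's `data`; the PCV values are invented
data <- data.frame(
  PCV  = c(0.25, 0.75, 0.5, 1.0, 0.125, 0.375),
  Type = c("knn_vsn", "knn_vsn", "knn_loess", "knn_loess", "knn_rlr", "knn_rlr")
)

# unique() keeps order of first appearance, so the factor levels follow
# the original row order instead of alphabetical order
data$Type <- factor(data$Type, levels = unique(data$Type))

# aggregate() orders its output rows by the factor levels
result <- stats::aggregate(PCV ~ Type, data = data, FUN = mean)
print(result)
```

The same factor column should also work with the dplyr version from the question, since group_by(Type) followed by summarise() orders the groups by factor level.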