Hi everyone,
I have a dataset and want to add a column that labels each row according to the range its value falls in.
My df is essentially a column of ages (Age), and I want to add an Age Interval column. My code works when the ages are above 0, but I'm not sure how to do this with negative values (pcw). This is what I'm currently doing:
df$AgeInterval[df$Age >= -0.701 & df$Age <= -0.62] = "4-7pcw"
df$AgeInterval[df$Age >= -0.621 & df$Age <= -0.58] = "8-9pcw"
df$AgeInterval[df$Age >= -0.57 & df$Age <= -0.53] = "10-12pcw"
But there has to be a simpler way to do this?
For values over 0 I would normally use the following, which gets the job done:
library(dplyr)
library(magrittr)  # provides the %<>% assignment pipe
# Bin ages with cut(), then turn labels such as "(9,19]" into "10-19yrs"
df %<>% mutate(age_interval = as.character(cut(Age, seq(-1, 100, by = 10)))) %>%
  mutate(AgeInterval = sapply(age_interval, function(i) {
    paste0(
      as.numeric(gsub("^\\(([-0-9]+),.+", "\\1", i)) + 1,
      "-",
      as.numeric(gsub(".+,([0-9]+)\\]$", "\\1", i)), "yrs"
    )})) %>%
  dplyr::select(-age_interval)
How many categories do you have?
13 in total
You could use case_when and between within mutate.
Strongly consider reading the docs or working through the vignettes for data.table.
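For reference, here is a minimal sketch of what the case_when and between suggestion could look like. It assumes dplyr is installed and reuses the three ranges from the question. The sample Age values are made up for illustration, and the remaining categories would need their own lines. This is one possible reading of the suggestion, not the original poster's code.

```r
library(dplyr)

# Hypothetical sample data; these Age values are invented for illustration.
df <- data.frame(Age = c(-0.70, -0.65, -0.60, -0.55, 5))

df <- df %>%
  mutate(AgeInterval = case_when(
    # between() is inclusive on both ends, like >= and <= in the question
    between(Age, -0.701, -0.62) ~ "4-7pcw",
    between(Age, -0.621, -0.58) ~ "8-9pcw",
    between(Age, -0.57, -0.53)  ~ "10-12pcw",
    # anything outside the listed ranges stays NA
    TRUE ~ NA_character_
  ))

print(df)
```

case_when takes the first condition that matches, so where two ranges overlap (for example -0.621 to -0.62 above) the earlier label wins. The sequential assignments in the question behave the other way round, because the later assignment overwrites the earlier one. If the ranges are contiguous, cut() with explicit breaks and labels arguments may be shorter for 13 categories.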