How to select columns from a data frame whose names end with a specific word
3.5 years ago

Hi guys,

I need to create a variable a whose elements are column names of a data frame.

For example, I have the following column names of a df:

colnames(df)
"PS_01", "PS_01_mod2", "PS_02", "PS_02_mod2"

I want to create a vector a whose elements are the column names of df ending with mod2, so that I end up with:

a <- c("PS_01_mod2", "PS_02_mod2")

How can I do this in a simple way?

Hi, a simple grep() command will do this for you, and we can add a regular expression ('regex') to ensure positional specificity:

vector <- c('PS_01', 'PS_01_mod2', 'PS_02', 'PS_02_mod2', 'mod2_mod1')
idx <- grep('mod2$', vector)
vector[idx]
[1] "PS_01_mod2" "PS_02_mod2"

The dollar sign, $, anchors the pattern to the end of the string, so only elements that end in 'mod2' are matched.

Note the difference here, without the dollar:

idx <- grep('mod2', vector)
vector[idx]
[1] "PS_01_mod2" "PS_02_mod2" "mod2_mod1"
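
To see the effect of the anchor element by element, here is a minimal sketch using grepl(), which returns a logical vector instead of indices. The names ends_with_mod2, contains_mod2 and comparison are only illustrative and are not part of the original answer.

```r
# Same example vector as above
vector <- c('PS_01', 'PS_01_mod2', 'PS_02', 'PS_02_mod2', 'mod2_mod1')

# Anchored pattern: TRUE only where 'mod2' is the last thing in the string
ends_with_mod2 <- grepl('mod2$', vector)

# Unanchored pattern: TRUE wherever 'mod2' appears anywhere in the string
contains_mod2 <- grepl('mod2', vector)

# Side-by-side view of both results (illustrative helper table)
comparison <- data.frame(vector, ends_with_mod2, contains_mod2)
print(comparison)
```

Only 'mod2_mod1' differs between the two columns, since it contains 'mod2' but does not end with it.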

Kevin

Or return the matching values instead of their indices:

grep("mod2$", vector, value = TRUE)
# [1] "PS_01_mod2" "PS_02_mod2"

Look into grep. It is a very useful function in R, and it exists as a UNIX command as well.

a <- grep("_mod2$", colnames(df), value=TRUE)

The dollar sign indicates that the pattern must occur at the end of the name.
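
Since the title asks about selecting the columns themselves, here is a rough sketch of how the vector a could then be used to subset the data frame. The contents of df are made up for illustration (only the column names come from the question), and df_mod2 is a hypothetical name.

```r
# Hypothetical data frame; only the column names are taken from the question
df <- data.frame(PS_01 = 1:3, PS_01_mod2 = 4:6,
                 PS_02 = 7:9, PS_02_mod2 = 10:12)

# Column names ending in "_mod2"
a <- grep("_mod2$", colnames(df), value = TRUE)

# Keep only those columns; drop = FALSE keeps a data frame
# even if a single column matches
df_mod2 <- df[, a, drop = FALSE]
print(colnames(df_mod2))
```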

zx8754 12k

Avoiding regex, using a dedicated function, endsWith:

vector[ endsWith(vector, "mod2") ]
# [1] "PS_01_mod2" "PS_02_mod2"
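
One practical difference, sketched below: endsWith() (in base R since version 3.3.0) compares the suffix literally, whereas grep()/grepl() interpret it as a regular expression, so characters such as "." need escaping there. The column names in cols are invented for this illustration.

```r
# Hypothetical names where the suffix contains a regex metacharacter
cols <- c("PS_01.v2", "PS_01_v2", "PS_02.v2")

# Literal suffix comparison: only names really ending in ".v2"
literal_match <- endsWith(cols, ".v2")

# As a regex, "." matches any character, so "PS_01_v2" matches too
regex_unescaped <- grepl(".v2$", cols)

# Escaping the dot makes the regex agree with endsWith()
regex_escaped <- grepl("\\.v2$", cols)

print(cols[literal_match])
```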