Question

Using variables in data frame names in R

0

6.4 years ago

Famf ▴ 30

I have several text files (animal_grp1.txt, animal_grp2.txt, ..., animal_grp50.txt) that I want to import and modify in R using a for loop.

I usually do this:

lote1 <- read.delim("animal_grp1.txt", header = T, sep = "\t")
lote1 <- lote1[c(2,5,7)]
names(lote1) <- c("ID", "race", "age")

but it is not efficient to repeat this code 50 times, so I was thinking that a loop would be more efficient.

Something like this:

animals <-c(1,2,3,4,5)
for(i in animal) {
 lote[i] <- read.delim("animal_grp[i].txt", header = T, sep = "\t")
 lote[i] <- lote[i][c(2,5,7)] 
 names(lote[i]) <- c("ID", "race", "age") 
}

It seems it is not as simple as I thought. Any help?

Thanks

R loop • 8.5k views

updated 6.4 years ago by russhh 5.8k • written 6.4 years ago by Famf ▴ 30

0

Sorry, I didn't see that you wanted to produce lote1, lote2, ... each as separate variables. Doing that will cripple your work at a later stage: if it's inefficient to write 50 different lines of code to read in the data, it'll be just as inefficient to write 50 different lines to analyse the imported data. If you've got a large number of homogeneously structured files and you're not up to speed with map/lapply in R, you'd be better off constructing a list:

lote <- vector(mode = "list", length = length(animals))

In this list, lote[[5]] returns the value stored in the 5th element, and lote[[7]] <- some_value stores some_value in the 7th element.

So you could modify your code:

for (i in animals) {
  file_name <- paste0("animal_grp", i, ".txt")
  df <- read.delim(file_name, header = TRUE, sep = "\t")[c(2,5,7)]
  names(df) <- c("ID", "race", "age")
  lote[[i]] <- df
}
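
Once the data frames are in a list, a later analysis step can be applied to all of them at once. A minimal sketch, assuming three small made-up files written to a temporary directory (the file contents and the extra `group` column are hypothetical, not from the original post):

```r
# Hypothetical demo data: three tab-separated files with 7 columns each
tmp <- tempfile("animals_")
dir.create(tmp)
animals <- 1:3
for (i in animals) {
  demo <- as.data.frame(matrix(i * 10 + 1:14, nrow = 2))   # 2 rows, 7 columns
  write.table(demo, file.path(tmp, paste0("animal_grp", i, ".txt")),
              sep = "\t", quote = FALSE, row.names = FALSE)
}

# Read every file into one list element
lote <- vector(mode = "list", length = length(animals))
for (i in animals) {
  file_name <- file.path(tmp, paste0("animal_grp", i, ".txt"))
  df <- read.delim(file_name, header = TRUE, sep = "\t")[c(2, 5, 7)]
  names(df) <- c("ID", "race", "age")
  lote[[i]] <- df
}

# Apply the same step to every data frame without writing it 50 times
n_rows <- sapply(lote, nrow)

# Or stack everything into one data frame, recording the source group
for (i in animals) lote[[i]]$group <- i
all_animals <- do.call(rbind, lote)
```

Stacking with do.call(rbind, ...) only works here because every data frame was given the same column names.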

6.4 years ago by russhh 5.8k

0

6.4 years ago

russhh 5.8k

Install tidyverse, then map over your filenames:

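library(tidyverse)
# Match animal_grp<digits>.txt; the dot is escaped so it matches literally
files <- dir(pattern = "^animal_grp[[:digit:]]+\\.txt$")

as.list(files) %>%
  purrr::map(read_tsv, col_types = cols()) %>%   # read each file
  purrr::map(~ .x[c(2, 5, 7)]) %>%               # keep columns 2, 5 and 7
  purrr::map(magrittr::set_colnames, c("ID", "race", "age"))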

Sorry if this doesn't work the first time; I can't debug it at the moment.
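
For readers without the tidyverse, roughly the same pipeline can be sketched in base R with lapply. This is an illustration under assumptions rather than the answer's own code: the helper name `read_animals` and its `path` argument are hypothetical.

```r
# Hypothetical helper: read every animal_grp<N>.txt in `path` into a named list
read_animals <- function(path = ".") {
  files <- list.files(path, pattern = "^animal_grp[[:digit:]]+\\.txt$",
                      full.names = TRUE)
  out <- lapply(files, function(f) {
    df <- read.delim(f, header = TRUE, sep = "\t")[c(2, 5, 7)]
    names(df) <- c("ID", "race", "age")
    df
  })
  # Name each element after its file, e.g. "animal_grp1"
  names(out) <- sub("\\.txt$", "", basename(files))
  out
}
```

Calling `lote <- read_animals()` would then allow lookups such as `lote[["animal_grp1"]]`. Note that list.files returns names in alphabetical order, so animal_grp10 sorts before animal_grp2; naming the list elements keeps that from causing mix-ups.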

6.4 years ago by russhh 5.8k

score 4 · Accepted Answer · 2018-09-25

4

6.4 years ago

Benn 8.4k

You are almost there. You need paste0() to build the file name from i, and assign() to create a separately named data frame for each file.

animals <- c(1, 2, 3, 4, 5)

for (i in animals) {
  fileName <- paste0("animal_grp", i, ".txt")   # e.g. "animal_grp1.txt"
  lote <- read.delim(fileName, header = TRUE, sep = "\t")
  lote <- lote[c(2, 5, 7)]                      # keep columns 2, 5 and 7
  names(lote) <- c("ID", "race", "age")
  assign(paste0("lote", i), lote)               # creates lote1, lote2, ...
}
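
If the separately named data frames are needed in bulk later on, they can be looked up by name with get() or gathered into a list with mget(). A small sketch, using made-up data frames in place of the imported files:

```r
animals <- c(1, 2, 3)

# Stand-ins for the imported files (hypothetical data)
for (i in animals) {
  lote <- data.frame(ID = i, race = "x", age = i + 1)
  assign(paste0("lote", i), lote)
}

# Look up one data frame by a name built from a variable
second <- get(paste0("lote", 2))

# Collect all of them into a named list
all_lotes <- mget(paste0("lote", animals))
names(all_lotes)
```

This is the mirror image of assign(): the name is built as a string and resolved at run time.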

6.4 years ago by Benn 8.4k

0

Thanks for your answer; it works with your suggestion. The only thing is that in addition to lote1, lote2 and lote3, I am also getting an extra data frame, lote, without a number.

6.4 years ago by Famf ▴ 30

0

I think you need to learn more about R and programming in general. A for loop in R does not create its own scope, so the lote variable assigned inside your loop still exists in your session after the loop finishes.
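
To illustrate the point about scope, here is a minimal demonstration with hypothetical variable names: anything assigned inside a for loop, including the loop index, is still defined once the loop has finished.

```r
# Start clean so the checks below are meaningful
if (exists("tmp_value")) rm(tmp_value)

for (k in 1:3) {
  tmp_value <- k * 2   # assigned inside the loop body
}

# Both the loop index and the temporary are still defined afterwards
exists("k")
exists("tmp_value")
tmp_value              # holds the value from the last iteration
```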

6.4 years ago by Nicolas Rosewick 11k

0

You'll also find i and fileName; just like lote, these are temporary objects only used in the loop. If that bothers you, you can add the following line after the loop, or as the last line inside the loop (after assign(...), but before the closing }).

rm(lote, i, fileName)
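
An alternative to rm() is to keep the temporaries inside a function, whose local variables disappear when it returns. This is only a sketch, not part of the answer above: the function name `import_lote` and its `path` argument are hypothetical, and the demo files are made up.

```r
# Hypothetical helper: create lote1, lote2, ... in the caller's environment
# without leaving i, fileName or lote behind
import_lote <- function(animals, path = ".") {
  caller <- parent.frame()
  for (i in animals) {
    fileName <- file.path(path, paste0("animal_grp", i, ".txt"))
    lote <- read.delim(fileName, header = TRUE, sep = "\t")[c(2, 5, 7)]
    names(lote) <- c("ID", "race", "age")
    assign(paste0("lote", i), lote, envir = caller)
  }
  invisible(NULL)
}

# Demo with made-up files in a temporary directory
demo_dir <- tempfile("lote_")
dir.create(demo_dir)
for (j in 1:2) {
  write.table(as.data.frame(matrix(1:14, nrow = 2)),
              file.path(demo_dir, paste0("animal_grp", j, ".txt")),
              sep = "\t", quote = FALSE, row.names = FALSE)
}
import_lote(1:2, path = demo_dir)
```

After the call, lote1 and lote2 exist in the calling environment, while the i, fileName and lote used inside the function do not.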

6.4 years ago by Benn 8.4k