R error with lm: object of type 'closure' is not subsettable
5.4 years ago

Hello, I'm trying to automate, over a set of files, fitting a linear regression (lm) and drawing the corresponding graph. Example for one file:

score;EGFR_12;EGFR_24;EGFR_36;EGFR_48;EGFR_60;pa
-0,5992442;67,56217938;53,61312383;52,93430604;;;1
-0,6795702;53,28459074;57,23583761;43,94840102;51,36407098;;2

R code:

if (!require(devtools)) {
  install.packages("devtools")
  library(devtools)
}
install_github("larmarange/JLutils")

library(ggplot2)
library(JLutils)

ggplotRegression <- function (fit) {

  require(ggplot2)

  ggplot(fit$model, aes_string(x = names(fit$model)[2], y = names(fit$model)[1])) + 
    geom_point() +
    stat_smooth(method = "lm", col = "red") +
    labs(title = paste("Adj R2 = ",signif(summary(fit)$adj.r.squared, 5),
                       "Intercept =",signif(fit$coef[[1]],5 ),
                       " Slope =",signif(fit$coef[[2]], 5),
                       " P =",signif(summary(fit)$coef[2,4], 5)))
}


setwd("/Users/Desktop/global")
files <- list.files(path = "data", pattern = (".csv$"))

for (k in 1:length(files)) {
  fname <- files[k]
  cat(paste0("Now analyse data/", fname, "...\n"))
  data <- read.csv2(paste0("data/", fname), header = T, stringsAsFactors = F, dec = ",")
  x<-summary(lm(data$EGFR_12 ~ data$score))$coefficients
  write.csv(x, file = paste0("summary/summary_egfr12", fname))


  p1<- ggplotRegression(lm(data$EGFR_12 ~ data$score))
  rm(data, x)
  Sys.sleep(5)


  y<-summary(lm(data$EGFR_24 ~ data$score))$coefficients
  write.csv(y, file = paste0("summary/summary_egfr24", fname))
  p2<- ggplotRegression(lm(data$EGFR_24 ~ data$score))

  rm(data, y)
  Sys.sleep(5)


  z<-summary(lm(data$EGFR_36 ~ data$score))$coefficients
  write.csv(z, file = paste0("summary/summary_egfr36", fname))
  p3<- ggplotRegression(lm(data$EGFR_36 ~ data$score))

  rm(data, z)
  Sys.sleep(5)


  a<-summary(lm(data$EGFR_48 ~ data$score))$coefficients
  write.csv(a, file = paste0("summary/summary_egfr48", fname))
  p4<- ggplotRegression(lm(data$EGFR_48 ~ data$score))

  rm(data, a)
  Sys.sleep(5)


  b<-summary(lm(data$EGFR_60 ~ data$score))$coefficients
  write.csv(b, file = paste0("summary/summary_egfr60", fname))
  p5 <- ggplotRegression(lm(data$EGFR_60 ~ data$score))

  rm(data, b)
  Sys.sleep(5)

  png(file = paste0("graphe_png/", fname), width = 350, height = "350")
  multiplot(p1, p2, p3, p4, p5, cols = 5)
  dev.off()

  pdf(file = paste0("graphe_pdf/", fname))
  multiplot(p1, p2, p3, p4, p5, cols = 5)
  dev.off()

}

So I tried to write a loop that, for each file in my data folder, fits an lm and then draws a graph by calling the ggplotRegression function.

Error:

Now analyse data/commun_GLOBALalloscore_norm-imput-avechla.csv...
Error in data$EGFR_24 : object of type 'closure' is not subsettable

How can I solve this error?

Thank you in advance

R • 3.9k views

What I'm trying to do: I have a set of CSV files in a directory. For each file, I want to fit a linear regression for each of EGFR_12 ~ score, EGFR_24 ~ score, EGFR_36 ~ score, EGFR_48 ~ score and EGFR_60 ~ score, and save each lm plot (annotated with R2 and p-value) in a variable. Then I want to make a multiplot of all the previously saved plots. Despite my research, I can't get this to work.


See my answer. The issue is that you delete the value of data even though your code (the loop) needs it all the time.


Despite spending time rewriting the script thanks to your answers, I still can't get it to work.

I also tried:

setwd("/Users/amandinelecerfdefer/Desktop/global")
library(ggplot2)
ggplotRegression <- function (fit) {

  require(ggplot2)

  ggplot(fit$model, aes_string(x = names(fit$model)[2], y = names(fit$model)[1])) + 
    geom_point() +
    stat_smooth(method = "lm", col = "red") +
    labs(title = paste("Adj R2 = ",signif(summary(fit)$adj.r.squared, 5),
                       "Intercept =",signif(fit$coef[[1]],5 ),
                       " Slope =",signif(fit$coef[[2]], 5),
                       " P =",signif(summary(fit)$coef[2,4], 5)))
}
files <- list.files(path = "data", pattern = (".csv$"))

for (k in 1:length(files)) {
  fname <- files[k]
  cat(paste0("Now analyse data/", fname, "...\n"))
  data <- read.csv2(paste0("data/", fname), header = T, stringsAsFactors = F, dec = ",")
  head(data)


  fit1 <- lm(EGFR_12 ~ score, data = data)
  x<-summary(fit1)$coefficients
  write.csv(x, file = paste0("summary/summary_egfr12", fname))
  p1<-ggplotRegression(fit1)
  fit2 <- lm(EGFR_24 ~ score, data = data)
  x<-summary(fit1)$coefficients
  write.csv(y, file = paste0("summary/summary_egfr24", fname))

  p2<- ggplotRegression(fit2)
  fit3 <- lm(EGFR_36 ~ score, data = data)
  p3<- ggplotRegression(fit3)
  fit4 <- lm(EGFR_48 ~ score, data = data)
  p4<- ggplotRegression(fit4)
  fit5 <- lm(EGFR_60 ~ score, data = data)
  p5<- ggplotRegression(fit5)
}

But I can't produce the multiplot in a PDF or a PNG.


Please stop using the answer field to add details. Use ADD REPLY. "Can't do" is most uninformative. What is the problem? There is no point in adding so much code. Go through your code step by step without running the loop: set k <- 1 and then execute the commands one after another. Find the part that is causing trouble, then try to explain it here without adding a lot of code. Try to focus on the essential problem.

5.4 years ago
ATpoint 85k

Your code is highly redundant. The operations summary/write.csv/ggplotRegression appear multiple times. It would be smarter to write a single function that contains this basic workflow and then use loops or apply-like commands to run it on your data or on different columns of the csv you load. That saves you from having to change the same command in several places if you ever need to modify this workflow.
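As a rough illustration of that idea, here is a minimal sketch using base R only. The helper name fit_one, the toy data frame and its values are made up for the example and are not taken from your files; plotting and write.csv are left out to keep it short.

```r
# Hypothetical helper (name is illustrative): fit one response column
# against a predictor and return the coefficient table of the model.
fit_one <- function(df, response, predictor = "score") {
  # reformulate() builds the formula `response ~ predictor` from strings
  fit <- lm(reformulate(predictor, response = response), data = df)
  summary(fit)$coefficients
}

# Toy data standing in for one CSV file (values are invented)
my.data <- data.frame(
  score   = c(-0.60, -0.68, -0.10, 0.25, 0.40),
  EGFR_12 = c(67.6, 53.3, 60.1, 58.2, 49.9),
  EGFR_24 = c(53.6, 57.2, 55.0, 51.3, 47.8)
)

responses <- c("EGFR_12", "EGFR_24")

# One call per column instead of one copy-pasted block per column;
# setNames() makes the result a list named after the response columns.
coef_tables <- lapply(setNames(responses, responses), fit_one, df = my.data)

print(coef_tables$EGFR_12)
```

The same pattern would extend to the remaining EGFR columns by adding their names to responses, and the body of fit_one is the single place to add a write.csv or plotting call.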

The error you get generally means that you are trying to subset a function. I am not sure I understand your code. After the first code block within for (k in 1:length(files)) { you do a couple of things on data and then call rm(data). Then you run y<-summary(lm(data$EGFR_24 ~ data$score))$coefficients without loading anything into the data variable again. With the data frame gone, data is now interpreted as the function utils::data, and this error comes up. Check why you remove the data variable after this first code chunk.
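The mechanism can be reproduced in a few lines. This is a sketch with an invented two-row data frame, assuming the utils package is attached (the default in a normal R session); tryCatch is used only so the script keeps running and the message can be printed.

```r
# A small data frame stored under the name `data` (values are invented)
data <- data.frame(score = c(1, 2), EGFR_24 = c(50, 55))
print(data$EGFR_24)     # works: `data` is the data frame here

rm(data)                # the data frame is removed ...
print(is.function(data))  # ... so `data` now resolves to utils::data

# Subsetting a function with `$` fails; capture the error message
msg <- tryCatch(data$EGFR_24, error = function(e) conditionMessage(e))
print(msg)
```

Re-reading the file into the variable before each use, or simply not calling rm() until the end of the loop body, avoids the fallback to the function.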

Generally it is not recommended to use variable names such as data, sum, apply or mean as all of these also represent function names. Use something unique, such as my.data or tmp.data to avoid this misinterpretation.
