Hello, I'm trying to automate, over a set of files, fitting a linear regression (lm) and drawing the corresponding plot. Example for one file:
score;EGFR_12;EGFR_24;EGFR_36;EGFR_48;EGFR_60;pa
-0,5992442;67,56217938;53,61312383;52,93430604;;;1
-0,6795702;53,28459074;57,23583761;43,94840102;51,36407098;;2
R code:
if (!require(devtools)) {
  install.packages("devtools")
  library(devtools)
}
install_github("larmarange/JLutils")
library(ggplot2)
library(JLutils)
ggplotRegression <- function (fit) {
  require(ggplot2)
  ggplot(fit$model, aes_string(x = names(fit$model)[2], y = names(fit$model)[1])) +
    geom_point() +
    stat_smooth(method = "lm", col = "red") +
    labs(title = paste("Adj R2 = ", signif(summary(fit)$adj.r.squared, 5),
                       "Intercept =", signif(fit$coef[[1]], 5),
                       " Slope =", signif(fit$coef[[2]], 5),
                       " P =", signif(summary(fit)$coef[2,4], 5)))
}
setwd("/Users/Desktop/global")
files <- list.files(path = "data", pattern = (".csv$"))
for (k in 1:length(files)) {
  fname <- files[k]
  cat(paste0("Now analyse data/", fname, "...\n"))
  data <- read.csv2(paste0("data/", fname), header = T, stringsAsFactors = F, dec = ",")
  x <- summary(lm(data$EGFR_12 ~ data$score))$coefficients
  write.csv(x, file = paste0("summary/summary_egfr12", fname))
  p1 <- ggplotRegression(lm(data$EGFR_12 ~ data$score))
  rm(data, x)
  Sys.sleep(5)
  y <- summary(lm(data$EGFR_24 ~ data$score))$coefficients
  write.csv(y, file = paste0("summary/summary_egfr24", fname))
  p2 <- ggplotRegression(lm(data$EGFR_24 ~ data$score))
  rm(data, y)
  Sys.sleep(5)
  z <- summary(lm(data$EGFR_36 ~ data$score))$coefficients
  write.csv(z, file = paste0("summary/summary_egfr36", fname))
  p3 <- ggplotRegression(lm(data$EGFR_36 ~ data$score))
  rm(data, z)
  Sys.sleep(5)
  a <- summary(lm(data$EGFR_48 ~ data$score))$coefficients
  write.csv(a, file = paste0("summary/summary_egfr48", fname))
  p4 <- ggplotRegression(lm(data$EGFR_48 ~ data$score))
  rm(data, a)
  Sys.sleep(5)
  b <- summary(lm(data$EGFR_60 ~ data$score))$coefficients
  write.csv(b, file = paste0("summary/summary_egfr60", fname))
  p5 <- ggplotRegression(lm(data$EGFR_60 ~ data$score))
  rm(data, b)
  Sys.sleep(5)
  png(file = paste0("graphe_png/", fname), width = 350, height = "350")
  multiplot(p1, p2, p3, p4, p5, cols = 5)
  dev.off()
  pdf(file = paste0("graphe_pdf/", fname))
  multiplot(p1, p2, p3, p4, p5, cols = 5)
  dev.off()
}
So I tried to write a loop that, for each file in my data folder, fits an lm and then draws a graph by calling the ggplotRegression function.
Error:
Now analyse data/commun_GLOBALalloscore_norm-imput-avechla.csv...
Error in data$EGFR_24 : object of type 'closure' is not subsettable
How can I solve this error?
Thank you in advance
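A note on the error message itself, as a minimal sketch on a toy data frame (the values are made up, not taken from the real files). In R, data is also the name of a function in the utils package. Once rm(data) has removed the data frame, the name data resolves to that function (a "closure"), and $ cannot be applied to a function:

```r
# Toy data frame standing in for one CSV file (made-up values).
data <- data.frame(score   = c(-0.6, -0.7, -0.5),
                   EGFR_12 = c(67.6, 53.3, 60.1),
                   EGFR_24 = c(53.6, 57.2, 55.0))

was_df <- is.data.frame(data)   # the name refers to our data frame

rm(data)                        # same effect as rm(data, x) in the loop

# With the data frame gone, the name is looked up further along the
# search path and finds the function utils::data instead.
now_fn <- is.function(data)

# Applying $ to a function raises the error quoted in the post.
msg <- tryCatch(data$EGFR_24, error = function(e) conditionMessage(e))
print(c(was_df = was_df, now_fn = now_fn))
print(msg)
```

Giving the data frame a name that does not shadow a function (dat, for example) would turn the same mistake into a more direct "object not found" error.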
What I'm trying to do: I have a set of CSV files in a directory. For each file, I want to fit a linear regression for EGFR_12 ~ score, EGFR_24 ~ score, EGFR_36 ~ score, EGFR_48 ~ score, and EGFR_60 ~ score, and save each lm plot in a variable (?), with R2 and the p-value shown on the graph. Then I want to make a multiplot of all the previously saved plots. But despite my research, I can't do what I want.
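The per-column repetition described here could be sketched as a loop over column names that stores every fit in a named list. This is only an illustration under assumptions: fit_all is a hypothetical helper, the toy data are simulated, and only base R is used so the sketch runs without ggplot2.

```r
# Hypothetical helper: fit one model per outcome column, return a named list.
fit_all <- function(df, outcomes, predictor = "score") {
  fits <- lapply(outcomes, function(y) {
    # reformulate() builds the formula, e.g. EGFR_12 ~ score, from strings
    lm(reformulate(predictor, response = y), data = df)
  })
  names(fits) <- outcomes
  fits
}

# Simulated data standing in for one file (made-up values).
set.seed(1)
df <- data.frame(score = rnorm(20))
outcomes <- paste0("EGFR_", c(12, 24, 36, 48, 60))
for (y in outcomes) df[[y]] <- 50 + 5 * df$score + rnorm(20)

fits <- fit_all(df, outcomes)

# One row per outcome: adjusted R2, slope, and p-value of the slope.
stats <- do.call(rbind, lapply(names(fits), function(y) {
  s <- summary(fits[[y]])
  data.frame(outcome = y,
             adj_r2  = s$adj.r.squared,
             slope   = coef(fits[[y]])[[2]],
             p_value = s$coefficients[2, 4])
}))
print(stats)
```

Because each model is fitted with data = df, fit$model keeps plain column names, so a plotting function such as ggplotRegression from the post could be applied to every element of fits with lapply. Rows with missing values are dropped per model by lm's default na.action.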
See my answer. The issue is that you delete the value of data even though your code (the loop) needs it all the time.
Despite spending time rewriting the script thanks to your answers, I still can't do it.
I also tried:
But I can't get the multiplot into a PDF or a PNG.
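One way to get several regression panels into a single file, sketched with base graphics rather than ggplot2 and multiplot so that it needs no extra packages. The data are simulated and the file name is hypothetical. The pattern is to open the device, draw everything, then close it with dev.off(). Note that the loop in the post reuses fname, which ends in .csv, as the image file name; the sketch swaps the extension.

```r
# Simulated data standing in for one CSV file (made-up values).
set.seed(42)
df <- data.frame(score = rnorm(30))
outcomes <- paste0("EGFR_", c(12, 24, 36, 48, 60))
for (y in outcomes) df[[y]] <- 50 + 4 * df$score + rnorm(30)

# Hypothetical file name; replace the .csv extension with .pdf.
fname <- "example.csv"
out_file <- file.path(tempdir(), sub("\\.csv$", ".pdf", fname))

pdf(out_file, width = 15, height = 3.5)      # size in inches
op <- par(mfrow = c(1, length(outcomes)))    # one row, five panels
for (y in outcomes) {
  fit <- lm(reformulate("score", response = y), data = df)
  plot(df$score, df[[y]], xlab = "score", ylab = y,
       main = paste("Adj R2 =", signif(summary(fit)$adj.r.squared, 3)))
  abline(fit, col = "red")                   # fitted regression line
}
par(op)
dev.off()   # the file is only complete once the device is closed
```

The same open, draw, close sequence applies to png(), where width and height are numbers in pixels by default. The output directory has to exist before the device is opened.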
Please stop using the answer field to add details. Use ADD REPLY. "Can't do" is most uninformative. What is the problem? There is no point in adding so much code. Go through your code step by step without running the loop. Do it sequentially and find the part that is causing trouble. Set k <- 1 and then execute the commands one after another. Check where the problem is, then try to explain it here without adding a lot of code. Try to focus on the essential problem.
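The step-by-step approach suggested above might look like the following sketch. It builds a throwaway folder with one semicolon-separated file (made-up values, same layout as the example in the question) so that it can run anywhere; data_dir and dat are names chosen for illustration.

```r
# Throwaway folder with one small file in the same ';' and ',' format.
data_dir <- file.path(tempdir(), "data_step_demo")
dir.create(data_dir, showWarnings = FALSE)
writeLines(c("score;EGFR_12;EGFR_24;pa",
             "-0,59;67,56;53,61;1",
             "-0,67;53,28;57,23;2",
             "-0,41;60,10;;3"),
           file.path(data_dir, "example.csv"))

files <- list.files(path = data_dir, pattern = "\\.csv$")

# Instead of running the loop, fix the index and run the body by hand.
k <- 1
fname <- files[k]
dat <- read.csv2(file.path(data_dir, fname), stringsAsFactors = FALSE)

# Inspect each intermediate object before moving to the next command.
str(dat)
stopifnot(is.data.frame(dat), "EGFR_24" %in% names(dat))

fit <- lm(EGFR_24 ~ score, data = dat)
print(coef(fit))
```

Running the lines one at a time shows which command first fails, and checks such as is.data.frame(dat) confirm that an object still is what the next line expects.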