Error when looping over multiple columns in a data frame in R
6 months ago

I am trying to obtain an optimal cutoff value for each of several variables (columns 3:69) of a data frame named Data. This is what the data looks like: [image: The data]. I want to loop over columns 3:69 and, for each one, find the optimal cutoff that discriminates between patient statuses (the Label column) using the Youden method. This is the code I am using:

for (i in 1:ncol(Data)) {optimal.cutpoints(X =i, status = 'Label', 
                                         tag.health = 'Control',
                                         method = 'Youden', data = Data)}

I am not sure whether i in 1:ncol(Data) is the right thing to use here. When I run the code, the following error appears:

Error: Not all needed variables are supplied in 'data'.

I understand that R did not find the variables in the data frame (Data). But how can I write the for loop so that R applies the cutoff function to all columns at once, given that the function needs an X value, which I assumed should be the loop variable i?

Thanks

Statistics R • 689 views
6 months ago
Jeremy ▴ 930

First of all, the arguments are named "tag.healthy" and "methods", not "tag.health" and "method". Also, optimal.cutpoints() wants X to be either a character string (a column name) or a formula. Right now, you're passing it i, which is a number. Try the following code:
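To illustrate the index-versus-name point, here is a minimal base-R sketch. The toy data frame and its column names (ID, Marker1, Marker2) are made up for illustration; only the Label column and the "Control" tag come from the question. It also shows why 1:ncol(Data) is probably not what you want: it would include the first two columns, which are not markers.

```r
# Toy stand-in for Data (hypothetical columns and values; the real Data has 69 columns).
Data <- data.frame(
  ID      = 1:4,
  Label   = c("Control", "Case", "Control", "Case"),
  Marker1 = c(0.2, 1.4, 0.3, 1.1),
  Marker2 = c(5.0, 9.1, 4.7, 8.8)
)

# Start at column 3 so the ID and Label columns are skipped.
for (i in 3:ncol(Data)) {
  col_name <- colnames(Data)[i]  # translate the numeric index into a column name
  cat(i, "->", col_name, "\n")
}

# The same names, without a loop.
marker_names <- colnames(Data)[3:ncol(Data)]
```

A name such as col_name is the kind of value X expects; the bare number i is not.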

First, initialize a list to store the results.

cutpoint_results <- list()

Then loop through each column and apply optimal.cutpoints().

for (col_name in colnames(Data[3:ncol(Data)])) {
  cutpoint_results[[col_name]] <- optimal.cutpoints(
    X = col_name,
    status = "Label",
    tag.healthy = "Control",
    methods = "Youden",
    data = Data
  )
}

Actually accessing the cutoff values takes a little digging:

cut.list = list()

for (k in seq_along(cutpoint_results)) {
  # [[k]] rather than [k], so a variable with more than one optimal cutoff is stored whole
  cut.list[[k]] = cutpoint_results[[k]][['Youden']][['Global']][['optimal.cutoff']][['cutoff']]
}

names(cut.list) = names(cutpoint_results)
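
If you want to sanity-check one of the extracted values, the Youden criterion itself is simple to sketch in base R: pick the cutoff that maximizes sensitivity + specificity - 1. The function youden_cutoff() below is a hypothetical helper, not part of OptimalCutpoints, and the values are made up. It assumes that higher marker values indicate the non-healthy group and that a value at or above the cutoff counts as positive, so it may not match the package exactly in edge cases such as ties.

```r
# Hypothetical helper: Youden-optimal cutoff for a single marker, base R only.
# Assumption: higher values of x indicate the non-healthy group.
youden_cutoff <- function(x, status, tag.healthy) {
  healthy <- status == tag.healthy
  cutoffs <- sort(unique(x))  # every observed value is a candidate cutoff
  j <- vapply(cutoffs, function(cut) {
    sens <- mean(x[!healthy] >= cut)  # non-healthy subjects classified as positive
    spec <- mean(x[healthy] < cut)    # healthy subjects classified as negative
    sens + spec - 1                   # Youden index J
  }, numeric(1))
  cutoffs[which.max(j)]  # first cutoff reaching the maximum J
}

# Made-up example values.
x      <- c(0.2, 0.3, 0.5, 1.1, 1.4, 0.9)
status <- c("Control", "Control", "Control", "Case", "Case", "Case")
best   <- youden_cutoff(x, status, tag.healthy = "Control")
best
```

In this toy example the two groups are perfectly separated, so the smallest "Case" value is returned.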

Dear Jeremy, thanks. What I could not understand is why you wrote cutpoint_results[[col_name]]. Is it because the results are a list? Also, now that I have a list of the cutoffs (cut.list), how can I export it as an Excel table?

Thanks


cutpoint_results[[col_name]] stores each new result in the list named "cutpoint_results" under the name of the column it came from, so the original column names from the Data data frame are kept. To output a file that can be opened in Excel, you can use as.data.frame(), followed by write.csv(). I always like to set row.names = FALSE for the latter.
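
A minimal sketch of both points, using base R only. The marker names and cutoff values are made up and stand in for the real cut.list, and the output file name cutoffs.csv is just an example. Instead of as.data.frame(), which would give one wide row, this version builds a long table with one row per cutoff; that is a layout choice, not a requirement.

```r
# Hypothetical cutoffs standing in for the values extracted from cutpoint_results.
made_up <- c(Marker1 = 0.42, Marker2 = 1.37, Marker3 = 12.5)

# Assigning with [[name]] creates (or overwrites) the list element with that name.
cut.list <- list()
for (col_name in names(made_up)) {
  cut.list[[col_name]] <- made_up[[col_name]]
}

# Long format: one row per cutoff. rep() with lengths() keeps the names aligned
# even if a variable has more than one optimal cutoff.
cut_df <- data.frame(
  variable = rep(names(cut.list), lengths(cut.list)),
  cutoff   = unlist(cut.list, use.names = FALSE)
)

# Written to a temporary directory here; use any path you like.
out_file <- file.path(tempdir(), "cutoffs.csv")
write.csv(cut_df, out_file, row.names = FALSE)

# Read it back to check what Excel will see.
check <- read.csv(out_file)
check
```

The resulting .csv file opens directly in Excel.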
