I need to write a looping function in R that removes the rows that are completely missing within each group of columns, for every group in the data.
# Group the data frame columns into triplicates
grouping_data <- function(df) { # df = data frame
  df_col <- ncol(df) # number of columns in the data frame
  groups <- rep(0:((df_col / 3) - 1), each = 3) # group label per column; assumes ncol(df) is a multiple of 3
  id <- list() # empty list for the column indices
  for (i in seq_along(unique(groups))) {
    id[[i]] <- which(groups == unique(groups)[i]) # column indices of group i
  }
  names(id) <- paste0("id", unique(groups)) # name the index list by group
  data <- list() # empty list for the grouped data frames
  for (i in seq_along(id)) {
    data[[i]] <- df[, id[[i]], drop = FALSE] # columns of group i, kept as a data frame
  }
  names(data) <- paste0("data", unique(groups)) # name the data list by group
  return(data)
}
new <- grouping_data(data_input)
With the code above, I have created a list of n groups, each of which has 3 columns.
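To illustrate the structure of that list, here is a minimal sketch on a made-up 6-column data frame (the name toy and its values are hypothetical, not my real data). It assumes each consecutive block of three columns forms one group:

```r
# Toy data frame with 6 columns, i.e. 2 triplicate groups (made-up values)
toy <- data.frame(a1 = 1:2, a2 = 3:4, a3 = 5:6,
                  b1 = 7:8, b2 = 9:10, b3 = 11:12)

# Group label for each column: 0 0 0 1 1 1
groups <- rep(seq_len(ncol(toy) / 3) - 1, each = 3)

# Column indices per group, then the matching 3-column data frames
idx <- split(seq_len(ncol(toy)), groups)
toy_groups <- lapply(idx, function(cols) toy[, cols, drop = FALSE])
names(toy_groups) <- paste0("data", names(idx))

names(toy_groups)        # "data0" "data1"
sapply(toy_groups, ncol) # 3 3
```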
My next step is to remove the rows that are completely missing within a group, in every one of the n groups. A row with only 1 or 2 missing values among a group's 3 columns is kept; only a row with all 3 values missing is removed. I then want to combine the results for all groups with a logical AND, so that a row is kept only if it passes in every group, and end up with a single data frame. All the groups have the same number of rows.
library(dplyr) # provides %>%, filter() and if_all()

test_data <- function(x) {
  x %>%
    dplyr::filter(
      # First group: drop rows where columns 1, 2 and 3 are all NA
      !dplyr::if_all(.cols = c(1, 2, 3), .fns = is.na),
      # Second group: drop rows where columns 4, 5 and 6 are all NA
      !dplyr::if_all(.cols = c(4, 5, 6), .fns = is.na)
    )
}
data_new <- test_data(data_input)
I have tried the code above on a 6-column data frame (2 groups).
However, since I am working with mass spectrometry proteomics expression datasets, the number of groups may differ from one dataset to another. The important point is that every group always has exactly 3 columns. I have attached an image of an example dataset.
So could you please provide an R loop that solves this problem and works for datasets with any number of columns?
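To make the goal concrete, here is a minimal sketch of the kind of generalization I have in mind. It is not a definitive solution: it assumes the columns are ordered so that each consecutive block of three forms one group and that the column count is a multiple of 3. The helper name filter_all_na_groups, its group_size argument, and the toy data are hypothetical, and it uses base R rather than dplyr:

```r
# Hypothetical helper: keep a row only if no triplicate group is entirely NA.
# Assumes consecutive blocks of `group_size` columns form one group.
filter_all_na_groups <- function(df, group_size = 3) {
  stopifnot(ncol(df) %% group_size == 0)
  # Group label for each column: 1 1 1 2 2 2 ...
  groups <- rep(seq_len(ncol(df) / group_size), each = group_size)
  # One logical vector per group: TRUE where the row has at least one non-NA value
  keep_per_group <- lapply(split(seq_len(ncol(df)), groups), function(cols) {
    rowSums(!is.na(df[, cols, drop = FALSE])) > 0
  })
  # Combine the per-group vectors with a logical AND
  keep <- Reduce(`&`, keep_per_group)
  df[keep, , drop = FALSE]
}

# Made-up toy data: 2 groups of 3 columns
toy <- data.frame(
  a1 = c(1, NA, 3, NA), a2 = c(NA, NA, 2, 5), a3 = c(4, NA, 1, NA),
  b1 = c(2, 7, NA, NA), b2 = c(NA, 8, NA, NA), b3 = c(6, NA, NA, 9)
)
result <- filter_all_na_groups(toy)
```

On this toy data, row 2 is dropped because its first group is entirely NA and row 3 is dropped because its second group is entirely NA. Rows 1 and 4 have only partial missingness in each group, so they are kept.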
Can you provide a small reproducible example dataset, and an example of what you want the output to look like? You can share the data by using the dput function on the dataframes and copy/pasting the code here.
Thank you sir, for your interest.
My example data set is,
And I want to get the result file like,
I am eagerly awaiting your answer.