How do I remove columns of a dataframe based on the values of other dataframes?
2.9 years ago

If more than 50% of the subjects in BOTH groups (p0 and p1) have a p-value > 0.05, I want to remove the exon probe (columns) from the e dataframe.

p0 and p1 are subsets of the p dataframe, split by class membership (0 or 1), so they are mutually exclusive. Both contain p-values.

My code returns an empty del and does not change the dimensions of the e dataframe.

del <- c()

for(col in colnames(e)){
  #Check if the column is in both p dataframes
  isinp0 <- col %in% colnames(p0)
  isinp1 <- col %in% colnames(p1)

  #If it is not in the dataframe, then return false for the check. Otherwise, conduct the check and return its value
  #If you sum over T/F values in R, it returns the total number true
  #Take the number of successes and divide by the total number to get percentages
  isvalidp0 <- ifelse(!isinp0,F,sum(p0[col] < 0.05)/length(p0) > 0.5)
  isvalidp1 <- ifelse(!isinp1,F,sum(p1[col] < 0.05)/length(p1) > 0.5)

  if(isvalidp0 | isvalidp1){
    #Do nothing
  } else { #Mark column for deletion
    del <- c(del, col)
  }
}

#Select only those columns that do not need to be deleted
e <- e[!colnames(e) %in% cols_to_delete]

e=structure(list(JHU_113_2.CEL = c(4.21222, 1.46773, 6.28274, 4.27911,
5.81678), JHU_144.CEL = c(4.24054, 1.6898, 6.79161, 3.53146,
5.71165), JHU_173.CEL = c(3.55855, 1.54697, 6.11265, 3.83499,
6.02794), JHU_176R.CEL = c(4.57541, 1.75198, 6.13997, 3.71238,
5.37082), JHU_182.CEL = c(4.50411, 1.35377, 6.68056, 3.38309,
5.95527)), row.names = c(2315252L, 2315253L, 2315374L, 2315375L,
2315376L), class = "data.frame")
p0=structure(list(JHU_144.CEL = c(0.04224, 0.38068, 0.00293, 0.29977,
0.01525), JHU_186.CEL = c(0.03532, 0.28369, 0.00788, 0.10076,
0.03559), JHU_205.CEL = c(0.21461, 0.97292, 0.0672, 0.01755,
0.05689), JHU_210.CEL = c(0.36106, 0.33458, 0.00116, 0.07026,
0.11264), JHU_211R3.CEL = c(0.1347, 0.64219, 0.00873, 0.24551,
0.02603)), row.names = c(2315252L, 2315253L, 2315374L, 2315375L,
2315376L), class = "data.frame")
p1=structure(list(JHU_113_2.CEL = c(0.09655, 0.64864, 0.0073, 0.11744,
0.04079), JHU_173.CEL = c(0.22314, 0.49589, 0.03034, 0.21102,
0.0309), JHU_176R.CEL = c(0.03202, 0.38359, 0.02571, 0.21728,
0.08493), JHU_182.CEL = c(0.03889, 0.9356, 0.00436, 0.33313,
0.01303), JHU_187.CEL = c(0.06716, 0.39982, 0.0052, 0.32012,
0.02163)), row.names = c(2315252L, 2315253L, 2315374L, 2315375L,
2315376L), class = "data.frame")

Can you give small examples of your dataframes p0, p1 and e?


Added the examples


The code you provided works perfectly on the example data on my machine (except for cols_to_delete, which I assumed is del).
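
One possible explanation for the difference between the example and the full data, offered as a guess rather than a confirmed diagnosis: length() applied to a data frame returns the number of columns, not the number of rows. In the posted example both are 5, so sum(p0[col] < 0.05)/length(p0) happens to be a true proportion. If the real p0 and p1 have many more rows than columns, that ratio can exceed 0.5 for nearly every column, so nothing gets marked for deletion. Below is a minimal sketch with made-up toy data (the sample names S1 to S3 and all values are hypothetical), using colMeans() so that the denominator is the number of rows:

```r
# Hypothetical toy data: 6 probes (rows), so nrow differs from ncol
p0 <- data.frame(S1 = c(0.01, 0.20, 0.30, 0.40, 0.02, 0.60),
                 S2 = c(0.01, 0.02, 0.03, 0.04, 0.50, 0.60))
p1 <- data.frame(S3 = c(0.90, 0.80, 0.70, 0.01, 0.02, 0.60))
e  <- data.frame(S1 = 1:6, S2 = 1:6, S3 = 1:6)

length(p0)  # 2: the number of columns of the data frame
nrow(p0)    # 6: the number of p-values in each column

# The original ratio for S1 divides by the column count:
sum(p0["S1"] < 0.05) / length(p0)  # 2 / 2 = 1, although only 2 of 6 values are < 0.05

# Fraction of p-values below 0.05 in each sample column
frac0 <- colMeans(p0 < 0.05)
frac1 <- colMeans(p1 < 0.05)

# Keep a column if more than half of its p-values are below 0.05
# in whichever group it belongs to; everything else is deleted
keep <- c(names(frac0)[frac0 > 0.5], names(frac1)[frac1 > 0.5])
del  <- setdiff(colnames(e), keep)

# drop = FALSE keeps the result a data frame even if one column remains
e_filtered <- e[, !colnames(e) %in% del, drop = FALSE]
```

With this toy data only S2 passes the threshold, so del contains S1 and S3. Replacing length(p0) and length(p1) with nrow(p0) and nrow(p1) in the original loop should give the same result, assuming the intent is the fraction of p-values per column.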
