Hi everyone, I am working on a TCGA cancer cohort where I have RNA counts and clinical information merged into one big file. I want to split this file into higher- and lower-expression groups based on the median value of one gene. I tried two R scripts; unfortunately, neither works as I expected. The first script splits the data frame but keeps only the gene count matrix, with no matched clinical information, which is not what I expected. The second one is so memory-intensive that it takes ages to run and then fails with an error.
First:
med<-median(df2$gene)
upper_median<-df[which(df2$gene >= med)]
lower_median<-df[which(df2$gene < med)]
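For comparison, here is a minimal sketch of row-wise subsetting, assuming a merged data frame with one row per sample. The toy `df` and its columns `sample`, `gene` and `stage` are hypothetical stand-ins, not the real TCGA data. In base R, `df[i]` with a single index selects columns, whereas `df[i, ]` with a comma selects rows and keeps every column, which may be relevant to the missing clinical columns:

```r
# Toy stand-in for the merged table: one row per sample (hypothetical data).
df <- data.frame(
  sample = paste0("S", 1:6),
  gene   = c(5, 12, 7, 30, 1, 18),              # counts for the gene of interest
  stage  = c("I", "II", "I", "III", "II", "IV"), # example clinical column
  stringsAsFactors = FALSE
)

med <- median(df$gene, na.rm = TRUE)

# The comma after the row index selects rows and keeps all columns.
# which() drops any NA comparisons instead of returning NA rows.
upper_median <- df[which(df$gene >= med), ]
lower_median <- df[which(df$gene <  med), ]
```

Both results keep the clinical column alongside the counts, because only rows are filtered.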
Second:
med<-median(df2$gene)
upper<-split(df, which(df$gene >= med), drop = TRUE)
lower<-split(df, which(df$gene < med), drop = TRUE)
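A minimal sketch of how `split()` could be used for this, assuming a merged table with one row per sample (the toy `df`, its column names, and the `group` and `halves` variables are hypothetical). `split()` expects a grouping vector with one label per row; `which()` returns row positions instead, so each distinct position may end up treated as its own group, which could explain the long run time:

```r
# Toy stand-in for the merged table: one row per sample (hypothetical data).
df <- data.frame(
  sample = paste0("S", 1:6),
  gene   = c(5, 12, 7, 30, 1, 18),
  stage  = c("I", "II", "I", "III", "II", "IV"),
  stringsAsFactors = FALSE
)

med <- median(df$gene, na.rm = TRUE)

# One label per row, so split() returns a named list of two data frames.
group  <- ifelse(df$gene >= med, "upper", "lower")
halves <- split(df, group)

upper <- halves$upper
lower <- halves$lower
```

A single call produces both halves, and each keeps all columns of the original table.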
Any idea what I am missing or doing wrong?
Thank you very much! Imran