2.0 years ago
prithvi.mastermind
I'm interested in plotting Kaplan-Meier (KM) curves for my gene expression data. The data were normalized, log-transformed, z-score transformed, and batch-corrected. I'm therefore splitting the samples into low, mid, and high groups based on each gene's expression value (z-score cutoffs of -1.96 and 1.96, as in the code below) and plotting a KM survival curve for each group. Is my approach correct?
library(survival)
library(survminer)
library(dplyr)
library(survMisc)
library(GGally)
library(survcomp)
library(RegParallel)
JMS <- read.delim(file = "LUAD_SURVIVAL.txt")
highExpr <- 1.96
lowExpr <- -1.96
JMS$LRRK2<- ifelse(JMS$LRRK2 >= highExpr, 'High', ifelse(JMS$LRRK2 <= lowExpr, 'Low', 'Mid'))
JMS$SNCA<- ifelse(JMS$SNCA >= highExpr, 'High', ifelse(JMS$SNCA <= lowExpr, 'Low', 'Mid'))
JMS$TDRKH<- ifelse(JMS$TDRKH >= highExpr, 'High', ifelse(JMS$TDRKH <= lowExpr, 'Low', 'Mid'))
JMS$LRRK2 <- factor(JMS$LRRK2, levels = c('Mid', 'Low', 'High'))
JMS$SNCA <- factor(JMS$SNCA, levels = c('Mid', 'Low', 'High'))
JMS$TDRKH <- factor(JMS$TDRKH, levels = c('Mid', 'Low', 'High'))
ggsurvplot(survfit(Surv(Time, Status) ~ LRRK2, data = JMS), data = JMS, risk.table = TRUE, pval = TRUE, break.time.by = 500, ggtheme = theme_minimal(), risk.table.y.text.col = TRUE, risk.table.y.text = FALSE)
ggsurvplot(survfit(Surv(Time, Status) ~ SNCA, data = JMS), data = JMS, risk.table = TRUE, pval = TRUE, break.time.by = 500, ggtheme = theme_minimal(), risk.table.y.text.col = TRUE, risk.table.y.text = FALSE)
ggsurvplot(survfit(Surv(Time, Status) ~ TDRKH, data = JMS), data = JMS, risk.table = TRUE, pval = TRUE, break.time.by = 500, ggtheme = theme_minimal(), risk.table.y.text.col = TRUE, risk.table.y.text = FALSE)
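Before looking at the plots, it may help to check how many samples land in each group, since the log-rank p-value and the curves are hard to interpret when a group is tiny or empty. The following is a minimal sketch in base R, not part of the original analysis: `categorize_expr` is a hypothetical helper that reproduces the same `ifelse` logic and cutoffs used above, and the `lrrk2` vector is just six LRRK2 z-scores copied from the sample rows in this post.

```r
# Hypothetical helper (an assumption, not from the original script):
# bins z-scores into Low / Mid / High using the same rule as the ifelse() calls above.
categorize_expr <- function(z, low = -1.96, high = 1.96) {
  grp <- ifelse(z >= high, "High", ifelse(z <= low, "Low", "Mid"))
  # Same level order as in the post, so 'Mid' is the reference group
  factor(grp, levels = c("Mid", "Low", "High"))
}

# Six LRRK2 z-scores copied from the sample rows shown in this post
lrrk2 <- c(-0.383829368, -1.45793859, -2.277003104,
           -2.074430113, -3.538866094, 0.774933934)

groups <- categorize_expr(lrrk2)

# Tabulate group sizes; empty levels are still listed because groups is a factor
print(table(groups))
```

On the full data, the same check would be `table(JMS$LRRK2)` after the factor conversion. If one group turns out to be very small, that is worth knowing before reading anything into the corresponding curve.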
My data looks like this as stored in LUAD_SURVIVAL.txt:
Sample Status Time LRRK2 SNCA TDRKH sex stage
TCGA.38.4625.01A 0 2973 -0.383829368 -0.219402428 0.6500694 2 Stage I
TCGA.38.4627.01A 1 1147 -1.45793859 -0.022231288 -0.5830916 2 Stage II
TCGA.38.4632.01A 1 1357 -1.316342217 -0.162383739 1.146363 1 Stage IV
TCGA.44.2655.01A 0 1324 0.046189981 -0.773276551 2.024474 2 Stage I
TCGA.44.2657.01A 0 1351 -0.975033238 -0.256492538 0.171917 2 Stage I
TCGA.44.2661.01A 0 1159 -0.78239234 0.327366323 0.1126025 2 Stage I
TCGA.44.2662.01A 0 1280 -0.758485389 -1.437559471 1.273534 1 Stage I
TCGA.44.2665.01A 0 1301 -0.999667232 -0.440681708 0.009561531 2 Stage II
TCGA.44.2668.01A 1 761 -0.644422547 -1.445774753 0.9272611 1 Stage I
TCGA.44.3396.01A 0 1130 -1.245042532 -0.395511449 -0.1383681 2 Stage III
TCGA.44.6145.01A 0 595 -0.200550603 -1.307112908 1.290406 2 Stage I
TCGA.44.6147.01A 0 845 -0.297408216 -0.358287055 0.6861114 2 Stage I
TCGA.44.6776.01A 0 2616 0.668204053 -1.045513127 0.3604824 2 Stage I
TCGA.44.6777.01A 1 987 0.774933934 0.120296574 1.095746 2 Stage I
TCGA.44.6778.01A 0 1864 0.395187563 0.209133034 0.2749578 1 Stage I
TCGA.49.4490.01A 1 385 -1.751088203 -1.655575684 0.03521947 2 Stage III
TCGA.49.4512.01A 1 905 0.332495541 -0.456119236 1.154696 2 Stage III
TCGA.49.6743.01A 0 1621 -0.641826568 -0.79901757 1.403125 2 Stage III
TCGA.49.6744.01A 0 1683 -0.403411305 0.006217842 0.300926 2 Stage II
TCGA.49.6745.01A 0 522 0.046500089 -0.621497979 0.228453 1 Stage III
TCGA.49.6761.01A 0 354 -0.857810394 -2.056883586 1.437622 2 Stage III
TCGA.50.5930.01A 1 282 -0.87400339 0.093055616 1.570507 1 Stage III
TCGA.50.5931.01A 1 434 -2.277003104 -0.995236158 0.3775715 2 Stage I
TCGA.50.5932.01A 1 1235 -0.475064449 -2.294646772 1.211618 1 Stage II
TCGA.50.5933.01A 1 2393 -0.635493519 -0.647664896 1.576859 1 Stage III
TCGA.50.5935.01A 1 653 0.144475137 -1.286987274 2.029953 2 Stage I
TCGA.50.5939.01A 1 460 -0.217287921 -0.297524684 -0.1612658 1 Stage I
TCGA.50.6595.01A 1 189 -2.074430113 -1.31726386 -1.05404 2 Stage III
TCGA.55.6968.01A 1 1293 -1.510117962 -0.110645011 1.025638 1 Stage IV
TCGA.55.6970.01A 1 464 0.493536516 -0.785224767 0.8346801 2 Stage III
TCGA.55.6971.01A 0 1400 0.208688847 -0.399653744 0.9952134 2 Stage I
TCGA.55.6972.01A 1 1632 -1.125136651 -2.586160697 1.663106 1 Stage I
TCGA.55.6975.01A 1 118 -2.092639366 -0.831491754 -0.08286432 1 Stage II
TCGA.55.6978.01A 1 176 -1.239634382 -0.579709271 -0.248721 1 Stage II
TCGA.55.6979.01A 1 237 -0.591083585 -0.795370819 0.4372139 2 Stage II
TCGA.55.6980.01A 0 2109 -0.752056711 -0.697417076 0.7107928 1 Stage I
TCGA.55.6981.01A 1 1379 -1.439636573 -1.162752821 0.7467579 2 Stage III
TCGA.55.6982.01A 1 995 -0.559143952 -0.923362752 0.2045792 2 Stage II
TCGA.55.6984.01A 1 760 -0.991059081 -2.752376513 -0.6147963 2 Stage II
TCGA.55.6985.01A 0 1233 -0.503782815 -0.359211612 0.2640623 2 Stage I
TCGA.55.6986.01A 0 3261 -0.057757299 -0.395037493 0.161682 2 Stage I
TCGA.73.4676.01A 1 281 -0.938051892 -1.009113608 1.945119 1 Stage II
TCGA.91.6828.01A 0 323 -0.246057181 -0.825642817 1.327468 1 Stage I
TCGA.91.6829.01A 1 1258 -1.300016577 0.383268015 1.390523 1 Stage I
TCGA.91.6831.01A 0 310 -1.5698444 -1.168399177 0.8284916 1 Stage I
TCGA.91.6835.01A 0 79 -0.651069358 -0.390002878 1.839444 2 Stage I
TCGA.91.6836.01A 0 417 -1.78515033 -1.364946704 3.60664 2 Stage I
TCGA.91.6847.01A 0 842 -3.538866094 -0.87741001 0.9409349 2 Stage I
TCGA.91.6849.01A 0 35 -0.202941284 0.13610892 1.131687 2 Stage III
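Note that the `stage` values contain a space (for example `Stage I`), so the file has to be delimited by something other than spaces for `read.delim` to return eight columns. The sketch below assumes the file is tab-delimited (the `read.delim` default) and rebuilds the header plus the first two rows from this post as an in-memory string; `demo_text` and `demo` are illustrative names only.

```r
# Assumption: LUAD_SURVIVAL.txt is tab-delimited. The header and two rows
# below are copied from the post, with tabs written out explicitly.
demo_text <- paste(
  "Sample\tStatus\tTime\tLRRK2\tSNCA\tTDRKH\tsex\tstage",
  "TCGA.38.4625.01A\t0\t2973\t-0.383829368\t-0.219402428\t0.6500694\t2\tStage I",
  "TCGA.38.4627.01A\t1\t1147\t-1.45793859\t-0.022231288\t-0.5830916\t2\tStage II",
  sep = "\n"
)

# read.delim() splits on tabs by default, so "Stage I" stays in one column
demo <- read.delim(text = demo_text)

# Inspect column types: Time and Status should be numeric, stage should be text
str(demo)
```

Running `str(JMS)` on the real file in the same way is a quick way to confirm that `Time`, `Status`, and the three gene columns were read as numbers before they are recoded into groups.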
The plot for one of the genes, LRRK2, is shown below: