Multivariate logistic regression for gene expression
4.4 years ago
silsie645 ▴ 20

I am working on a project where I need to run univariate and multivariate regression analyses for 14 specific genes that affect cancer. For the univariate analysis I used:

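#packages used below (survminer attaches ggplot2 and ggpubr, which provides theme_pubr)
library(RegParallel)
library(survival)
library(biomaRt)
library(survminer)

#testing each gene with the data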
Res1<-RegParallel(data=coxdata, formula='Surv(OverallSurvival_months, Death)~[*]', FUN=function(formula,data) coxph(formula = formula, data=data, ties='breslow', singular.ok=TRUE), FUNtype='coxph', variables=colnames(coxdata)[9:ncol(coxdata)], blocksize=2000, cores=2, nestedParallel=FALSE, conflevel=95) 
Res2<-RegParallel(data=coxdata, formula='Surv(TumorFreeSurvival_months, Rezidiv)~[*]', FUN=function(formula,data) coxph(formula = formula, data=data, ties='breslow', singular.ok=TRUE), FUNtype='coxph', variables=colnames(coxdata)[9:ncol(coxdata)], blocksize=2000, cores=2, nestedParallel=FALSE, conflevel=95) 

#filtering by logrank<0.01
Res1<-Res1[order(Res1$LogRank, decreasing = FALSE),]
final1<-subset(Res1, LogRank<0.01)
probe1<-gsub('^X','',final1$Variable)

Res2<-Res2[order(Res2$LogRank, decreasing = FALSE),]
final2<-subset(Res2, LogRank<0.01)
probe2<-gsub('^X','',final2$Variable)

#annotating top hits with biomart
mart <- useMart('ENSEMBL_MART_ENSEMBL', host='useast.ensembl.org')
mart <- useDataset("hsapiens_gene_ensembl", mart)
annotLookup1 <- getBM(mart = mart, attributes = c('affy_hg_u133a','ensembl_gene_id','gene_biotype','external_gene_name'), filters = 'affy_hg_u133a', values = probe1, uniqueRows = TRUE)
annotLookup2 <- getBM(mart = mart, attributes = c('affy_hg_u133a','ensembl_gene_id','gene_biotype','external_gene_name'), filters = 'affy_hg_u133a', values = probe2, uniqueRows = TRUE)

#extract OS data for downstream analysis 
survplotdata1<-coxdata[,c('OverallSurvival_months','Death','X205027_s_at')]
colnames(survplotdata1)<-c('OverallSurvival_months','Death','TPL2')

#set Z-scale cut-offs for high and low expression 
highExpr<- 1.0
lowExpr<- -1.0
survplotdata1$TPL2<-ifelse(survplotdata1$TPL2 >= highExpr, 'High',ifelse(survplotdata1$TPL2<= lowExpr, 'Low', 'Mid'))

#relevelling factors to have mid as ref level 
survplotdata1$TPL2 <- factor(survplotdata1$TPL2,levels = c('Mid', 'Low', 'High'))
ggsurvplot(survfit(Surv(OverallSurvival_months,Death)~TPL2,data=survplotdata1),data=survplotdata1,risk.table=TRUE,pval=TRUE,ggtheme=theme_pubr(), risk.table.y.text.col=TRUE,risk.table.y.text=FALSE,xlab='Time (months)')

I realised that only one of my plots had a p-value < 0.05. Can I still perform the multivariate logistic regression, and if so, how do I do it?
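To make the question concrete, here is a minimal sketch of what a multivariable model could look like. It is not a recommendation on whether the analysis is appropriate. It assumes simulated data in place of coxdata, and the gene columns geneA, geneB and geneC are hypothetical stand-ins for the real probe columns. Because the outcome above is time-to-event, the sketch fits a multivariable Cox model with coxph from the survival package, and alongside it a plain logistic regression with glm, which uses only the binary Death column and ignores follow-up time.

```r
library(survival)

set.seed(1)
n <- 200

# Simulated stand-in for coxdata: outcome column names follow the post,
# the values and the gene columns (geneA..geneC) are made up for illustration
simdata <- data.frame(
  OverallSurvival_months = rexp(n, rate = 0.05),
  Death = rbinom(n, 1, 0.6),
  geneA = rnorm(n),
  geneB = rnorm(n),
  geneC = rnorm(n)
)

genes <- c("geneA", "geneB", "geneC")

# Multivariable Cox model: all genes enter the same model as covariates
cox_formula <- as.formula(
  paste("Surv(OverallSurvival_months, Death) ~", paste(genes, collapse = " + "))
)
cox_fit <- coxph(cox_formula, data = simdata, ties = "breslow")

# Hazard ratios with 95% Wald confidence intervals
cox_table <- cbind(HR = exp(coef(cox_fit)), exp(confint(cox_fit)))

# Multivariable logistic regression: binary outcome only, no survival time
logit_formula <- reformulate(genes, response = "Death")
logit_fit <- glm(logit_formula, data = simdata, family = binomial)

# Odds ratios with 95% Wald confidence intervals (includes the intercept row)
logit_table <- cbind(OR = exp(coef(logit_fit)), exp(confint.default(logit_fit)))

print(cox_table)
print(logit_table)
```

With the real data, genes would instead hold the probe column names of interest (for example the entries of final1$Variable), and simdata would be replaced by coxdata.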

r microarray • 1.1k views