Question

drug sensitivity prediction

2.5 years ago

luffy ▴ 130

Dear all,

I have been working with oncoPredict. I was able to reproduce the results of calcPhenotype() using the example data, but I am a bit confused about the input data. What are the columns and rows in each of the datasets used:

trainingExprData, trainingPtype and testExprData?

I have data downloaded from GDSC, and I have my own expression data (DEGs) obtained from DESeq2. How do I prepare the input data for calcPhenotype()?

Thank you for your time

Any help would be appreciated

deseq2 oncopredict R drug • 1.6k views


score 6 · Accepted Answer · 2023-06-28

Once you download the GDSC data, you will have a file named DataFiles.zip.

Extract the zip file.
Convert your DESeq2 expression data to log scale (if it is not already) and save it as your_data_log_transformed.txt.
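For the log-transformation step, here is a minimal sketch in base R. It assumes you already have a matrix of DESeq2 normalized counts with genes in rows and samples in columns; the toy matrix `norm_counts`, the pseudocount of 1 and the use of tempdir() are illustrative assumptions, not something oncoPredict requires. With a real DESeqDataSet, the matrix would typically come from counts(dds, normalized = TRUE), or you could use the already log2-scale output of vst() or rlog() instead.

```r
# Toy stand-in for DESeq2 normalized counts (genes in rows, samples in columns).
# With a real DESeqDataSet `dds` this would typically be counts(dds, normalized = TRUE).
set.seed(1)
norm_counts <- matrix(rnbinom(12, mu = 200, size = 5), nrow = 4,
                      dimnames = list(c("TP53", "EGFR", "KRAS", "MYC"),
                                      c("sample1", "sample2", "sample3")))

# log2 transform with a pseudocount of 1 so that zero counts stay finite.
log_expr <- log2(norm_counts + 1)

# Write a tab-delimited file; gene names become the first column.
# tempdir() is used only to keep this sketch self-contained.
out_file <- file.path(tempdir(), "your_data_log_transformed.txt")
write.table(log_expr, file = out_file, sep = "\t", quote = FALSE)

# Read it back the same way as in the answer: genes as rownames, samples as colnames.
testExprData <- as.matrix(read.table(out_file, header = TRUE, row.names = 1))
print(dim(testExprData))
```

In practice you would write the file into your working directory rather than tempdir(), so that the read.table() call in the script finds it.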

then,

library(oncoPredict)
setwd("DataFiles/")

#Read GDSC2 response data. rownames() are samples, colnames() are drugs. 
trainingPtype = readRDS(file = "Training Data/GDSC2_Res.rds")
trainingPtype<-exp(trainingPtype)

#Read GDSC2 expression data (RMA normalized and log transformed). rownames() are genes, colnames() are samples.
trainingExprData=readRDS(file='Training Data/GDSC2_Expr (RMA Normalized and Log Transformed).rds')

#Read testing data as a matrix with rownames() as genes and colnames() as samples.
testExprData=as.matrix(read.table('your_data_log_transformed.txt', header=TRUE, row.names=1))

#Additional parameters. 
batchCorrect<-"eb"
powerTransformPhenotype<-TRUE
removeLowVaryingGenes<-0.2
removeLowVaringGenesFrom<-"homogenizeData"
minNumSamples=10
selection<- 1
printOutput=TRUE
pcr=FALSE
report_pc=FALSE
cc=FALSE
rsq=FALSE
percent=80

#Run the calcPhenotype() function using the parameters you specified above.
calcPhenotype(trainingExprData=trainingExprData,
              trainingPtype=trainingPtype,
              testExprData=testExprData,
              batchCorrect=batchCorrect,
              powerTransformPhenotype=powerTransformPhenotype,
              removeLowVaryingGenes=removeLowVaryingGenes,
              minNumSamples=minNumSamples,
              selection=selection,
              printOutput=printOutput,
              pcr=pcr,
              removeLowVaringGenesFrom=removeLowVaringGenesFrom,
              report_pc=report_pc,
              cc=cc,
              percent=percent,
              rsq=rsq)

All of these steps are well documented in calcPhenotype.
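To address the rows/columns question directly, the sketch below checks the orientation of the three inputs before calling calcPhenotype(). It assumes the layout described in the comments above: genes as rownames() and samples as colnames() for both expression matrices, and samples as rownames() and drugs as colnames() for trainingPtype. The toy matrices and the helper `check_inputs()` are hypothetical and only illustrate the idea; they are not part of oncoPredict.

```r
# Toy matrices that mimic the expected orientation (all names are made up).
set.seed(1)
trainingExprData <- matrix(rnorm(12), nrow = 4,
                           dimnames = list(c("TP53", "EGFR", "KRAS", "MYC"),
                                           c("cellA", "cellB", "cellC")))
trainingPtype <- matrix(runif(6), nrow = 3,
                        dimnames = list(c("cellA", "cellB", "cellC"),
                                        c("drug1", "drug2")))
testExprData <- matrix(rnorm(6), nrow = 3,
                       dimnames = list(c("EGFR", "MYC", "BRCA1"),
                                       c("patient1", "patient2")))

# Hypothetical helper: count the identifiers shared between the inputs.
check_inputs <- function(trainingExprData, trainingPtype, testExprData) {
  # Genes must use the same identifiers in the training and test matrices.
  shared_genes <- intersect(rownames(trainingExprData), rownames(testExprData))
  # Training samples must appear both as expression columns and phenotype rows.
  shared_samples <- intersect(colnames(trainingExprData), rownames(trainingPtype))
  list(n_shared_genes = length(shared_genes),
       n_shared_samples = length(shared_samples))
}

res <- check_inputs(trainingExprData, trainingPtype, testExprData)
print(res)
```

If either count is zero or very small, the usual causes are a transposed matrix or mismatched identifiers (for example Ensembl IDs in the test data versus gene symbols in the training data). Note also that a test matrix restricted to DEGs will share fewer genes with the training data than the full expression matrix would.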