I'm trying to generate ROC curves for different RNA-seq normalisations, using qRT-PCR data for comparison. I want to define true positives as genes that the RNA-seq data call significantly differentially expressed (FDR < 0.05) with a log fold change (logFC) in the same direction as in the qRT-PCR data.
I am using the following function to generate my ROC curves (from this excellent single-cell RNA-seq tutorial):
DE_Quality_AUC <- function(pVals) {
  # Only use RNA-seq genes which are also in the qRT-PCR dataset
  pVals <- pVals[names(pVals) %in% GroundTruth$DE |
                 names(pVals) %in% GroundTruth$notDE]
  # Create vector of classifications (1 = not DE by default)
  truth <- rep(1, times = length(pVals))
  # Label genes that are DE in the qRT-PCR data as 0. ROCR treats higher
  # scores as evidence for the higher label, so large p-values map to
  # "not DE" (1) and small p-values to "DE" (0)
  truth[names(pVals) %in% GroundTruth$DE] <- 0
  # Generate ROC/AUC values based on FDR from differential expression analysis
  pred <- ROCR::prediction(pVals, truth)
  perf <- ROCR::performance(pred, "tpr", "fpr")
  # Plot ROC curve
  ROCR::plot(perf)
  # Return AUC value
  aucObj <- ROCR::performance(pred, "auc")
  return(aucObj@y.values[[1]])
}
However, this does not take into account whether the fold change in the RNA-seq and qRT-PCR data is in the same direction, and I'm struggling to come up with a way to incorporate this additional criterion. Any ideas?
Maybe you could multiply p.value by sign(log fold-change)?
Presuming that you actually have the fold-change values from both RNA-seq and RT-qPCR, couldn't you just work further with your 'truth' vector and only keep a gene classified as DE where the fold-change direction is also the same?
Kevin
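To make the 'truth' vector suggestion concrete, here is a minimal sketch in base R. It assumes you have named vectors of log fold changes for both platforms; rnaseq_logFC, qpcr_logFC, make_truth and the toy gene names and values are all hypothetical and only there for illustration. It keeps the same coding as the function in the question (0 = DE, 1 = not DE), and a gene is only labelled DE if it is in GroundTruth$DE and the two logFC signs agree.

```r
# Hypothetical toy data: gene names and values are made up for illustration
GroundTruth <- list(DE = c("geneA", "geneB", "geneC"),
                    notDE = c("geneD", "geneE"))
qpcr_logFC   <- c(geneA = 1.2, geneB = -0.8, geneC = 2.0,
                  geneD = 0.1, geneE = -0.05)
rnaseq_logFC <- c(geneA = 0.9, geneB = 0.6, geneC = 1.5,
                  geneD = -0.2, geneE = 0.1)
pVals        <- c(geneA = 0.001, geneB = 0.002, geneC = 0.01,
                  geneD = 0.6, geneE = 0.8, geneF = 0.03)

# make_truth() is a hypothetical helper, not part of ROCR or the tutorial
make_truth <- function(pVals, rnaseq_logFC, qpcr_logFC, GroundTruth) {
  # Only use RNA-seq genes which are also in the qRT-PCR dataset
  keep  <- names(pVals) %in% c(GroundTruth$DE, GroundTruth$notDE)
  pVals <- pVals[keep]
  genes <- names(pVals)

  # Same coding as in the question: 1 = not DE, 0 = DE
  truth <- rep(1, times = length(pVals))
  names(truth) <- genes

  # TRUE where both platforms report a fold change with the same sign
  same_direction <- sign(rnaseq_logFC[genes]) == sign(qpcr_logFC[genes])
  is_de <- genes %in% GroundTruth$DE

  # which() drops NAs, so genes with a missing logFC keep the default label
  truth[which(is_de & same_direction)] <- 0

  list(pVals = pVals, truth = truth)
}

res <- make_truth(pVals, rnaseq_logFC, qpcr_logFC, GroundTruth)
print(res$truth)
```

res$pVals and res$truth can then go into ROCR::prediction() exactly as in the original function. One thing to keep in mind with this approach: the labels now depend on the RNA-seq logFC, so each normalisation gets its own 'truth' vector, and a gene like geneB (DE by qRT-PCR, but with the opposite sign in RNA-seq) counts against a method that gives it a small p-value.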