In simplest terms, a ROC curve measures the quality of a binary classifier based on its sorted predictions. The predictions can be on any scale, which means that your data can be used to make a ROC curve as is, or it can be scaled to the [0,1] range, which is where most binary classifiers output their predictions.
I will post some short Python code below to show what I mean. Let's say that your qRT-PCR fold changes are in the range [0,25] and that you have 25 gold-standard measurements. I will simulate that by drawing 25 random numbers from the [0,25] range. These are the numbers I got (rounded to 2 decimal places), though yours will be different each time you run the code.
[12.42 16.92 24.03 19.01 18.19 11.82 5.78 4.09 24.32 15.09 21.28 18.16 11.03 14.01 7.8 13.1 20.78 21.51 7.25 19.3 15.62 9.59 4.8 23.62 21.56]
Let's say that each of those 25 measurements has a gold-standard label of 0 or 1, where 1 marks a known positive and 0 a known negative. Again, I will generate 25 random labels, which in my run look like this:
[0 0 1 1 0 0 0 0 1 0 1 1 0 1 1 0 1 0 1 1 0 0 1 1 0]
When you calculate the ROC-AUC score for these two arrays of numbers, it comes out as 0.6730769230769231. Now, you can scale the fold change numbers to the [0,1] range if you wish, because binary classifiers typically output numbers like that. The scaled fold changes looked like this in my case:
[0.41 0.63 0.99 0.74 0.7 0.38 0.08 0. 1. 0.54 0.85 0.7 0.34 0.49 0.18 0.45 0.82 0.86 0.16 0.75 0.57 0.27 0.03 0.97 0.86]
As I said before, the scale of the measurements/predictions doesn't matter: ROC-AUC depends only on how the values are ranked, and min-max scaling preserves that order. The ROC-AUC score for the scaled array vs. the original classes therefore still comes out as 0.6730769230769231.
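The scale-invariance claim can be checked by hand with the numbers posted above. This is a minimal sketch using only the standard library, not sklearn's implementation: the helper `pairwise_auc` is a name made up for illustration, and it relies on the fact that ROC-AUC equals the fraction of (positive, negative) pairs in which the positive has the higher score. The log2 transform is an extra order-preserving example that was not part of the run above.

```python
import math

# Fold changes and gold-standard labels copied from the example above.
fold = [12.42, 16.92, 24.03, 19.01, 18.19, 11.82, 5.78, 4.09, 24.32,
        15.09, 21.28, 18.16, 11.03, 14.01, 7.8, 13.1, 20.78, 21.51,
        7.25, 19.3, 15.62, 9.59, 4.8, 23.62, 21.56]
labels = [0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1,
          0, 0, 1, 1, 0]


def pairwise_auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


lo, hi = min(fold), max(fold)
scaled = [(x - lo) / (hi - lo) for x in fold]  # min-max scaling to [0, 1]
logged = [math.log2(x) for x in fold]          # another order-preserving transform

auc_raw = pairwise_auc(labels, fold)
auc_scaled = pairwise_auc(labels, scaled)
auc_logged = pairwise_auc(labels, logged)
print(auc_raw, auc_scaled, auc_logged)  # all three agree, about 0.673
```

With 12 positives and 13 negatives there are 156 pairs, and 105 of them are ordered correctly, which gives 105/156 ≈ 0.673 no matter which of the three scales is used.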
The only thing you need to do is make a column of fold change numbers and, next to each of them, enter either 0 or 1, which you should know because this is a gold-standard set. That's all the data you need. Then you need something that can plot a ROC curve, for example the roc_curve function in sklearn.
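import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import MinMaxScaler
# Simulate 25 fold-change measurements drawn uniformly from [0, 25)
random_fold = np.random.uniform(low=0, high=25, size=25)
print(np.around(random_fold, 2))
# Simulate the gold-standard labels: 0 = negative, 1 = positive
random_class = np.random.randint(low=0, high=2, size=25)
print(random_class)
# ROC-AUC on the raw fold changes
print(roc_auc_score(random_class, random_fold))
# Rescale to [0, 1]; MinMaxScaler expects a 2-D array, so reshape, then flatten back to 1-D
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_random_fold = scaler.fit_transform(random_fold.reshape(-1, 1)).ravel()
print(np.around(scaled_random_fold, 2))
# ROC-AUC on the scaled values comes out the same
print(roc_auc_score(random_class, scaled_random_fold))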
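To make the "sorted predictions" idea concrete, here is a rough sketch of how the points of a ROC curve can be built from the two columns: sort by fold change from highest to lowest, then lower the threshold one sample at a time and record the false positive rate and true positive rate. The helpers `roc_points` and `trapezoid_area` are hypothetical names written for this illustration; sklearn's roc_curve may differ in details such as which intermediate points it keeps. The data are the numbers posted above.

```python
fold = [12.42, 16.92, 24.03, 19.01, 18.19, 11.82, 5.78, 4.09, 24.32,
        15.09, 21.28, 18.16, 11.03, 14.01, 7.8, 13.1, 20.78, 21.51,
        7.25, 19.3, 15.62, 9.59, 4.8, 23.62, 21.56]
labels = [0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1,
          0, 0, 1, 1, 0]


def roc_points(y_true, scores):
    """Return (fpr, tpr) lists from sweeping the threshold over the sorted scores."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    # Indices of the samples, ordered from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for rank, i in enumerate(order):
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a group of tied scores.
        is_last = rank == len(order) - 1
        if is_last or scores[order[rank + 1]] != scores[i]:
            fpr.append(fp / n_neg)
            tpr.append(tp / n_pos)
    return fpr, tpr


def trapezoid_area(x, y):
    """Area under the piecewise-linear curve through the (x, y) points."""
    return sum((x[k] - x[k - 1]) * (y[k] + y[k - 1]) / 2
               for k in range(1, len(x)))


fpr, tpr = roc_points(labels, fold)
area = trapezoid_area(fpr, tpr)
print(list(zip(fpr, tpr))[:5])  # first few points of the curve
print(area)                     # about 0.673, matching the ROC-AUC above
```

Plotting `tpr` against `fpr` with any plotting library gives the curve itself, and the area under it is the ROC-AUC score.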
Fold change can take infinitely many values; it is not the kind of "yes or no" question that ROC analysis is made for. ROC curves measure the performance of binary classifiers, so in your case it would be useless unless you simply want to classify your genes as differentially expressed or not.
Why don't you just measure a correlation?
Yes, I want to check the discriminatory power of the genes to differentiate between the positive group and the negative group. Hence the ROC.