Question

How to plot ROC curve using qRT-PCR data?

4.0 years ago

viveknair.veki • 0

I want to plot a ROC curve using the fold change values from qRT-PCR data. The samples were first tested using a gold standard assay, and those results will be used as the actual values, whereas the gene expression values will be used as the "predicted values". Can someone guide me on how to build the data set and run the ROC analysis?

gene • 2.5k views

updated 24 months ago by Mensur Dlakic ★ 28k • written 4.0 years ago by viveknair.veki • 0

Fold change can take an infinite number of values; it is not the kind of "YES or NO" question that ROC analysis is made for. ROC curves measure the performance of binary classifiers, so in your case it would be useless unless you simply want to classify your genes as differentially expressed or not.

Why don't you just measure a correlation?

4.0 years ago by Carlo Yague 8.9k

Yes, I want to check the discriminatory power of the genes to differentiate between the positive group and the negative group, hence the ROC analysis.

3.9 years ago by viveknair.veki • 0

score 3 · Answer 1 · 2020-12-10

In the simplest terms, a ROC curve measures the quality of a binary classifier based on its sorted predictions. The predictions can be on any scale, which means that your data can be used to make a ROC curve as is, or it can be scaled to the [0,1] range, which is where most binary classifiers output their predictions.

I will post some short Python code below that shows what I mean. Let's say that your qRT-PCR fold changes are in the range [0,25] and that you have 25 gold standard measurements. I will simulate that by creating 25 random numbers in the [0,25] range. These are the numbers I got (rounded to 2 decimal places); yours will be different each time you run the code.

[12.42 16.92 24.03 19.01 18.19 11.82 5.78 4.09 24.32 15.09 21.28 18.16 11.03 14.01 7.8 13.1 20.78 21.51 7.25 19.3 15.62 9.59 4.8  23.62 21.56]

Let's say that for each of those 25 samples you have a gold standard label of 0 or 1, where 1 means the sample is truly positive and 0 means it is truly negative. Again, I will make 25 random numbers, which for my experiment look like this:

[0 0 1 1 0 0 0 0 1 0 1 1 0 1 1 0 1 0 1 1 0 0 1 1 0]

When you calculate the ROC-AUC score for these two arrays of numbers, it comes out as 0.6730769230769231. Now, you can scale the fold change numbers to be in [0,1] range if you wish, because binary classifiers would typically give you numbers like that. Scaled fold change looked like this in my case:

[0.41 0.63 0.99 0.74 0.7 0.38 0.08 0. 1. 0.54 0.85 0.7 0.34 0.49 0.18 0.45 0.82 0.86 0.16 0.75 0.57 0.27 0.03 0.97 0.86]

As I said before, the scale of the measurements/predictions doesn't matter: scaling keeps the values in the same order, so the ROC-AUC score for the scaled array of numbers vs. the original classes still comes out as 0.6730769230769231.
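The point that only the ordering matters can be checked without any ROC library. The sketch below is my own illustration, not part of the original answer: it uses the two arrays printed above (fold changes as rounded in the post) and a hypothetical helper, `pairwise_auc`, that computes the AUC as the fraction of (positive, negative) sample pairs in which the positive sample has the higher value. The log2 transform is only an assumed example of another order-preserving rescaling.

```python
import numpy as np

# Values copied from the post above (fold changes rounded to 2 decimals).
fold = np.array([12.42, 16.92, 24.03, 19.01, 18.19, 11.82, 5.78, 4.09, 24.32,
                 15.09, 21.28, 18.16, 11.03, 14.01, 7.8, 13.1, 20.78, 21.51,
                 7.25, 19.3, 15.62, 9.59, 4.8, 23.62, 21.56])
labels = np.array([0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1,
                   1, 0, 0, 1, 1, 0])


def pairwise_auc(y_true, scores):
    """Hypothetical helper: fraction of (positive, negative) pairs in which
    the positive sample scores higher; ties count as half a win."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    # Compare every positive score against every negative score.
    diff = pos[:, None] - neg[None, :]
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return wins / (len(pos) * len(neg))


# Same data on three scales: raw, min-max scaled to [0, 1], and log2.
auc_raw = pairwise_auc(labels, fold)
auc_scaled = pairwise_auc(labels, (fold - fold.min()) / (fold.max() - fold.min()))
auc_log = pairwise_auc(labels, np.log2(fold))
print(auc_raw, auc_scaled, auc_log)
```

All three values are the same, about 0.673, which agrees with the score quoted in the post: 105 of the 156 positive/negative pairs are ordered correctly on every scale.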

The only thing you need to do is make a column of fold change numbers, and next to each of those numbers enter either 0 or 1, which you should know because this is a gold standard set. That's all the data you need. Then you need something that can compute the points of a ROC curve, for example the roc_curve function in sklearn, plus any plotting library to draw them.

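import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import MinMaxScaler

# Simulate 25 fold change measurements in the [0, 25] range
random_fold = np.random.uniform(low=0, high=25, size=25)
print(np.around(random_fold, 2))

# Simulate 25 gold standard labels (0 = negative, 1 = positive)
random_class = np.random.randint(low=0, high=2, size=25)
print(random_class)

# ROC-AUC on the raw fold changes
print(roc_auc_score(random_class, random_fold))

# Scale the fold changes to [0, 1]; MinMaxScaler expects a 2D array,
# so reshape to one column and flatten back to 1D afterwards
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_random_fold = scaler.fit_transform(random_fold.reshape(-1, 1)).flatten()
print(np.around(scaled_random_fold, 2))

# ROC-AUC on the scaled values is the same because the ordering is unchanged
print(roc_auc_score(random_class, scaled_random_fold))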
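The code above only prints the ROC-AUC score. To actually draw the curve from a two-column data set (fold change, gold standard label), something like the following sketch should work. It is an illustration under assumptions, not part of the original answer: it reuses the example numbers from the post, plots with matplotlib (my choice of plotting library), and writes to an assumed output file name, `roc_curve.png`.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: draw to a file, no display needed
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, roc_auc_score

# Column 1: fold changes (example values from the post, rounded to 2 decimals).
fold_change = np.array([12.42, 16.92, 24.03, 19.01, 18.19, 11.82, 5.78, 4.09,
                        24.32, 15.09, 21.28, 18.16, 11.03, 14.01, 7.8, 13.1,
                        20.78, 21.51, 7.25, 19.3, 15.62, 9.59, 4.8, 23.62,
                        21.56])
# Column 2: gold standard labels (1 = positive, 0 = negative).
gold = np.array([0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1,
                 1, 0, 0, 1, 1, 0])

# The data set is just these two columns side by side, one row per sample.
data = np.column_stack([fold_change, gold])

# roc_curve returns the false/true positive rates at each fold change cutoff.
fpr, tpr, thresholds = roc_curve(data[:, 1].astype(int), data[:, 0])
auc = roc_auc_score(data[:, 1].astype(int), data[:, 0])

fig, ax = plt.subplots()
ax.plot(fpr, tpr, marker="o", label=f"fold change (AUC = {auc:.3f})")
ax.plot([0, 1], [0, 1], linestyle="--", color="gray", label="chance")
ax.set_xlabel("False positive rate")
ax.set_ylabel("True positive rate")
ax.legend(loc="lower right")
fig.savefig("roc_curve.png", dpi=150)  # assumed output file name
```

To use it with real data, replace the two arrays with your own fold changes and gold standard results, keeping them in the same sample order.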