Question

Performing differential gene analysis on RT PCR experimental data

0

6.8 years ago

ww22runner ▴ 60

Hi everyone,

I have a question about a GEO dataset: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE92776. The page says that the data contain expression profiling by RT-PCR. If I download the raw, non-normalized data at the bottom of the page, would the values be gene counts, Ct values, or dCt values? Could anyone advise on how to convert the numbers into the right format for limma/DESeq, or how else I could perform differential gene analysis on them to get log2FoldChange values?

Thanks!

RNA-Seq • 2.5k views

ADD COMMENT • link updated 6.8 years ago by Devon Ryan 104k • written 6.8 years ago by ww22runner ▴ 60

Devon Ryan · Accepted Answer · 2018-02-02

3

6.8 years ago

Devon Ryan 104k

The data processing section says the following:

Batch corrected delta Ct values are reported. Ct values were reported by the AppliedBiosystems SDS software. Delta Ct values were computed using the median of endogenous control transcripts. Finally, Delta Ct values were batch corrected using a linear model.

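So, delta Ct values. You would not use these values in limma or DESeq2. You would process them as you would any other qPCR dataset: subtract the control group's delta Ct from the test group's delta Ct. The result, delta delta Ct, gives the log2 fold-change once its sign is flipped (log2FC = -ddCt, assuming delta Ct is target Ct minus reference Ct and amplification efficiency is close to 100%), because a higher Ct means lower expression.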
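As a rough illustration of that subtraction, here is a minimal Python sketch. It assumes delta Ct was computed as target Ct minus reference Ct and that amplification efficiency is close to 100%, so that log2FC = -ddCt. The function name and the numbers are hypothetical and are not taken from GSE92776.

```python
from statistics import mean


def log2_fold_change(delta_ct_case, delta_ct_control):
    """Return (ddCt, log2FC) for one gene from per-sample delta Ct values.

    Assumes delta Ct = Ct(target) - Ct(reference) and ~100% efficiency.
    """
    ddct = mean(delta_ct_case) - mean(delta_ct_control)
    # A higher Ct means less template, so the sign is flipped.
    return ddct, -ddct


# Hypothetical delta Ct values for one gene (not real data)
case = [5.0, 6.0]
control = [3.0, 4.0]

ddct, log2fc = log2_fold_change(case, control)
fold_change = 2 ** log2fc
print(ddct, log2fc, fold_change)
```

If a dataset defines delta Ct the other way round (reference minus target), the sign flip would not be needed, so it is worth checking the convention in the data processing notes.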

ADD COMMENT • link 6.8 years ago by Devon Ryan 104k

1

If delta Ct values are approximately normally distributed, I wonder why it would not be appropriate to use limma or a standard t-test.

ADD REPLY • link 6.6 years ago by krishnapashu912 ▴ 40

0

Normally you could. I'm just not sure how the batch-corrected values are distributed.
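For readers who want to see what a t-test on delta Ct values involves, here is a minimal sketch of Welch's unequal-variance t statistic using only the Python standard library. The function name and the input values are hypothetical, and the sketch assumes the delta Ct values in each group are roughly normal, which is exactly the point in doubt for the batch-corrected values.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    # Squared standard error of each group mean (sample variance / n)
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical delta Ct values for one gene in two groups (not real data)
group_a = [5.0, 6.0, 7.0]
group_b = [3.0, 4.0, 5.0]

t_stat, dof = welch_t(group_a, group_b)
print(t_stat, dof)
```

Turning the statistic into a p-value needs the t distribution; if SciPy is available, `scipy.stats.ttest_ind(group_a, group_b, equal_var=False)` performs the same test and reports the p-value.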

ADD REPLY • link 6.6 years ago by Devon Ryan 104k

0

Hi Ryan, thank you so much for your reply. I was completely confused about this. If the values reported are delta Ct values, what exactly does "computed using the median of endogenous control transcripts" mean? I am unable to understand, for instance, what the delta Ct value for, say, Gene A from a patient in Group A with the disease flare would mean. (Is it the Ct value of the gene minus that of a reference gene of some sort?)

Also, if I wanted to compute the logFC between Group A patients with flare (1 sample per patient) and the control group (2 samples per patient), how could I go about doing this? The data look something like this:

ID_REF  GrpA1 FLARE  GrpA2 FLARE  GrpC1.1 HEALTHY  GrpC1.2 HEALTHY  GrpC2.1 HEALTHY  Grp2C2.2 Healthy
ABCA1   26.73608875  27.60308875  28.08308875      0.07308875       29.80608875      28.00008875

The numbers of patients in Group C and Group A are different. For this gene, would you suggest that I average the delta Ct values for the Group A flare patients, do the same for Group C, and then take the difference to get the logFC?

I apologize if my explanation is a little messy, I have no clue as to how to look for differentially expressed genes in this format. Thank you.
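One possible way to handle the unequal number of samples per patient is sketched below. It is an illustration under stated assumptions, not a method endorsed in this thread. It averages repeated samples within each patient first, then averages across patients within each group, and finally takes the difference. It uses the ABCA1 row quoted above, assumes the last column belongs to patient C2, and assumes delta Ct is target minus reference so that log2FC = -ddCt.

```python
from collections import defaultdict
from statistics import mean

# ABCA1 row from the post: (patient, group, delta Ct).
# Treating "Grp2C2.2" as the second sample of patient C2 is an assumption.
samples = [
    ("A1", "flare", 26.73608875),
    ("A2", "flare", 27.60308875),
    ("C1", "healthy", 28.08308875),
    ("C1", "healthy", 0.07308875),
    ("C2", "healthy", 29.80608875),
    ("C2", "healthy", 28.00008875),
]

# Step 1: collect repeated samples per patient
per_patient = defaultdict(list)
for patient, group, dct in samples:
    per_patient[(patient, group)].append(dct)

# Step 2: one mean delta Ct per patient, gathered by group
group_values = defaultdict(list)
for (patient, group), values in per_patient.items():
    group_values[group].append(mean(values))

# Step 3: difference of group means, then flip the sign
ddct = mean(group_values["flare"]) - mean(group_values["healthy"])
log2fc = -ddct
print(ddct, log2fc)
```

Collapsing to one value per patient keeps the two healthy samples from the same person from being counted as independent observations. Note that the second C1 value (0.07308875) is far from the others and dominates the result here, so it would be worth checking it against the original file.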

ADD REPLY • link updated 6.8 years ago by Devon Ryan 104k • written 6.8 years ago by ww22runner ▴ 60

0

The "median of endogenous control transcripts" is used for the other Ct value, which is used to compute delta Ct. Since you're unfamiliar with qPCR, I suggest you read a blog post or two regarding how it works. That should clarify things easily enough.
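To make the role of the median concrete, here is a minimal sketch of how a delta Ct could be derived from raw Ct values for a single sample. The control gene names (`CTRL1` to `CTRL3`), the Ct values, and the function name are hypothetical; the actual control transcripts and the batch correction used for this dataset are not reproduced here.

```python
from statistics import median


def delta_ct(sample_ct, control_genes):
    """Return delta Ct per target gene for one sample.

    delta Ct = Ct(target) - median Ct of the endogenous control genes.
    """
    reference = median(sample_ct[g] for g in control_genes)
    return {
        gene: ct - reference
        for gene, ct in sample_ct.items()
        if gene not in control_genes
    }


# Hypothetical raw Ct values for one sample (not real data)
raw_ct = {"ABCA1": 27.0, "CTRL1": 20.0, "CTRL2": 21.0, "CTRL3": 25.0}
controls = {"CTRL1", "CTRL2", "CTRL3"}

dct = delta_ct(raw_ct, controls)
print(dct)
```

Using the median of several controls rather than a single reference gene makes the reference less sensitive to one control behaving unusually in a given sample.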