Reading in delta Ct values using HTqPCR
6.8 years ago
ww22runner ▴ 60

Hi everyone,

I have one csv file that contains the delta Ct values for several samples (columns) and genes as row names. I have samples from different Groups, Grp A, Grp B and Grp C. My data columns look something like this. Samples in Group A and B start with baseline disease condition while Group C samples are healthy.

Gene  GrpA1_baseline  GrpA1_flare  GrpA2_baseline  GrpA2_flare  GrpB1_baseline  GrpB2_baseline  GrpC1.1_healthy  GrpC1.2_healthy  GrpC2.1_healthy  GrpC2.2_healthy

I have different conditions within Group A biological replicates (meaning samples), only baseline condition for Group B biological replicates, and I have 2 technical replicates for each sample in Group C. I was wondering how I could create a new qPCRset object using HTqPCR package in R. I understand that I could use the readCtData function but I am unsure of how to specify and tackle biological replicates and technical replicates or how to read all this in from one file.

The ultimate goal is for me to find differentially expressed genes between different groups (GrpA_flare vs GrpC_healthy) for example. As I am completely new to analyzing Ct values, any advice at all about how I could go about doing this would be greatly appreciated. Thank you.

R htqpcr • 3.7k views
6.8 years ago

Just as per cDNA microarray studies, you can specify these in a single file that also lists the file-names that are to be read. Please take a look at Section 3 (page 6) of HTqPCR - high-throughput qPCR analysis in R and Bioconductor. However, you imply that you only have a single CSV file?

If you have to set the metadata manually, then do something like this (assuming that your object is called qPCRraw):

pData(qPCRraw) <- data.frame(
  Group=c(rep("A1", 2), rep("A2", 2), "B1", "B2", rep("C1", 2), rep("C2", 2)),
  Disease=c("baseline", "flare", "baseline", "flare", rep("baseline", 2), rep("healthy", 4)))
pData(qPCRraw)

   Group  Disease
1     A1 baseline
2     A1    flare
3     A2 baseline
4     A2    flare
5     B1 baseline
6     B2 baseline
7     C1  healthy
8     C1  healthy
9     C2  healthy
10    C2  healthy

How the replicates are then handled is decided by each downstream function, and you typically have to explicitly specify the column in pData(qPCRraw) that identifies the replicates.
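If typing the metadata by hand gets tedious for many samples, it can probably be derived from the column names instead. This is only a sketch in base R, assuming the names follow the Grp<group><n>[.<technical replicate>]_<condition> pattern from the question; the variable names (samples, pheno, TechRep) are made up for illustration.

```r
# Column names as described in the question.
samples <- c("GrpA1_baseline", "GrpA1_flare", "GrpA2_baseline", "GrpA2_flare",
             "GrpB1_baseline", "GrpB2_baseline",
             "GrpC1.1_healthy", "GrpC1.2_healthy",
             "GrpC2.1_healthy", "GrpC2.2_healthy")

# Split each name into the sample part ("GrpC1.1") and the condition ("healthy").
parts     <- strsplit(samples, "_", fixed = TRUE)
sample_id <- vapply(parts, `[`, character(1), 1)
disease   <- vapply(parts, `[`, character(1), 2)

# Biological replicate: drop the "Grp" prefix and any ".<n>" technical-replicate suffix.
group <- sub("\\.[0-9]+$", "", sub("^Grp", "", sample_id))

# Technical replicate number; samples without a suffix are counted as replicate 1.
tech_rep   <- rep(1L, length(sample_id))
has_suffix <- grepl("\\.[0-9]+$", sample_id)
tech_rep[has_suffix] <- as.integer(sub("^.*\\.", "", sample_id[has_suffix]))

pheno <- data.frame(Group = group, Disease = disease, TechRep = tech_rep,
                    row.names = samples, stringsAsFactors = FALSE)
pheno
```

The resulting data frame could then be assigned with pData(qPCRraw) <- pheno, provided its rows are in the same order as the samples in the object.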


Hi Kevin, thank you for replying, that was useful! However, I cannot even seem to read in my data to create a qPCRset object. This is what my data frame of delta Ct values looks like:

  ID_REF GSM2437228 GSM2437229 GSM2437230 GSM2437231 GSM2437232 GSM2437233
1  ABCA1   2.741219   2.822219   4.275552  5.6885521   4.581885  3.4185521
2   ACP1   4.388677   2.662677   2.910011  3.7580108   2.712344  2.5180108

I have 347 genes under the column name ID_REF and 337 samples that include Groups A, B and C. I tried the following to create a qPCRset object:

> readCtData('gene_data.csv', n.data= 337, n.features = 99 , header = TRUE, column.info=list(feature=1, Ct=337))

but I see the following error:

Error in `[.data.frame`(sample, , column.info[["Ct"]]) : 
  undefined columns selected
In addition: Warning message:
In .readCtPlain(readfile = readfile, header = header, n.features = n.features,  :
  347 gene names (rows) expected, got 347

Would you happen to know what I might be doing wrong?


Yes, that is because readCtData expects a listing of qPCR output files as input, each with its Ct values in a single column, and not a data matrix with one column per sample.

You may have to create it manually, like we sometimes had to do with microarray data, such as:

expression <- read.csv('gene_data.csv', ...)
rownames(expression) <- expression[,1]
expression <- data.matrix(expression[,-1])


qPCRraw <- qPCRset(featureNames=rownames(expression), sampleNames=colnames(expression), exprs=expression)

pData(qPCRraw) <- data.frame(Group=c(rep("A1", 2), rep("A2", 2), "B1", "B2", rep("C1", 2), rep("C2", 2)), Disease=c("baseline", "flare", "baseline", "flare", rep("baseline", 2), rep("healthy", 4)))
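To make the matrix step concrete, here is a rough sketch in base R. It assumes the file is a wide table with gene IDs in the first column and one column per sample; the inline csv_text merely stands in for gene_data.csv and reuses the first few values quoted above.

```r
# Stand-in for gene_data.csv: gene IDs in the first column, one column per sample.
csv_text <- "ID_REF,GSM2437228,GSM2437229,GSM2437230
ABCA1,2.741219,2.822219,4.275552
ACP1,4.388677,2.662677,2.910011"

# check.names = FALSE keeps the sample names exactly as written in the header.
expression <- read.csv(text = csv_text, stringsAsFactors = FALSE, check.names = FALSE)

# Convert the sample columns to a numeric matrix, then attach the gene IDs.
# Setting row names on the matrix (not the data frame) also tolerates duplicate IDs.
mat <- data.matrix(expression[, -1])
rownames(mat) <- expression[, 1]

dim(mat)   # features x samples
```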

Hi Kevin,

Thank you for your reply. Please don't mind, but I would like to follow up with a few more questions.

Firstly, how can I carry out

> rownames(expression) <- expression[,1]

when I have duplicate gene names? I end up seeing this error:

Error in `row.names<-.data.frame`(`*tmp*`, value = value) : 
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘C3AR1’, ‘CD14’, ‘CD1C’, ‘CXCR2’

Also, when I try this command:

> qPCRraw <- qPCRset(featureNames=rownames(expression), sampleNames=colnames(expression), exprs=expression)

it shows the following error:

Error in qPCRset(featureNames = rownames(expression), sampleNames = colnames(expression),  : 
  could not find function "qPCRset"

Thanks again for your time.


Hello again,

Do you know why there are duplicate genes? To overcome this as a quick fix, you can try (this simply adds a number beside each gene, which will make the rownames unique):

rownames(expression) <- paste(expression[,1], 1:nrow(expression), sep=".")
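Another option for the same quick fix is base R's make.unique(), which leaves the first occurrence of a name alone and only adds a suffix to the repeats. A small sketch, with a made-up genes vector built from the names in your warning:

```r
# Hypothetical gene column containing repeated names.
genes <- c("C3AR1", "CD14", "C3AR1", "CD14", "ABCA1")

# The first occurrence is kept as-is; later repeats get ".1", ".2", ...
unique_genes <- make.unique(genes)
unique_genes

# No duplicates remain, so these are safe to use as row names.
anyDuplicated(unique_genes) == 0
```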

To create the qPCRset, try the new() function instead:

raw <- new("qPCRset", exprs=data.matrix(expression))
sampleNames(raw) <- colnames(expression)
featureNames(raw) <- rownames(expression)

For more details, take a look at pages 47 and 48 of the manual: https://www-test.ebi.ac.uk/bertone/software/HTqPCR.pdf


Hi Kevin,

The actual data contained gene names like this:

ABCA1-Hs01059118_m1
ACP1-Hs00962877_m1
ADAR-Hs00241666_m1
ADM-Hs00181605_m1
AGER-Hs00542590_m1

I am guessing that different probes map back to the same gene? In such a case, I thought it might be best to alter the gene names to something like this:

    ABCA1
    ACP1
    ADAR
    ADM
    AGER

This is how I ended up with duplicate gene names. In the manual you shared, qPCRraw also seems to have genes repeated and so I thought this was something that could be done. Please correct me if I am wrong.

If I use the original gene names with the probes and follow the steps you suggested:

>expression <- read.csv('gene_data.csv', sep =',', row.names = 1)
>raw <- new("qPCRset", exprs=data.matrix(expression))
>sampleNames(raw) <- colnames(expression)
>featureNames(raw) <- rownames(expression)

No errors show up, but once I wish to view raw, this error pops up:

> raw
An object of class "qPCRset"
Error in dimnames(x) <- dn : 
  length of 'dimnames' [1] not equal to array extent

Apologies for the additional errors!


Hmm... maybe double-check that the col and rownames are what you expect them to be?
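Something along these lines might help with the checking. It is only a sketch in base R; check_names() is a hypothetical helper written for this post, not part of HTqPCR, and the matrix is a toy example.

```r
# Hypothetical helper: compare the names against the matrix they are meant to label.
check_names <- function(mat, feature_names, sample_names) {
  c(rows_match      = length(feature_names) == nrow(mat),
    cols_match      = length(sample_names) == ncol(mat),
    features_unique = anyDuplicated(feature_names) == 0,
    all_numeric     = is.numeric(mat))
}

# Toy 3 x 2 matrix standing in for the real data.
mat <- matrix(c(2.7, 4.4, 3.1, 2.8, 2.7, 3.9), nrow = 3)

check_names(mat, c("g1", "g2", "g3"), c("s1", "s2"))  # everything lines up
check_names(mat, c("g1", "g2"), c("s1", "s2"))        # one feature name too few
```

Any FALSE in the result points at the name vector or matrix to look at first.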


Thanks Kevin, I shall do that.


If that does not work, then just take a look at the featureCategory parameter that can additionally be passed to the new() function.

In the example in the manual on page 47, they use featureCategory=as.data.frame(array("OK", c(n, m))), where n and m are the dimensions of the dataset that you are reading (number of features and number of samples).
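Adapted to your situation, that could look roughly like the sketch below. The toy expression matrix and the feature_cat name are made up, and the new() call only runs if HTqPCR is installed; I have not tested it against your data, so treat it as a starting point.

```r
# Toy delta Ct matrix: 2 features x 3 samples (illustrative values only).
expression <- matrix(c(2.74, 4.39, 2.82, 2.66, 4.28, 2.91), nrow = 2,
                     dimnames = list(c("ABCA1", "ACP1"), c("S1", "S2", "S3")))

# One "OK" entry per feature/sample pair, with the same dimensions as the data.
feature_cat <- as.data.frame(array("OK", dim(expression)))
dim(feature_cat)

# Build the object only if HTqPCR is available.
if (requireNamespace("HTqPCR", quietly = TRUE)) {
  library(HTqPCR)
  # Pass the bare matrix and assign the names afterwards, as earlier in the thread.
  raw <- new("qPCRset", exprs = unname(expression), featureCategory = feature_cat)
  sampleNames(raw)  <- colnames(expression)
  featureNames(raw) <- rownames(expression)
}
```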

As a last resort, I notice that the author of the package lurks on the Bioconductor forums, so you could post there or just contact her directly (email listed here: https://bioconductor.riken.jp/packages/3.0/bioc/html/HTqPCR.html)
