WGCNA with time series data
10.6 years ago
julia ▴ 20

I have been going through a set of tutorials on WGCNA for building weighted gene co-expression networks. One tutorial uses yeast expression data collected at 44 time points across cell cycles. To create the adjacency matrix, a Pearson correlation is calculated for each pair of genes from their expression over time. As I understand it, however, a Pearson correlation is not appropriate for time series data, because observations at successive time points are correlated with each other rather than independent. Does anyone have an explanation for why a Pearson correlation would be appropriate for the yeast time series data? I would like to use WGCNA to analyze time series data in which expression values are obtained at multiple time points in the same patient, and I need to be able to justify this. Thanks for your thoughts on the matter.

RNA-Seq R gene next-gen • 8.4k views

Hi Julia, I have a similar question about my time-course data. My thought is that using DayX - Day0 values instead of the raw DayX values would help remove variation due to each individual's baseline (Day0). Does that make sense? How did you handle this in your study? Thanks.
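
To make the DayX - Day0 idea concrete, here is a minimal sketch for a single gene measured in several individuals. The function name `subtract_baseline`, the patient labels, and the numbers are all hypothetical and only illustrate the arithmetic; this is not part of WGCNA or of any tutorial.

```python
def subtract_baseline(series_by_patient, baseline_index=0):
    """Return each patient's values minus that patient's own baseline value.

    The baseline time point itself is dropped, since it would always be 0.
    """
    adjusted = {}
    for patient, values in series_by_patient.items():
        baseline = values[baseline_index]
        adjusted[patient] = [
            value - baseline
            for i, value in enumerate(values)
            if i != baseline_index
        ]
    return adjusted


# Hypothetical expression of one gene at Day0, Day1, Day2 (made-up numbers).
expression = {
    "patient1": [5.0, 6.5, 8.0],
    "patient2": [9.0, 9.5, 12.0],
}

changes = subtract_baseline(expression)
print(changes)
```

The two patients start from different Day0 levels, but after subtraction only the change from each patient's own baseline remains.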

10.6 years ago
Ying W ★ 4.3k

I am assuming that you are referring to Steve Horvath's yeast tutorial. In that tutorial, Pearson's correlation is computed between the module eigengenes (datME), not directly between gene expression profiles over time. You obtain the correlations between the module eigengenes and cluster based on those.


Thank you for responding to my question, Ying. Yes, I was referring to Steve Horvath's Yeast tutorial. Very early in the tutorial, prior to the identification of eigengenes, an adjacency matrix is created using the WGCNA adjacency function (http://www.inside-r.org/packages/cran/WGCNA/docs/adjacency). This function uses a Pearson correlation to create the matrix and transforms r to a connection strength measure using a power function. In the case of the Yeast data, the expression values across time are correlated for every pair of genes in the dataset. I am interested to know if there is a reason why a Pearson correlation would be justified in this case. I would really like to use this analysis with time series data.
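
For readers following along, the step described above (a Pearson correlation for every pair of genes, then a power transform of r) can be sketched as below. This is a plain-Python illustration of an unsigned adjacency of the form |r|^power, not WGCNA's own code. The names `pearson` and `unsigned_adjacency`, the exponent of 6, and the expression values are assumptions made for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def unsigned_adjacency(expr, power=6):
    """Map each gene pair to |r| ** power, where r is taken across time points."""
    genes = list(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** power
        for g in genes
        for h in genes
    }


# Made-up expression values at five time points (illustration only).
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.0, 4.0, 6.0, 8.0, 10.0],
    "geneC": [5.0, 3.0, 4.0, 1.0, 2.0],
}

adj = unsigned_adjacency(expr)
print(adj[("geneA", "geneB")], adj[("geneA", "geneC")])
```

Note that the calculation treats the time points as an unordered set of samples: shuffling the time points identically for all genes leaves every adjacency unchanged. That is the crux of the question about temporal dependence.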


Could you please provide the link to this tutorial? I cannot find it.
