Hi,
I have a control cell line sampled from day 1 to day 26 and a lead (Pb) treatment sampled from day 1 to day 26. The experiment aims to assess the impact of lead (Pb) on the developmental process. I need to check whether there are groups of developmental genes that cluster differently in the lead (Pb) experiment. For instance, some genes may be differentially expressed immediately, while others respond only at later time points. First I thought about WGCNA (figure: https://ibb.co/j9O5YS), where the genes in the modules that are significant for days 1-3 make sense to me. Then I thought about SOM clustering, but the clusters don't show any up- or down-regulation across the time points.
Any help please?
It depends on which definition of cluster you're interested in. For time series, I would compute a distance matrix based on dynamic time warping (see, for example, the R package dtw; be sure to read the FAQ further down the page) and use this matrix as input for a clustering algorithm.
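To make the idea concrete, here is a minimal sketch of a DTW distance matrix. It is written in plain Python rather than with the R dtw package, so none of it is that package's API. The gene names and expression values are hypothetical toy data, and the local cost (absolute difference) is an assumption.

```python
def dtw_distance(x, y):
    """Classic dynamic time warping distance with |a - b| as local cost."""
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i points of x
    # with the first j points of y
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # repeat y[j-1]
                                     cost[i][j - 1],      # repeat x[i-1]
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def dtw_matrix(profiles):
    """Pairwise DTW distances between all profiles (dict: name -> series)."""
    names = list(profiles)
    k = len(names)
    dist = [[0.0] * k for _ in range(k)]
    for a in range(k):
        for b in range(a + 1, k):
            d = dtw_distance(profiles[names[a]], profiles[names[b]])
            dist[a][b] = dist[b][a] = d
    return names, dist


# Hypothetical toy data: one series per gene, one value per time point.
profiles = {
    "early_peak": [0, 1, 3, 1, 0, 0, 0, 0],
    "late_peak":  [0, 0, 0, 0, 1, 3, 1, 0],
    "decline":    [3, 3, 2, 2, 1, 1, 0, 0],
}
names, dist = dtw_matrix(profiles)

# Point-by-point comparison, for contrast with DTW.
pointwise = sum(abs(a - b) for a, b in
                zip(profiles["early_peak"], profiles["late_peak"]))

for name, row in zip(names, dist):
    print(name, row)
print("pointwise early vs late:", pointwise)
```

In this toy case the two peaked genes have the same shape shifted in time, so DTW treats them as identical while a point-by-point comparison does not. The resulting matrix is what would then be handed to a clustering algorithm that accepts precomputed distances.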
Thanks a lot. For example, in picture A (cluster 28, with 103 genes), genes are up-regulated toward week 9, and in cluster 3 (203 genes) they are down-regulated toward the end of development.
https://ibb.co/h7N2yS
What I had in mind was for you to compute a distance matrix using dtw for each condition separately, then use these matrices for simultaneous clustering, for example with a tensor factorization method, as I wrote in a recent answer which you've seen.
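One way to read this suggestion, sketched in plain Python under stated assumptions: each condition yields one gene-by-gene DTW matrix computed from that condition's time courses, and the two matrices are stacked into a three-way array (condition x gene x gene) that a tensor factorization method could take as input. The gene names, values and helper names below are hypothetical, and the factorization step itself is not shown.

```python
def dtw_distance(x, y):
    """Classic DTW distance with |a - b| as local cost (assumed choice)."""
    n, m = len(x), len(y)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


def condition_matrix(series_by_gene, genes):
    """Gene-by-gene DTW matrix for one condition, rows/columns in `genes` order."""
    return [[dtw_distance(series_by_gene[g], series_by_gene[h]) for h in genes]
            for g in genes]


# Hypothetical toy data: same genes and time points in both conditions.
genes = ["geneA", "geneB", "geneC"]
control = {
    "geneA": [0, 1, 2, 3, 2, 1],
    "geneB": [0, 1, 2, 3, 2, 1],
    "geneC": [3, 2, 1, 0, 0, 0],
}
lead = {
    "geneA": [0, 1, 2, 3, 2, 1],
    "geneB": [0, 0, 0, 0, 0, 1],  # geneB behaves differently under treatment
    "geneC": [3, 2, 1, 0, 0, 0],
}

# Two matrices, one per condition, stacked along a new first axis.
tensor = [condition_matrix(control, genes), condition_matrix(lead, genes)]
shape = (len(tensor), len(tensor[0]), len(tensor[0][0]))
print("tensor shape (condition, gene, gene):", shape)
print("geneA vs geneB, control:", tensor[0][0][1])
print("geneA vs geneB, lead:   ", tensor[1][0][1])
```

Under this reading there are two distance matrices in total, not one per day: the time axis is consumed inside each DTW comparison, and the conditions form the third axis of the array.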
Thanks a lot for sharing your idea. I have 52 samples (without replicates): control_day1 vs lead_day1 through control_day26 vs lead_day26. So with dtw, must I calculate 26 distance matrices, one for each control_day vs lead_day pair?
Thanks again for helping people figure things out.
Sorry, these are my control and treatment matrices
I googled a lot, but it is too confusing. Are these the dtw distance matrices you suggested for tensor factorization?
Don't you get an error? dist() is just the standard distance function in R; it doesn't know anything about dtw. Start by reading the dtw doc. What you need to do is apply the dtw() function to each pair of items for which you want the dtw distance and extract the distance from the resulting data structure, e.g.
You may need to adjust to your needs, maybe use a symmetric version or the normalized distance. Check the doc.
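As an illustration of what a symmetric, normalized variant can look like, here is a minimal sketch in plain Python. It uses a recursion in which a diagonal step counts the local cost twice; I believe this corresponds to the step pattern the dtw package calls symmetric2, but treat that mapping as an assumption and check the doc. The function name is hypothetical and the values are not meant to reproduce the package's output.

```python
def dtw_symmetric2(x, y):
    """Return (distance, normalized distance) for a symmetric DTW recursion.

    Horizontal and vertical steps add the local cost once, diagonal steps
    add it twice, so every warping path has total weight len(x) + len(y).
    Dividing by that total gives an average cost per unit of path weight.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    g = [[inf] * (m + 1) for _ in range(n + 1)]
    g[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])  # local cost (assumed choice)
            g[i][j] = min(g[i - 1][j] + d,          # vertical step
                          g[i][j - 1] + d,          # horizontal step
                          g[i - 1][j - 1] + 2 * d)  # diagonal step, weight 2
    distance = g[n][m]
    return distance, distance / (n + m)


# Series that differ by a constant offset of 1, with different lengths.
same_length = dtw_symmetric2([0, 0, 0], [1, 1, 1])
diff_length = dtw_symmetric2([0, 0], [1, 1, 1, 1])
print(same_length, diff_length)

# Swapping the arguments gives the same result.
x, y = [0, 2, 1], [1, 1, 3, 0]
print(dtw_symmetric2(x, y) == dtw_symmetric2(y, x))
```

The normalized value is the one to compare across pairs of series with different lengths, since the raw distance grows with the length of the warping path.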
Thank you. Since creating this post I have been googling dtw, but it seems too complex. I did not get an error; it gives me an atomic object of class 'dist'.
You mean I should put control and lead in this code?
That gave me
and this
Gave
item was just an example variable. Replace it with whatever is appropriate in your case (control and lead maybe).
Sorry, I was finally able to run your tutorial on my data.
I used two separate weighted adjacency matrices, one for the control and one for the treatment data (each built from 26 samples and 50 genes, with genes as nodes).
My output matrices are not identical, but I can't work out how to use the output to find which parts of the response are specific to the treatment network.
https://ibb.co/f6DWen
How can I use this matrix to find responsive genes? Thank you for your time.
Hi, I just found this post. I am dealing with the same problem and wondering whether I can get some help here. How did you solve it? Would you mind sharing the piece of code here? And did you manage to find responsive genes?
Thank you very much!
Hello. Honestly, I gave up on tensor flow at that time and used the WGCNA R package to correlate time points with treatment. This package correlates blocks of genes with a given trait.
You probably mean tensor factorization, tensorflow is a software library :)
The key questions are whether and how you want to make use of the time information. WGCNA summarizes all the information into one correlation matrix, viewed as the adjacency matrix of a graph, and proceeds under the assumption that the graph has scale-free topology. This seems to be fine for most people, but to me it discards information that could be interesting or useful. In any case, don't use an approach you don't understand.
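A small sketch of the point about time information, in plain Python rather than with the WGCNA package: an unsigned correlation-based adjacency of the form |cor|^beta is unchanged when the time points are shuffled, because correlation ignores the order of the samples. The gene names and values are hypothetical, and beta = 6 is only an illustrative soft-threshold power.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def adjacency(expr, beta=6):
    """Unsigned soft-threshold adjacency: |cor(gene_i, gene_j)| ** beta."""
    genes = list(expr)
    return [[abs(pearson(expr[g], expr[h])) ** beta for h in genes]
            for g in genes]


# Hypothetical toy data: one series per gene, ordered by time point.
expr = {
    "geneA": [1.0, 2.0, 4.0, 7.0, 6.0, 5.0, 3.0, 2.0],
    "geneB": [1.5, 2.5, 3.5, 6.0, 6.5, 4.0, 3.5, 1.0],
    "geneC": [8.0, 6.0, 5.0, 2.0, 1.0, 2.5, 4.0, 7.0],
}

# Apply the same random permutation of time points to every gene.
order = list(range(8))
random.Random(0).shuffle(order)
shuffled = {g: [v[t] for t in order] for g, v in expr.items()}

original_adj = adjacency(expr)
shuffled_adj = adjacency(shuffled)
unchanged = all(
    math.isclose(original_adj[i][j], shuffled_adj[i][j], abs_tol=1e-9)
    for i in range(3) for j in range(3)
)
print("adjacency unchanged after shuffling time points:", unchanged)
```

A DTW-based distance, by contrast, generally does change when the time points are reordered, which is one way to see what each approach keeps and what it throws away.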