Question

Question On The Construction Of Gene Coexpression Networks

1


10.7 years ago

avari ▴ 110

Hi all,

Firstly, I am a total newbie to this area so apologies if this is an obvious question.

I want to look at the temporal dynamics of gene co-expression in a particular brain area (against a control dataset) across three developmental windows. However, I am unsure whether I should first summarize the data across specimens (donors) for each period, for example by taking the mean gene expression per neurodevelopmental window. What is the standard approach?

Best wishes and thanks in advance for any assistance,

gene • 2.2k views

ADD COMMENT • link updated 10.7 years ago by David Quigley 11k • written 10.7 years ago by avari ▴ 110

0


I suspect you're using a model organism, though you don't say. If so, are you using genetically identical subjects (e.g. all C57BL/6 mice) or organisms from a mixed genetic background (e.g. wild moose)? Please update your question to reflect the study design.

ADD REPLY • link 10.7 years ago by David Quigley 11k

0


Hi, thanks for your reply. I am actually using RNA-seq RPKM data from the Allen Brain Atlas Developing Brain database, and I want to look at the co-expression of a set of genes across the frontal cortex. I will perform an MDS analysis to look at differences in gene expression due to ethnicity and sex, but that still doesn't solve the problem of needing one value per gene per developmental period for gene co-expression analysis (if that is indeed the correct way to do it).

ADD REPLY • link 10.7 years ago by avari ▴ 110

0


Hi, did you find an answer to your question? I am looking for exactly the same thing: co-expression between pairs of genes in the brain. Have you tried reading the Allen data set and computing the inner product between expression profiles? I am thinking of doing that but, like you, I am a newbie. Please let me know if you find out. Thanks.
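A minimal sketch of the inner-product idea, assuming each gene is represented as a vector of expression values with one entry per sample. The gene vectors below are made-up numbers, not Allen data. A raw inner product depends on the overall expression level and scale of each gene, so a common alternative is the Pearson correlation, which is the inner product of the mean-centered vectors divided by the product of their lengths.

```python
import math


def inner_product(x, y):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))


def pearson(x, y):
    """Pearson correlation: inner product of centered, length-normalized vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cx = [a - mean_x for a in x]
    cy = [b - mean_y for b in y]
    denom = math.sqrt(inner_product(cx, cx) * inner_product(cy, cy))
    if denom == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return inner_product(cx, cy) / denom


# Hypothetical expression values for two genes across five samples.
# gene_b is an exact linear function of gene_a (2 * a + 10).
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [12.0, 14.0, 16.0, 18.0, 20.0]

raw = inner_product(gene_a, gene_b)  # grows with the expression scale
r = pearson(gene_a, gene_b)          # scale-free, bounded by -1 and 1

print(f"raw inner product: {raw}")
print(f"Pearson correlation: {r:.3f}")
```

The raw inner product would change if either gene were simply expressed at a higher baseline, whereas the correlation would not, which is why correlation-type measures are usually preferred for co-expression.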

ADD REPLY • link 10.7 years ago by GenomicEnthusiast • 0

0


I’m afraid I haven't found a solution yet. But a friend of mine (who is experienced with gene expression data for cancer) suggested first checking how the data cluster across individuals (I made PCA and MDS plots in R). If there aren't too many inter-individual differences in gene expression across a brain region (unlikely), then it might be OK to use summary statistics. Also, I guess if you have enough brain specimens, using the mean should be OK due to the central limit theorem?

Thanks for the inner product suggestion; I will post an update if I get any good tips.

ADD REPLY • link 10.7 years ago by avari ▴ 110

score 1 · Answer 1 · 2014-03-26

1


10.7 years ago

David Quigley 11k

As a general rule I would advocate not summarizing the results. You lose the statistical strength of repeated measures, and you can easily fool yourself for individual data points where high inter-subject variance is concealed by just taking the mean (or some other summary). My own experience has been that there is often a large amount of variance in gene expression from the same tissue in genetically identical living organisms, and you can exploit this variance if you're measuring correlation. If I were modeling expression as a function of time I would look into using a repeated measures model to exploit the replicates.
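To make the point about summaries concrete, here is a small sketch with invented numbers. The window names, gene names, and values are hypothetical, and the layout (one value per donor per gene per window) is an assumption about how the data might be organized. Within each window, the correlation between two genes is computed across donors; the per-window means are computed alongside for comparison.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cross / math.sqrt(sxx * syy)


# Hypothetical data: expression[window][gene] holds one value per donor.
expression = {
    "early": {"GENE_A": [2.0, 4.0, 6.0, 8.0], "GENE_B": [1.0, 2.0, 3.0, 4.0]},
    "mid": {"GENE_A": [2.0, 4.0, 6.0, 8.0], "GENE_B": [4.0, 3.0, 2.0, 1.0]},
}

# Co-expression within each window, using donors as the replicates.
per_window_r = {
    window: pearson(genes["GENE_A"], genes["GENE_B"])
    for window, genes in expression.items()
}

# Summarizing first leaves a single number per gene per window.
per_window_means = {
    window: {gene: mean(values) for gene, values in genes.items()}
    for window, genes in expression.items()
}

for window in expression:
    print(window, "r =", round(per_window_r[window], 3), "means =", per_window_means[window])
```

In this toy example the two windows have identical per-gene means, yet the genes are perfectly positively correlated across donors in one window and perfectly negatively correlated in the other. Once the donors are collapsed to a mean, that difference is no longer recoverable, and there is no within-window correlation left to compute.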

ADD COMMENT • link 10.7 years ago by David Quigley 11k

0


Yes, the consensus seems to be that summarizing the results is the wrong way to go. Unfortunately, the plan is to perform the analysis separately for each developmental period; otherwise I would look into your repeated measures suggestion. Thanks again for the input, though.

ADD REPLY • link 10.7 years ago by avari ▴ 110