After performing behavioral studies on animals, we measured hormone levels, color, and behavioral activity alongside gene expression data to look for patterns. This should help us make more sense of the individual and group variation that can be observed in the RNA-seq data. Sample size: ~50 individuals. I have a list of significantly differentially expressed genes for each treatment.
How do I start this analysis? Suggestions please.
Thanks for your suggestions. I had 5 experimental groups, 5 x 10 = 50 samples in total. Groups 1, 2, 3, and 4 are stress treatments and group 5 is the control. For the differential gene expression (DGE) analysis we compared groups 1-2, 2-3, 3-4, 4-5, 5-1, 5-2, 5-3, and 5-4. Only 2 comparisons (2-5 and 3-5) gave significant differentially expressed genes: 50 and 120, respectively. I have hormone data for all 5 groups. How do I proceed with linking the behavioral data to gene expression?
As a beginner, I'm not sure this is the right approach to this analysis. Suggestions please!
I have already provided a suggestion - please see my original answer.
Perhaps you will also find this of use: A: What is the best way to combine machine learning algorithms for feature selectio
Hi Kevin,
I'm a newbie to data analysis research, so I may be asking a naive question. I'm trying to build an lm model for 51 significant differentially expressed genes vs. cortisol levels. After the DGE analysis I was unable to find any genes related to cortisol metabolism, because I'm working with a non-model organism whose genome has not been well annotated. Is it possible to build a model using 51 differentially expressed genes vs. a cortisol attribute containing only 30 values?
I tried this in R, and it says "arguments imply differing number of rows: 51, 30" (it only accepts an equal number of rows in both columns).
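For readers hitting the same message, here is a minimal sketch of where it most likely comes from. It assumes the expression table has genes as rows (51) and samples as columns (30), so binding it to a cortisol vector of length 30 fails. All object names and values below (expr, cortisol, dat, gene1 ... gene51) are simulated and hypothetical, not the poster's data.

```r
set.seed(1)
n_genes   <- 51  # assumed: number of significant genes
n_samples <- 30  # assumed: samples with a cortisol measurement

# Simulated stand-ins for the real data (hypothetical values)
expr <- matrix(rnorm(n_genes * n_samples), nrow = n_genes,
               dimnames = list(paste0("gene", 1:n_genes),
                               paste0("sample", 1:n_samples)))
cortisol <- rnorm(n_samples)

# Genes as rows: 51 rows vs. 30 cortisol values, which triggers the
# "differing number of rows" error (captured here instead of stopping)
res <- try(data.frame(expr, cortisol = cortisol), silent = TRUE)

# Samples as rows: transpose so each gene becomes a column of length 30
dat <- data.frame(t(expr), cortisol = cortisol)

# One model per gene, each using all 30 samples
pvals <- sapply(rownames(expr), function(g) {
  fit <- lm(reformulate(g, response = "cortisol"), data = dat)
  summary(fit)$coefficients[g, "Pr(>|t|)"]
})

# Adjust for testing 51 genes (Benjamini-Hochberg)
padj <- p.adjust(pvals, method = "BH")
```

With samples as rows, every column has the same length, so lm() can be run on one gene at a time rather than on all genes in a single formula.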
If you mean like this,
lm(cortisol ~ gene1 + gene2 + ... + gene30)
, then it is unlikely that this model can be fit to the data. You will see that, in my original answer, I was implying that you should test each gene / DEG separately. The input data for modeling should be of the form: