3.9 years ago
NDA
Hi all
I would like to run a network analysis with WGCNA on my microarray dataset, which includes 700 samples and 22,000 genes. I have two questions:
- The dataset has a batch effect: clustering the samples with flashClust() produced two discrete groups among the patients and two discrete groups among the controls. I have already read the views of Kevin Blighe and ivivek_ngs (Batch effects : ComBat or removebatcheffects (limma package) ?) on removing batch effects. Would it be reasonable to analyze one of these groups as a training set and the other as a validation set?
- I have processed the data with neqc() and then filtered them. Which measure is better for selecting the set of genes to analyze: the coefficient of variation or the MAD (median absolute deviation)?
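
To make the second question concrete, here is a minimal sketch of the two filters I am comparing. It is written in plain Python only to keep it self-contained (my actual pipeline is in R), and everything in it is illustrative: the gene names, the intensities, and the helper `top_genes` are made up, and the MAD here is the unscaled version (R's `mad()` multiplies by 1.4826 by default, which does not change the ranking).

```python
from statistics import mean, median, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the absolute mean."""
    m = mean(values)
    if m == 0:
        return float("nan")  # CV is undefined when the mean is zero
    return stdev(values) / abs(m)


def median_absolute_deviation(values):
    """Unscaled MAD: median of absolute deviations from the median."""
    med = median(values)
    return median([abs(v - med) for v in values])


def top_genes(expr, score, n):
    """Hypothetical helper: names of the n genes with the highest score."""
    ranked = sorted(expr, key=lambda gene: score(expr[gene]), reverse=True)
    return ranked[:n]


# Toy log2 intensities, invented for illustration (not real data).
expr = {
    "geneA": [8.0, 8.1, 7.9, 8.0, 8.0],   # essentially flat
    "geneB": [6.0, 9.0, 6.5, 9.5, 7.0],   # varies across many samples
    "geneC": [5.0, 5.0, 5.0, 5.0, 12.0],  # flat except for one outlier
}

print(top_genes(expr, coefficient_of_variation, 1))
print(top_genes(expr, median_absolute_deviation, 1))
```

On this toy input the CV ranks the single-outlier gene first, while the MAD ranks the gene that varies across several samples first, which is the difference in behaviour I am trying to decide between.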
Thanks in advance
Best,
Narges