Hi all, we have a study of 16 individuals (8 cases and 8 controls), each with samples taken from 5 different brain regions (80 samples in total). We found that the expression differences between regions are much larger than those between individuals, so we want to look at the case/control difference in each region separately. For normalization and differential expression, we therefore split the samples by region, normalized the counts for each batch of 16 samples separately, and passed each batch into the differential expression analysis. We are using these normalized counts for downstream analysis after DE and will be regressing out covariates. We believe this approach is better than normalizing all samples together and then taking a subset of the normalized counts, from which covariates would be regressed out for downstream analysis, but we would appreciate advice. Thanks.
Q: Is it reasonable to normalize subsets of samples separately for downstream analysis, or should we normalize all samples together and then subset to the samples of interest? Our concern is that we perform further operations on these normalized counts, so the choice of normalization greatly affects the downstream analysis.
What method did you use to normalize the counts?
We tried edgeR's TMM normalization, with the result passed into limma-voom for DE. We also tried the analysis with DESeq2. We have a number of covariates that we will regress out of the normalized matrices.
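One way to see how much the choice matters is to compute scaling factors both ways and compare them for the samples of a single region. The sketch below is a simplified pure-Python re-implementation of the median-of-ratios idea (the approach behind DESeq2's size factors). It does not call DESeq2 or edgeR, and it is not TMM. The function names, the toy counts, and the column layout (two region A samples followed by two region B samples) are all made up for illustration.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors for a genes x samples count table.

    counts: list of rows (genes), each a list of per-sample counts.
    Simplified illustration only; not the DESeq2 implementation.
    """
    n_samples = len(counts[0])
    # Keep genes with nonzero counts in every sample so the logs are finite.
    usable = [row for row in counts if all(c > 0 for c in row)]
    # Per-gene log geometric mean across the samples being normalized.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - lgm
                      for row, lgm in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def subset_columns(counts, cols):
    """Return the count table restricted to the given sample columns."""
    return [[row[j] for j in cols] for row in counts]


# Hypothetical toy counts: columns are A1, A2 (region A) and B1, B2 (region B).
counts = [
    [100, 200, 30, 45],
    [50, 100, 400, 380],
    [80, 160, 75, 90],
    [20, 40, 10, 12],
    [300, 600, 310, 290],
]
region_a_cols = [0, 1]

# Option 1: normalize all samples together, then keep the region A factors.
sf_all = size_factors(counts)
sf_all_region_a = [sf_all[j] for j in region_a_cols]

# Option 2: subset to region A first, then normalize.
sf_region_a = size_factors(subset_columns(counts, region_a_cols))

print("all samples, region A factors:", sf_all_region_a)
print("region A only factors:        ", sf_region_a)
```

In this toy table the two region A samples differ only by sequencing depth (A2 is exactly twice A1), so the ratio between their factors is 2 under both options and only the overall scale changes. With real data, where composition also differs between samples, the relative factors within a region can shift depending on which samples define the reference, and comparing the two sets of factors on your own counts shows by how much.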