Hi all
I am working with data coming from two different platforms: HGU95Av2 and HGU133plus2. Ultimately I want to find differentially expressed genes.
I want to start from scratch using the CEL files, but I am not sure of the best way to normalize them.
So far I am:
1. Normalizing HGU95Av2 and HGU133plus2 separately using expresso
(bgcorrect.method="mas", normalize.method="quantiles", pmcorrect.method="mas", summary.method="medianpolish")
2. Matching the probes across platforms using biomaRt and keeping only those that match in both
3. Combining the data.
The boxplot I get is not awful, but it is not as clean as I'd like: on the left side you can see the HGU95Av2 samples sitting at slightly lower intensity (link to boxplot here: https://drive.google.com/file/d/0BxzhXZ5eBptDMG03bWJXeE9sS2s/view?usp=sharing).
What would you guys suggest?
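For anyone following along, steps 2 and 3 above could look roughly like the sketch below. It is only an illustration on toy data: the matrices, the probe-to-gene maps (standing in for a biomaRt lookup), the helper `collapse_by_gene`, and the choice to average probe sets that map to the same gene are all assumptions, not what was actually run.

```r
# Toy expression matrices (rows = probe sets, columns = samples); values are made up.
set.seed(1)
expr_95 <- matrix(rnorm(12, mean = 7), nrow = 4,
                  dimnames = list(paste0("p95_", 1:4), c("A1", "A2", "A3")))
expr_133 <- matrix(rnorm(10, mean = 8), nrow = 5,
                   dimnames = list(paste0("p133_", 1:5), c("B1", "B2")))

# Hypothetical probe-to-gene maps, standing in for the result of a biomaRt query.
map_95 <- c(p95_1 = "GENE_A", p95_2 = "GENE_B", p95_3 = "GENE_C", p95_4 = "GENE_D")
map_133 <- c(p133_1 = "GENE_B", p133_2 = "GENE_C", p133_3 = "GENE_C",
             p133_4 = "GENE_E", p133_5 = "GENE_A")

# Collapse probe sets to one row per gene by averaging duplicates
# (averaging is an assumption; other summaries are possible).
collapse_by_gene <- function(expr, probe2gene) {
  genes <- probe2gene[rownames(expr)]
  sums <- rowsum(expr, group = genes)                 # per-gene column sums
  counts <- as.vector(table(genes)[rownames(sums)])   # probe sets per gene
  sums / counts                                       # per-gene means
}

gene_95 <- collapse_by_gene(expr_95, map_95)
gene_133 <- collapse_by_gene(expr_133, map_133)

# Keep only genes measured on both platforms, then combine column-wise.
shared <- intersect(rownames(gene_95), rownames(gene_133))
combined <- cbind(gene_95[shared, , drop = FALSE],
                  gene_133[shared, , drop = FALSE])

# Keep track of which platform each column came from.
platform <- rep(c("HGU95Av2", "HGU133plus2"), c(ncol(gene_95), ncol(gene_133)))
```

Calling `boxplot(combined, col = factor(platform))` on a matrix like this is one way to see a per-platform intensity shift of the kind described above.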
I forgot that limma has a bunch of normalization options. I ended up using normalizeBetweenArrays(x, "quantile")
Thanks :)
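To show what quantile normalization between arrays does to a combined matrix, here is a rough base-R sketch. The function `quantile_normalize` and the toy matrix are made up for illustration; it is not limma's implementation (it ignores missing values and breaks ties by order of appearance), so in practice the limma call mentioned above is the thing to use.

```r
# Minimal quantile normalization: force every column (sample) to share
# the same empirical distribution, namely the mean of the sorted columns.
quantile_normalize <- function(x) {
  ranks <- apply(x, 2, rank, ties.method = "first")  # rank of each value within its column
  sorted <- apply(x, 2, sort)                        # each column sorted ascending
  reference <- rowMeans(sorted)                      # shared reference distribution
  normalized <- apply(ranks, 2, function(r) reference[r])
  dimnames(normalized) <- dimnames(x)
  normalized
}

# Toy data: three samples centred at 7 and two centred at 8,
# mimicking a small intensity shift between two platforms.
set.seed(42)
x <- cbind(matrix(rnorm(300, mean = 7), ncol = 3),
           matrix(rnorm(200, mean = 8), ncol = 2))
colnames(x) <- c("A1", "A2", "A3", "B1", "B2")

x_qn <- quantile_normalize(x)
```

After this, every column of `x_qn` contains the same set of values, only in a different order, so the boxplots line up. Note that this also removes any real global difference between the two groups of samples.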
Yes, quantile normalization is best when you combine datasets from different platforms/experiments
However, the limma manual says that it should be non-normalized data... so would it be correct to just use bg.correct() and extract those values for the normalization with limma?