Illumina Gene expression array analysis
9.8 years ago
datanerd ▴ 520

Hi all,

I have a large microarray data set (360 samples) run on a number of chips, and I am trying to analyze it. Currently I am using the Bioconductor lumi package. I have analyzed smaller sample sets before.

I want to make sure the data are of good quality and that it is reasonable to run all the samples together. Can someone point me to a tutorial, or to the quality control points I should check?

I would appreciate your suggestions.

Thanks!

Mamta

illumina microarray-analysis • 2.4k views
9.8 years ago

I don't recall if lumi offers all of the same normalization methods as GenomeStudio.

It's been a little while, but these are the QC metrics that I remember for Illumina expression arrays:

  1. Compare sample signal distributions and look for outliers (if I recall, even 'quantile normalization' in GenomeStudio wasn't purely quantile normalization, because the distributions still differed afterwards)
  2. Look for outliers using PCA, hierarchical clustering, etc.
  3. Compare clustering with different normalization methods. More specifically, compare how your groups of interest cluster under different conditions.
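To make point 1 concrete, here is a minimal sketch in plain Python rather than lumi or GenomeStudio. It assumes the data are a list of samples, each a list of log-scale probe intensities in the same probe order. The function names `quantile_normalize` and `flag_outliers` and the 3-MAD cutoff are my own illustrative choices, not anything from either tool.

```python
from statistics import mean, median

def quantile_normalize(samples):
    """Force every sample onto the same intensity distribution.

    Ties are broken by probe order, which is a simplification.
    """
    n_probes = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: mean across samples at each rank.
    reference = [mean(s[r] for s in sorted_samples) for r in range(n_probes)]
    normalized = []
    for s in samples:
        order = sorted(range(n_probes), key=lambda i: s[i])
        out = [0.0] * n_probes
        for rank, probe in enumerate(order):
            out[probe] = reference[rank]
        normalized.append(out)
    return normalized

def flag_outliers(samples, n_mads=3.0):
    """Return indices of samples whose median intensity is far from the rest.

    n_mads is an arbitrary illustrative cutoff, not a recommended value.
    """
    meds = [median(s) for s in samples]
    center = median(meds)
    mad = median(abs(m - center) for m in meds)
    if mad == 0:
        return [i for i, m in enumerate(meds) if m != center]
    return [i for i, m in enumerate(meds) if abs(m - center) > n_mads * mad]
```

Running `flag_outliers` on the raw intensities (before any normalization) is the more informative check, since a true quantile normalization makes all sample distributions identical by construction.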

If I recall correctly, background subtraction was important, and I liked to see how the results differed between 'no normalization' and 'quantile normalization'. I think I always skipped the imputation step.
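One rough way to compare preprocessing choices is to score each sample by its mean Pearson correlation with all other samples, once per version of the data matrix, and see which samples stay at the bottom. The sketch below is plain Python under my own assumptions: samples are lists of probe intensities, and `mean_correlation_per_sample` is a hypothetical helper name, not a function from lumi, limma, or GenomeStudio.

```python
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two equal-length lists (neither may be constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def mean_correlation_per_sample(samples):
    """For each sample, average its correlation with every other sample."""
    n = len(samples)
    return [
        mean(pearson(samples[i], samples[j]) for j in range(n) if j != i)
        for i in range(n)
    ]
```

A sample that scores low under every preprocessing variant is a stronger outlier candidate than one that only stands out under a single method.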

Also, I always defined each sample as its own group (in GenomeStudio, not for statistical analysis). I seem to remember this being a bigger problem for the methylation arrays than for the expression arrays, but I still think it is important. You can figure this out on your own with enough permutations of #3, but I didn't want to do this each time I had a dataset to analyze.

Depending upon your study design, you may also need to apply a batch correction.
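As a toy illustration of what a batch correction does, the sketch below centers each probe within each batch (for example, each chip) and then restores the overall probe mean. This is deliberately the simplest possible approach and not what any particular package implements. It is only reasonable when the groups of interest are spread evenly across batches, otherwise it removes biology along with the batch effect. `center_batches` is a hypothetical name of mine.

```python
from collections import defaultdict
from statistics import mean

def center_batches(samples, batches):
    """Remove per-batch, per-probe mean shifts.

    samples: list of samples, each a list of probe intensities (log scale).
    batches: one batch label per sample, e.g. the chip ID.
    """
    n_probes = len(samples[0])
    # Overall mean of each probe across all samples.
    grand = [mean(s[p] for s in samples) for p in range(n_probes)]
    by_batch = defaultdict(list)
    for s, b in zip(samples, batches):
        by_batch[b].append(s)
    # Mean of each probe within each batch.
    batch_means = {
        b: [mean(s[p] for s in members) for p in range(n_probes)]
        for b, members in by_batch.items()
    }
    return [
        [s[p] - batch_means[b][p] + grand[p] for p in range(n_probes)]
        for s, b in zip(samples, batches)
    ]
```

Re-running the clustering before and after a step like this shows whether samples were grouping by chip rather than by condition.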

9.8 years ago
TriS ★ 4.7k

The limma package allows you to work with Illumina BeadArray data.

The package is based on linear model fitting (lmFit()) combined with empirical Bayes statistics (eBayes()).

Are all 360 samples from the same platform? If so, normalization should be easier. As long as you have enough computing power to process them, you should be fine with limma. It has a number of quality assessment tools that let you check, after normalization, whether the data are ready for analysis.

Also, how many conditions do you have? limma can handle as many as you wish, as long as you create the correct design matrix (basically, to tell R which samples belong to which condition).
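To show what a design matrix encodes, here is a small plain-Python sketch (not limma code) that builds one indicator column per condition from a list of sample labels. It is roughly what R's `model.matrix(~0 + group)` produces. The function name `design_matrix` and the labels in the usage example are made up for illustration.

```python
def design_matrix(groups):
    """Build a one-column-per-condition indicator matrix.

    groups: one condition label per sample, in sample order.
    Returns (levels, matrix); matrix[i][j] is 1 if sample i belongs
    to condition levels[j], else 0.
    """
    levels = sorted(set(groups))
    matrix = [[1 if g == level else 0 for level in levels] for g in groups]
    return levels, matrix

# Hypothetical labels for four samples.
levels, matrix = design_matrix(["control", "treated", "control", "treated"])
for row in matrix:
    print(row)
```

Each row is a sample and each column a condition, so every row should contain exactly one 1.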

Hope this helps :)


Thanks a lot :)

I still have trouble making plots in R for these 360 samples. See above. Any help/suggestion would be great!
