Pre-Processing Microarray
12.7 years ago
Diana ▴ 930

Hello,

I have microarray data which contains 60 samples and 150 genes. I've normalized the data and checked for missing values in R. What other pre-processing steps are required before clustering this data to identify groups of genes with similar expression patterns across all samples?

Edit:

The data came from perturbation experiments using NanoString technology, which is similar to microarray technology but is not actually a microarray. Each row is a gene, each column is a sample, and the data is already normalized.

microarray • 3.5k views

Can you add details about the platform, for example Affymetrix, Agilent, or Illumina?

12.7 years ago
Nick H ▴ 40

Depends on your platform, but for Illumina bead array data, a complete analysis pipeline might look something like:

  1. ID SPATIAL ARTIFACTS and IMAGE PLOTS
  2. BASH OUTLIER & DEFECT ANALYSIS
  3. BACKGROUND CORRECT DATA AND QA/QC
  4. SUMMARIZE BEAD LEVEL DATA INTO BEAD SUMMARY DATA
  5. NORMALIZE
  6. ANALYSIS

However, if you lack the individual bead-level data, you will have to trust your provider's preprocessing and start at step 5.

There are several R packages available to help you depending on the platform, such as beadarray for Illumina bead arrays.
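
If you are starting at step 5 with an already normalized genes-by-samples table, the preparation for gene clustering could be sketched roughly as below. This is only an illustration in plain Python (the same steps are straightforward in R), and it assumes things the thread does not state: that the values are raw-scale counts that benefit from a log2 transform, that near-constant genes should be dropped, and that correlation distance between row-scaled profiles is the similarity of interest. The function names, the pseudocount, and the `min_sd` cutoff are all made up for the example.

```python
import math
from statistics import mean, pstdev


def log2_transform(matrix, pseudocount=1.0):
    """Log2-transform every value; the pseudocount avoids log2(0)."""
    return [[math.log2(v + pseudocount) for v in row] for row in matrix]


def drop_flat_genes(names, matrix, min_sd=0.1):
    """Remove genes whose profile barely varies across samples.

    min_sd is an arbitrary illustrative cutoff, not a recommended value.
    """
    kept = [(n, row) for n, row in zip(names, matrix) if pstdev(row) >= min_sd]
    return [n for n, _ in kept], [row for _, row in kept]


def zscore_rows(matrix):
    """Scale each gene (row) to mean 0 and population SD 1."""
    scaled = []
    for row in matrix:
        mu, sd = mean(row), pstdev(row)
        scaled.append([(v - mu) / sd for v in row])
    return scaled


def correlation_distance(a, b):
    """1 - Pearson r for two rows already z-scored with zscore_rows."""
    r = sum(x * y for x, y in zip(a, b)) / len(a)
    return 1.0 - r


# Toy data: rows are genes, columns are samples (hypothetical values).
gene_names = ["geneA", "geneB", "geneC", "geneD"]
counts = [
    [10, 20, 40, 80],
    [100, 200, 400, 800],
    [50, 50, 50, 50],   # flat profile, filtered out below
    [80, 40, 20, 10],
]

kept_names, logged = drop_flat_genes(gene_names, log2_transform(counts))
scaled = zscore_rows(logged)

# Pairwise distance matrix, ready for a hierarchical clustering routine.
distances = [[correlation_distance(a, b) for b in scaled] for a in scaled]
for name, row in zip(kept_names, distances):
    print(name, [round(d, 3) for d in row])
```

Scaling each gene before computing distances means the clusters reflect the shape of the profile across samples rather than the absolute expression level, which is usually what "similar expression patterns" is taken to mean.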


Only the Illumina platform uses beads. Other major platforms (e.g. Affymetrix, Agilent) use bait permanently fixed to a particular position on the chip.


True, I will amend the answer to make that clear.


I amended the question. I know how to check for missing values and sample outliers. What do background correction, defect analysis, and QA/QC mean?


If the data is "not actually microarray" but NanoString, why not change the title to reflect that? On typical microarrays (i.e. with some kind of glass surface), there can be background fluorescence, and there are a myriad of ways to address it. There can also be defects in the spots or the surface, which can distort fluorescence detection. QA/QC is just a general term for a system of checks on the assumptions and measurements in your system. My guess is that most people here are unfamiliar with NanoString data and what to expect from it.
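
To make "background correct" and "QA/QC" a bit more concrete: for count data, one commonly used approach is to estimate background from negative control probes and subtract it. The sketch below assumes each sample comes with negative control counts, which this thread does not confirm. The threshold of mean plus two standard deviations, the floor of 1, and the function names are illustrative choices, not something stated here.

```python
from statistics import mean, stdev


def background_level(negative_controls, n_sd=2.0):
    """Estimate background as mean + n_sd * SD of the negative controls."""
    return mean(negative_controls) + n_sd * stdev(negative_controls)


def background_correct(counts, negative_controls, n_sd=2.0, floor=1.0):
    """Subtract the background estimate; clamp results at `floor`
    so that a later log transform stays defined."""
    bg = background_level(negative_controls, n_sd)
    return [max(c - bg, floor) for c in counts]


def fraction_above_background(counts, negative_controls, n_sd=2.0):
    """Simple QC metric: share of genes detected above background.
    A sample with a very low fraction may deserve a closer look."""
    bg = background_level(negative_controls, n_sd)
    return sum(1 for c in counts if c > bg) / len(counts)


# One hypothetical sample: gene counts plus its negative control counts.
sample_counts = [100, 7, 3]
sample_negatives = [4, 6, 5, 5]

print(background_correct(sample_counts, sample_negatives))
print(fraction_above_background(sample_counts, sample_negatives))
```

Whether subtraction is appropriate at all depends on the platform and on what the provider's normalization already did, so treat this as a description of the idea rather than a step to apply blindly.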


I thought the same: most people wouldn't know about NanoString. But I do know NanoString is very similar to microarray, which is why I put microarray instead of NanoString in the title.


@seidel Can you tell me what sort of statistical tests can be done on expression data after the normalization step? Are there any easy-to-follow tutorials that you're familiar with that could help me in this? – diana yesterday

12.1 years ago

You can download the GeneSpring free trial and evaluate it. It has pretty good functions. After that, try to reproduce the same steps in R.
