Question

Pre-Processing Microarray

2

12.7 years ago

Diana ▴ 930

Hello,

I have microarray data which contains 60 samples and 150 genes. I've normalized the data and checked for missing values in R. What other pre-processing steps are required before clustering this data to identify groups of genes with similar expression patterns across all samples?

Edit:

The data came from perturbation experiments using NanoString technology, which is similar to microarray technology but not actually microarray. Each row is a gene, each column is a sample, and the data are normalized.

microarray • 3.5k views

ADD COMMENT • link updated 9.0 years ago by Biostar 20 • written 12.7 years ago by Diana ▴ 930

0

Can you add details about the platform? For example, Affymetrix, Agilent, or Illumina?

ADD REPLY • link 12.7 years ago by Khader Shameer 18k

score 2 · Answer 1 · 2012-03-16

2

12.7 years ago

Nick H ▴ 40

Depends on your platform, but for Illumina bead array data, a complete analysis pipeline might look something like:

1. ID SPATIAL ARTIFACTS AND IMAGE PLOTS
2. BASH OUTLIER & DEFECT ANALYSIS
3. BACKGROUND CORRECT DATA AND QA/QC
4. SUMMARIZE BEAD LEVEL DATA INTO BEAD SUMMARY DATA
5. NORMALIZE
6. ANALYSIS

However, if you lack the individual bead-level data, you will have to trust your provider's preprocessing and start at step 5.

There are several R packages available to help you depending on the platform, such as beadarray for Illumina bead arrays.
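For the clustering goal in the question (grouping genes with similar patterns across samples), one step that often sits between "normalize" and "analysis" is standardizing each gene so that clustering compares the shape of the profile rather than its absolute level. The following is only a minimal sketch in plain Python, not part of any platform's pipeline: the function names `zscore_rows` and `correlation_distance` and the toy matrix `expr` are made up for illustration, and it assumes rows are genes, columns are samples, and values are already normalized.

```python
from statistics import mean, pstdev


def zscore_rows(matrix):
    """Standardize each gene (row) to mean 0 and SD 1 across samples."""
    scaled = []
    for row in matrix:
        mu = mean(row)
        sd = pstdev(row)  # population SD, so z-scores have mean square 1
        if sd == 0:
            # A flat gene carries no pattern; map it to all zeros.
            scaled.append([0.0] * len(row))
        else:
            scaled.append([(x - mu) / sd for x in row])
    return scaled


def correlation_distance(a, b):
    """1 - Pearson correlation, valid for rows produced by zscore_rows."""
    r = sum(x * y for x, y in zip(a, b)) / len(a)
    return 1.0 - r


# Hypothetical toy data: 3 genes x 4 samples.
expr = [
    [1, 2, 3, 4],      # gene A
    [10, 20, 30, 40],  # gene B: same shape as A, different scale
    [4, 3, 2, 1],      # gene C: mirror image of A
]
z = zscore_rows(expr)
d_ab = correlation_distance(z[0], z[1])  # close to 0: same pattern
d_ac = correlation_distance(z[0], z[2])  # close to 2: opposite pattern
```

A distance like this can then be handed to whatever clustering method you prefer (hierarchical clustering is a common choice). Whether to standardize at all is a judgment call: it discards differences in magnitude, which may or may not matter for your experiment.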

ADD COMMENT • link 12.7 years ago by Nick H ▴ 40

0

Only the Illumina platform uses beads. Other major platforms (e.g. Affymetrix, Agilent) use bait permanently fixed to a particular position on the chip.

ADD REPLY • link 12.7 years ago by David Quigley 11k

0

True, I will amend the answer to make that clear.

ADD REPLY • link 12.7 years ago by Nick H ▴ 40

0

I amended the question. I know how to check for missing values and sample outliers. What do background correction, defect analysis, and QA/QC mean?

ADD REPLY • link 12.7 years ago by Diana ▴ 930

0

If the data is "not actually microarray" but nanostring, why not change the title to reflect that? On typical microarrays (i.e. with some kind of glass surface), there can be background fluorescence (and there are a myriad of ways to address it), and there can be defects in the spots or the surface, which can disrupt or bias fluorescence detection. QA/QC is just a general term for a system of checks on the assumptions and measurements in your system. My guess is that most people here are unfamiliar with nanostring data and what to expect.
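To make "background correct" a bit more concrete, here is one simple form it can take: estimate a background level from measurements that should carry no real signal, subtract it, and clamp the result so a later log transform is still defined. This is a rough sketch under stated assumptions, not the procedure of any particular platform or vendor: the negative-control values, the `n_sd` multiplier, the `floor` value, and both function names are hypothetical choices made for illustration.

```python
from statistics import mean, stdev


def background_threshold(negative_controls, n_sd=2.0):
    """Background estimate: mean of the negative controls plus n_sd sample SDs."""
    return mean(negative_controls) + n_sd * stdev(negative_controls)


def subtract_background(signals, background, floor=1.0):
    """Subtract the background and clamp at `floor` so log() stays defined."""
    return [max(x - background, floor) for x in signals]


# Hypothetical readings for one sample.
negative_controls = [4, 6, 5, 5]  # measurements expected to be pure background
signals = [100, 7, 3]             # measurements for three genes

bg = background_threshold(negative_controls)
corrected = subtract_background(signals, bg)
# The strong signal is reduced by bg; the two weak ones fall to the floor.
```

A simple QA/QC check in the same spirit would be to count how many genes in each sample fall to the floor: a sample where most genes do is worth a second look before clustering.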

ADD REPLY • link 12.7 years ago by seidel 11k

0

I thought the same: that most people wouldn't know about nanostring. But I do know nanostring is very similar to microarray, which is why I put microarray instead of nanostring in the title.

ADD REPLY • link 12.7 years ago by Diana ▴ 930

0

@seidel Can you tell me what sort of statistical tests can be done on expression data after the normalization step? Are there any easy-to-follow tutorials that you're familiar with that could help me in this?

ADD REPLY • link 12.7 years ago by Diana ▴ 930

score 0 · Answer 2 · 2012-10-25

0

12.1 years ago

GouthamAtla 12k

You can download the GeneSpring free trial and evaluate it. It has pretty good functions. After that, try to reproduce the same steps in R.

ADD COMMENT • link 12.1 years ago by GouthamAtla 12k