Question

How to analyse two-color microarray data from GEO with limma?



4.2 years ago

MatthewP ★ 1.4k

Hello, everyone. I want to download and analyse dataset GSE149940 with limma, but I still have a few small questions after reading some of the documentation.
I can get the expression matrix with _GEOquery_:

library(GEOquery)
gse <- getGEO(filename = matrixPath, destdir = sourceDir, getGPL = FALSE, AnnotGPL = FALSE)
expr1 <- exprs(gse)

I've read the User's Guide of the _limma_ package, which describes how to read two-color chip data and how to construct the design matrix for this kind of "dye-swap" design.
What I want to know is: can I pass the expression matrix expr1 extracted with _GEOquery_ to limma and analyse it directly? Or do I need to download the raw data from GEO and read it into _limma_ with the read.maimages function? The second way seems quite complicated to me, because I have never worked with raw microarray data before.

By the way, the data processing described in GEO is:

Agilent Feature Extraction Software (v 8.5.1.1) was used for background subtraction and LOWESS normalization. Normalized log10 ratio (Cy3/Cy5) representing test/reference for samples 2301 R – 2354 S, 2351 R – 2309 S, 2317 R – 2314 S, 2343 R – 2284 S, 2343 R – 2284 S, 2358 R – 2355 S, and 2367 R – 2369 S; normalized log10 ratio (Cy5/Cy3) representing test/reference for samples 2354 S – 2301 R, 2309 S – 2351 R, 2314 S – 2317 R, 2284 S – 2343 R, 2355 S – 2358 R, and 2369 S – 2367 R

So this seems to be how expr1 was produced.

limma • 1.6k views

updated 4.2 years ago by Gordon Smyth ★ 7.8k • written 4.2 years ago by MatthewP ★ 1.4k

score 5 · Accepted Answer · 2021-01-12

Yes, you can do either. You can analyse the matrix of normalized log-ratios that you get from GEOquery in limma; limma will accept the GEOquery object directly. Or you can read the raw data files into limma using read.maimages. Either way, the most important thing will be to set up the two-color design matrix appropriately.

The only limitation of using the GEOquery matrix is that you won't have access to A-values, and hence you can't make MA plots or use the trend=TRUE option of eBayes.
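To make the GEOquery route concrete, here is a minimal sketch of a dye-swap style design built from the array names. It rests on assumptions that should be checked against the GEO sample records: that each value is the log-ratio of the first-named sample over the second-named one, and that a single R-versus-S coefficient is what you want. The simulated `expr1` is only a stand-in for `exprs(gse)` so the snippet runs on its own.

```r
library(limma)

# Array names, taken from the GEO raw file names for GSE149940.
arrays <- c("2301R_vs_2354S", "2354S_vs_2301R", "2351R_vs_2309S", "2309S_vs_2351R",
            "2317R_vs_2314S", "2314S_vs_2317R", "2343R_vs_2284S", "2284S_vs_2343R",
            "2358R_vs_2355S", "2355S_vs_2358R", "2367R_vs_2369S", "2369S_vs_2367R")

# ASSUMPTION: each log-ratio is (first-named sample) / (second-named sample).
# +1 marks R/S arrays and -1 marks S/R arrays, so the coefficient estimates R vs S.
RvsS <- ifelse(grepl("R_vs_", arrays, fixed = TRUE), 1, -1)
design <- cbind(RvsS = RvsS)

# Placeholder for exprs(gse): simulated log10 ratios, 200 probes x 12 arrays.
set.seed(1)
expr1 <- matrix(rnorm(200 * 12, sd = 0.1), nrow = 200,
                dimnames = list(paste0("probe", 1:200), arrays))

# GEO describes the values as log10 ratios; dividing by log10(2) gives log2 ratios,
# so that logFC is reported on limma's usual log2 scale.
M <- expr1 / log10(2)

fit <- lmFit(M, design)
fit <- eBayes(fit)   # no trend=TRUE here: the matrix carries no A-values
topTable(fit, coef = "RvsS", number = 5)
```

The two arrays of each dye-swap pair use the same two samples, so they are probably not independent. The limma User's Guide covers this case under technical replication, using duplicateCorrelation with a block argument.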

Reading and normalizing the raw files is straightforward and gives full access to all limma capability:

> files
 [1] "GSM4518466_2301R_vs_2354S.txt.gz" "GSM4518467_2354S_vs_2301R.txt.gz"
 [3] "GSM4518468_2351R_vs_2309S.txt.gz" "GSM4518469_2309S_vs_2351R.txt.gz"
 [5] "GSM4518470_2317R_vs_2314S.txt.gz" "GSM4518471_2314S_vs_2317R.txt.gz"
 [7] "GSM4518472_2343R_vs_2284S.txt.gz" "GSM4518473_2284S_vs_2343R.txt.gz"
 [9] "GSM4518474_2358R_vs_2355S.txt.gz" "GSM4518475_2355S_vs_2358R.txt.gz"
[11] "GSM4518476_2367R_vs_2369S.txt.gz" "GSM4518477_2369S_vs_2367R.txt.gz"
> library(limma)
> RG <- read.maimages(files, source="agilent")
> RGb <- backgroundCorrect(RG, method="normexp")
> MA <- normalizeWithinArrays(RGb, method="loess")

Among other things, this pipeline also reads and collates the gene annotation from the raw files.
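Once an MAList exists, the A-values make the extra options available. The following is only a sketch: the RGList is simulated so that the code runs without the raw files, and the `cy5_is_R` vector (which sample sits in the Cy5 channel on each array) is a hypothetical placeholder that has to be filled in from the GEO sample records.

```r
library(limma)

set.seed(2)
n_probes <- 500
n_arrays <- 12

# Placeholder for read.maimages(): simulated foreground and background intensities.
sim <- function(mean) {
  matrix(rexp(n_probes * n_arrays, rate = 1 / mean), n_probes, n_arrays)
}
RG <- new("RGList", list(R = sim(1000) + 100, G = sim(1000) + 100,
                         Rb = sim(50), Gb = sim(50)))

# Same steps as in the pipeline above.
RGb <- backgroundCorrect(RG, method = "normexp")
MA <- normalizeWithinArrays(RGb, method = "loess")

# M is log2(Cy5/Cy3). HYPOTHETICAL: assume R is in Cy5 on odd arrays and in Cy3
# on even arrays; replace this with the real channel assignment.
cy5_is_R <- rep(c(TRUE, FALSE), times = n_arrays / 2)
design <- cbind(DyeEffect = 1, RvsS = ifelse(cy5_is_R, 1, -1))

plotMD(MA, column = 1)            # MA plot for the first array
fit <- lmFit(MA, design)
fit <- eBayes(fit, trend = TRUE)  # possible because lmFit stored the mean A-values
topTable(fit, coef = "RvsS", number = 5)
```

The DyeEffect column follows the dye-swap parametrization in the limma User's Guide: it absorbs probe-specific dye bias, while RvsS carries the comparison of interest.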