Question

Normalization of Agilent microarray data?

5.3 years ago

mathavanbioinfo ▴ 80

Hello all. I am analyzing Agilent microarray data to study an infection condition, and I used the networkanalyst.ca tool for the analysis. Normalization was done using Variance Stabilizing Normalization. The attached image was produced after normalization. Please comment on whether the data are correctly normalized.

Thank you. [image: plot after normalization]

Microarray Normalization • 4.3k views

ADD COMMENT • link updated 3.6 years ago by Ram 44k • written 5.3 years ago by mathavanbioinfo ▴ 80

We are [mostly] analysts here. Your description of your data processing steps is quite ambiguous. Please explain exactly how you processed your data, providing the code used, where applicable.

You appear to have used networkanalyst.ca - is that correct?

ADD REPLY • link 5.3 years ago by Kevin Blighe 88k

Yes, I used the networkanalyst.ca tool with the microarray Series Matrix file, which contains the gene expression values of the samples. [image]

ADD REPLY • link 5.3 years ago by mathavanbioinfo ▴ 80

This image shows the data before normalization.

ADD REPLY • link 5.3 years ago by mathavanbioinfo ▴ 80

Okay, it is just not common to use networkanalyst.ca to process microarray data.

You can also obtain normalised data by simply doing (in R):

library(Biobase)
library(GEOquery)

# load series and platform data from GEO
gset <- getGEO("GSE93861", GSEMatrix = TRUE, getGPL = FALSE)
# if the series contains more than one platform, select the GPL6480 entry
if (length(gset) > 1) idx <- grep("GPL6480", attr(gset, "names")) else idx <- 1
gset <- gset[[idx]]

# set parameters and draw the plot
dev.new(width = 4 + dim(gset)[[2]] / 5, height = 6)
par(mar = c(2 + round(max(nchar(sampleNames(gset))) / 2), 4, 2, 1))
title <- paste("GSE93861", "/", annotation(gset), " selected samples", sep = "")
boxplot(exprs(gset), boxwex = 0.7, notch = TRUE, main = title, outline = FALSE, las = 2)

ADD REPLY • link 5.3 years ago by Kevin Blighe 88k

I have done it. Can you please share the code for DEG analysis?

ADD REPLY • link 5.3 years ago by mathavanbioinfo ▴ 80

Are you asking Kevin for exact working code that will perform differential expression analysis on your data? Please do not expect volunteers here to do your work for you.

ADD REPLY • link 5.3 years ago by Ram 44k

Thank you, I will do it

ADD REPLY • link 5.3 years ago by mathavanbioinfo ▴ 80

How do I solve this?

sigmatrix <- project.NormData$E[probeset.list,]
Error in project.NormData$E[probeset.list, ] : 
  invalid subscript type 'list'

ADD REPLY • link updated 3.7 years ago by Ram 44k • written 5.3 years ago by mathavanbioinfo ▴ 80

probeset.list should be a character or integer vector, not a list object. If character, it should contain rownames of your object being subset; if integer, these should correspond to row indices of your object being subset.

Please take some time to review subsetting in R.
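As a minimal sketch of the point above, the toy example below flattens a list of probe IDs into a character vector before using it as a row index. The matrix `E`, the probe IDs, and the sample names are made up for illustration; they are not the poster's data.

```r
# Toy expression matrix with probe IDs as rownames (hypothetical IDs)
E <- matrix(1:12, nrow = 4,
            dimnames = list(c("A_23_P1", "A_23_P2", "A_23_P3", "A_23_P4"),
                            c("s1", "s2", "s3")))

# A list cannot be used as a row index:
# E[probeset.list, ] fails with "invalid subscript type 'list'"
probeset.list <- list("A_23_P2", "A_23_P4")

# Flatten the list into a plain character vector
probe.ids <- unlist(probeset.list)
class(probe.ids)

# Keep only the IDs that are really present among the rownames,
# since unknown IDs raise "subscript out of bounds"
probe.ids <- probe.ids[probe.ids %in% rownames(E)]

# drop = FALSE keeps the result a matrix even if one row is selected
sigmatrix <- E[probe.ids, , drop = FALSE]
sigmatrix
```

Checking `class()` before and after `unlist()` is a quick way to confirm that the index has the expected type.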

ADD REPLY • link 5.3 years ago by Kevin Blighe 88k

I tried, but I am not getting anything. Can you please tell me how to convert it into a character or integer vector?

ADD REPLY • link 3.7 years ago by smrutimayipanda ▴ 20

I tried

Did you google "convert list to vector"?

ADD REPLY • link 3.7 years ago by Ram 44k

Yes, I used unlist() for that, but it still shows an error.

ADD REPLY • link 3.7 years ago by smrutimayipanda ▴ 20

Surely, it is not the same error. Can you show how you used your unlist(), as well as the class() of the objects without and with the unlist operation?

ADD REPLY • link 3.6 years ago by Ram 44k

x <- unlist(probeset.list)
sigmatrix <- project.bgcorrect.norm$E[x,]

This gives me an error:

Error in project.bgcorrect.norm$E[x, ] : subscript out of bounds

class(x)
[1] "character"

ADD REPLY • link updated 3.6 years ago by Ram 44k • written 3.6 years ago by smrutimayipanda ▴ 20

What are the first few values of x? What are the first few rownames of project.bgcorrect.norm$E?

ADD REPLY • link 3.6 years ago by Kevin Blighe 88k

  /media/mdrcubuntu/46B85615B8560439/microarray_text_files/SG18178614_257236333019_S001_GE1_1200_Jun14_1_2
[1,]                                                                                                14.741186
[2,]                                                                                                 5.824586
[3,]                                                                                                 5.675543
[4,]                                                                                                 6.442700
[5,]                                                                                                 6.339589
[6,]                                                                                                 6.392602
[7,]                                                                                                 8.494547
[8,]                                                                                                11.104732
[9,]                                                                                                 6.022043
[10,]                                                                                                 6.012297
[11,]                                                                                                 6.240512
[12,]                                                                                                 6.167124

ADD REPLY • link updated 3.6 years ago by Ram 44k • written 3.6 years ago by smrutimayipanda ▴ 20

Well, that does not look good. With these two lines of code, we are trying to subset the expression matrix with a character vector of probe IDs. For this to work, the rownames of the expression matrix have to already be set to the probe IDs. Your rows print as [1,], [2,], and so on, which means the matrix has no rownames. You may have lost these IDs from the expression matrix in a previous step.
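A hedged sketch of how the rownames could be checked and restored is shown below. It uses a toy list with an `E` matrix and a `genes` data frame; the `genes$ProbeName` column is an assumption about how the probe annotation is stored alongside the matrix, so inspect your own object before relying on it.

```r
# Toy object mimicking an E / genes layout (assumed, not the poster's real data)
obj <- list(
  E = matrix(c(14.7, 5.8, 5.7, 6.4), ncol = 1,
             dimnames = list(NULL, "sample1")),
  genes = data.frame(ProbeName = c("A_23_P1", "A_23_P2", "A_23_P3", "A_23_P4"),
                     stringsAsFactors = FALSE)
)

# No rownames are set, so rows print as [1,], [2,], ...
is.null(rownames(obj$E))

# Restore probe IDs as rownames from the annotation kept next to the matrix
rownames(obj$E) <- obj$genes$ProbeName

x <- c("A_23_P2", "A_23_P4")

# IDs absent from the rownames are the ones that cause "subscript out of bounds"
missing.ids <- setdiff(x, rownames(obj$E))

# Subset using only the IDs that exist
sigmatrix <- obj$E[intersect(x, rownames(obj$E)), , drop = FALSE]
sigmatrix
```

If `missing.ids` is not empty in a real analysis, the probe list and the matrix probably come from different annotation versions or processing steps.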

ADD REPLY • link 3.6 years ago by Kevin Blighe 88k

Sorry, Kevin.

I followed your answer here and ended up with an object named gset of class ExpressionSet, but I don't know how to get the normalized matrix for `` GEO accession with this code.

Actually, the ultimate goal is a Kaplan-Meier survival plot for this list of genes in this data set:

MAGEA4
SPP1
ALB
CCNE1
CXCL8
IL1R2
IL1B
IL11
OSM
GAL
CCL21
CCL4
LAMC2
CCL20
AQP9
ISG15
IL6
FOS
CXCL10
BIRC3
NR4A3
IFIT2
CLCF1
CYP1B1
GZMB
OR10J3
C5AR1
CCL2
ATG12
EGR1
TNFAIP3
CTSL
CD40
SLC7A7
ETV7
LTB
IL2RB
CEBPB
ITGB2
ZNF502
POLD4
DDX10

ADD REPLY • link 5.2 years ago by zizigolu ★ 4.3k

Hey, you can obtain the data like this:

library(Biobase)
library(GEOquery)

gset <- getGEO("GSE19417", GSEMatrix = TRUE, getGPL = FALSE)
if (length(gset) > 1) idx <- grep("GPL4372", attr(gset, "names")) else idx <- 1
gset <- gset[[idx]]
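Once an ExpressionSet is in hand, the expression matrix can be pulled out with `exprs()`. The sketch below builds a small toy ExpressionSet (`gset_toy`, with made-up probe and sample names) instead of downloading from GEO, so it runs offline; with real data you would call `exprs(gset)` on the object created above.

```r
library(Biobase)

# Toy stand-in for the object returned by getGEO(...)[[idx]]
set.seed(1)
mat <- matrix(rnorm(12, mean = 8), nrow = 4,
              dimnames = list(paste0("probe", 1:4), paste0("GSM", 1:3)))
gset_toy <- ExpressionSet(assayData = mat)

# Numeric matrix: probes in rows, samples in columns
expr <- exprs(gset_toy)
dim(expr)
head(rownames(expr))

# Per-sample quartiles; columns on a broadly similar scale are one rough
# hint (not proof) that the submitter deposited normalised values
apply(expr, 2, quantile, probs = c(0.25, 0.5, 0.75))
```

A boxplot of `expr` gives the same information visually.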

ADD REPLY • link 5.2 years ago by Kevin Blighe 88k

Thank you, your code gave me a matrix.

Is this matrix normalized now? I am not sure which accessions are in the rows :( and how I could get gene symbols instead.

ADD REPLY • link 5.2 years ago by zizigolu ★ 4.3k

Yes, it seems to be normalised.

They are Rosetta IDs - you can obtain an ID mapping here: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL4372 (download the 'Annotation SOFT table...')
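For illustration, here is a minimal base-R sketch of applying such an ID mapping. The expression matrix and annotation table are toy data, and the column names `ID` and `GeneSymbol` are assumptions; check the header of the downloaded table for the real column names.

```r
# Toy expression matrix; row IDs are hypothetical placeholders
expr <- matrix(c(1.2, 0.4, -0.3, 2.1, 0.9, 1.5), nrow = 3,
               dimnames = list(c("id1", "id2", "id3"), c("GSM_a", "GSM_b")))

# Toy annotation table; column names "ID" and "GeneSymbol" are assumptions
annot <- data.frame(ID = c("id3", "id1", "id2"),
                    GeneSymbol = c("IL6", "SPP1", ""),
                    stringsAsFactors = FALSE)

# Look up the symbol for each row of the matrix, in matrix row order
symbols <- annot$GeneSymbol[match(rownames(expr), annot$ID)]

# Drop rows with no symbol, then relabel the remaining rows
keep <- !is.na(symbols) & symbols != ""
expr.sym <- expr[keep, , drop = FALSE]
rownames(expr.sym) <- symbols[keep]
expr.sym
```

With real data, several IDs can map to the same symbol, so you may need to decide how to handle duplicates (for example, keeping one row per gene) before downstream analysis.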

ADD REPLY • link 5.2 years ago by Kevin Blighe 88k