I am trying to obtain normalized expression values for our Affymetrix microarray data using Bioconductor. The script I have is mostly there; however, I am irked that the header row describes the samples by the file names of the CEL files rather than by the sample names designated in Targets.txt.
Here is an excerpt of my Targets.txt file:
# file for use by limma and affylmGUI. the targets are per-condition and per-time-point.
Name FileName Target
feh.rep1 FH1.CEL feh
feh.rep2 FH2.CEL feh
feh.rep3 FH3.CEL feh
...
Here is the R script:
library(affy)
library(gcrma)
# phenotype data
pd <- read.AnnotatedDataFrame("Targets.txt", header=T)
affy.data <- ReadAffy(filenames=pd$FileName, phenoData=pd)
# gene expression data, normalized by GCRMA
eset <- gcrma(affy.data)
write.exprs(eset, "cs_hm_feh-expression-gcrma-2011-01-12.tsv", sep="\t", row)
I have tried using
affy.data <- ReadAffy(filenames=pd$FileName, phenoData=pd, sampleNames=row.names(pd))
however, that was unsuccessful. Further investigation shows that row.names isn't actually returning anything at all:
> row.names(pd)
NULL
which I find perplexing, given that the object shows it has row names, which are exactly what I want (and expected) as my sample labels in the final CSV table:
> pd
An object of class "AnnotatedDataFrame"
rowNames: feh.rep1, feh.rep2, ..., cs.8d.rep3 (27 total)
varLabels and varMetadata description:
FileName:
Target:
Any help is appreciated, as I wield R fairly ignorantly and cannot figure my way through this seemingly simple task.
Thanks, Brad! Excellent suggestion reading the docs. I was reading the wrong ones (for ReadAffy and read.AnnotatedDataFrame). Worked perfectly using sampleNames=sampleNames(pd). I'm still surprised I had to manually specify this, but I probably made the wrong assumptions and/or abused the method calls.
Chris, agreed that it seems like it should work without manually specifying it; that's what made me so unsure I was answering your question correctly. Glad that it worked.
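For anyone landing here later, a minimal sketch of the accessor that ended up working. It builds the AnnotatedDataFrame in memory from the three rows shown in the Targets.txt excerpt rather than reading the file; that substitution is an assumption made so the snippet runs without the CEL files. It needs only the Biobase package, which affy depends on.

```r
library(Biobase)

# Stand-in for Targets.txt (assumption: only the three rows from the excerpt).
# The Name column is used as the row names, as read.AnnotatedDataFrame does
# by default when it reads the file.
targets <- data.frame(
  FileName = c("FH1.CEL", "FH2.CEL", "FH3.CEL"),
  Target   = c("feh", "feh", "feh"),
  row.names = c("feh.rep1", "feh.rep2", "feh.rep3"),
  stringsAsFactors = FALSE
)
pd <- AnnotatedDataFrame(targets)

# sampleNames() is the Biobase accessor for the sample labels
# (the "rowNames" line shown when pd is printed).
sampleNames(pd)

# The same labels are the row names of the underlying data frame.
rownames(pData(pd))

# The file names are still available as an ordinary column.
pd$FileName
```

With the real data, the call reported as working in the comments above is ReadAffy(filenames=pd$FileName, phenoData=pd, sampleNames=sampleNames(pd)).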