Hey,
Download the raw CEL files and process them with the oligo and limma packages. I do not believe you need any files other than the CEL files.
Here is how I processed similar arrays (your CEL files should be in a directory called SampleFiles/):
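# biocLite() has been retired; current Bioconductor releases install packages via BiocManager (run once)
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install(c("limma", "oligo"))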
require("limma")
require("oligo")
options(scipen=999)
targetinfo <- readTargets("Targets.txt", sep="\t")
CELFiles <- list.celfiles("SampleFiles/", full.names = TRUE)
project <- read.celfiles(CELFiles)
project.bgcorrect.norm.avg <- rma(project, background=TRUE, normalize=TRUE, target="core")
project.bgcorrect.norm.avg.Exons <- rma(project, background=TRUE, normalize=TRUE, target="probeset")
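# pdf() fails if the output directory does not exist
dir.create("Output", showWarnings = FALSE)
pdf("Output/ChipImageQC.pdf")
image(project)
dev.off()
# Labels for the box plots; taken from the CEL file names here, but a column of targetinfo would also work
samplenames <- sampleNames(project)
pdf("Output/BoxPlotQC.pdf")
par(mar=c(5,5,5,5), cex=1, cex.axis=0.8, mfrow=c(2,1))
boxplot(project, which="all", transfo=log2, main="Raw chip fluorescent intensities", names=samplenames, las=2)
# rma() already returns log2 values, so no further transformation is applied
boxplot(project.bgcorrect.norm.avg, transfo=identity, main="Background-corrected, RMA normalised, log2 expression values\nAll probes", names=samplenames, las=2)
dev.off()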
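# exprs() extracts the probes-by-samples expression matrix from the ExpressionSet returned by rma()
write.table(exprs(project.bgcorrect.norm.avg), "NormalisedCounts.GeneSummarised.tsv", sep="\t", quote=FALSE)
write.table(exprs(project.bgcorrect.norm.avg.Exons), "NormalisedCounts.ExonSummarised.tsv", sep="\t", quote=FALSE)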
For annotation, see the working example here: A: Affymetrix Human Genome U133 Plus 2.0 Array
Also see the biomaRt vignette (section 4.1).
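
As a rough sketch of what the annotation step could look like with biomaRt: the two helper functions below (fetch_annotation and add_annotation) are hypothetical names of my own, and the dataset and attribute names are assumptions for a human U133 Plus 2.0 array. Check biomaRt::listAttributes() for the attribute that matches your own array before relying on it.

```r
# Sketch only: "hsapiens_gene_ensembl" and "affy_hg_u133_plus_2" are
# assumptions; list the valid names with biomaRt::listDatasets() and
# biomaRt::listAttributes() for your own organism and array.
fetch_annotation <- function(probe_ids,
                             array_attr = "affy_hg_u133_plus_2",
                             dataset = "hsapiens_gene_ensembl") {
  # Needs an internet connection to reach the Ensembl BioMart server
  mart <- biomaRt::useMart("ensembl", dataset = dataset)
  biomaRt::getBM(
    attributes = c(array_attr, "hgnc_symbol", "ensembl_gene_id"),
    filters    = array_attr,
    values     = probe_ids,
    mart       = mart
  )
}

# Attach annotation columns to a probes-by-samples expression matrix.
# Probes without annotation are kept (all.x = TRUE); a probe that maps
# to several genes yields several rows.
add_annotation <- function(expr, annot, id_col) {
  expr_df <- data.frame(probe_id = rownames(expr), expr,
                        check.names = FALSE, stringsAsFactors = FALSE)
  merge(expr_df, annot, by.x = "probe_id", by.y = id_col, all.x = TRUE)
}

# Possible usage with the objects created above (not run here):
# expr      <- exprs(project.bgcorrect.norm.avg)
# annot     <- fetch_annotation(rownames(expr))
# annotated <- add_annotation(expr, annot, "affy_hg_u133_plus_2")
```

The lookup and the merge are kept separate so that the annotation table can be fetched once, saved, and reused for both the gene-level and exon-level matrices.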