Extracting a limma EList into a data frame
6.5 years ago

Hi everyone,

I downloaded this dataset from ArrayExpress and followed the limma User's Guide. I could not find an annotation package for this array, only a file with the probe names and Entrez Gene IDs. I want to extract from the EList a data frame with the probe names and the expression values, which would be much more convenient for me. How can I do this?

library(limma)

URL <- "https://www.ebi.ac.uk/arrayexpress/files/E-MTAB-1781/"
SDRF.file <- "E-MTAB-1781.sdrf.txt"
Data.file <- "E-MTAB-1781.raw.1.zip"
download.file(paste0(URL, SDRF.file), SDRF.file)
# mode="wb" keeps the zip archive intact when downloading on Windows
download.file(paste0(URL, Data.file), Data.file, mode="wb")
unzip(Data.file)
SDRF <- read.delim(SDRF.file, check.names=FALSE, stringsAsFactors=FALSE)
x <- read.maimages(SDRF[,"Array Data File"], source="agilent", green.only=TRUE, other.columns="gIsWellAboveBG")

y <- backgroundCorrect(x, method="normexp")
y <- normalizeBetweenArrays(y, method="quantile")
Control <- y$genes$ControlType==1L
IsExpr <- rowSums(y$other$gIsWellAboveBG > 0) >= 4
yfilt <- y[!Control & IsExpr, ]
names(yfilt$genes)

Thank you in advance for your help,

Salvo

R Limma Microarray • 2.2k views
6.5 years ago

Dear Salvo, good evening, here is the answer:

To see what is inside any object in R, two useful functions are str() and summary():

summary(yfilt)
        Length Class      Mode     
E       191555 -none-     numeric  
targets      1 data.frame list     
genes        5 data.frame list     
source       1 -none-     character
other        1 -none-     list     


yfilt$E[1:5,1:5]
     US22502540_252038210041_1_1 US22502540_252038210165_1_4
[1,]                    8.637449                    8.700608
[2,]                   10.442242                   10.184019
[3,]                    8.696243                    8.865778
[4,]                    9.774359                   10.186498
[5,]                    8.965035                    9.074906
     US22502540_252038210040_1_3 US22502540_252038210040_1_4
[1,]                    8.705305                    8.646266
[2,]                   11.342835                   10.393071
[3,]                    8.639361                    8.596739
[4,]                   10.002712                    9.921155
[5,]                    9.069770                    9.071458
     US22502540_252038210087_1_4
[1,]                    8.605702
[2,]                   10.916877
[3,]                    8.742966
[4,]                    9.978770
[5,]                    8.711117


yfilt$genes[1:5,]
   Row Col ControlType         ProbeName    SystematicName
12   1  12           0 UKv4_A_23_P314216 UKv4_A_23_P314216
13   1  13           0 UKv4_A_24_P126851 UKv4_A_24_P126851
14   1  14           0  UKv4_A_32_P77762  UKv4_A_32_P77762
15   1  15           0  UKv4_A_23_P71864  UKv4_A_23_P71864
16   1  16           0  UKv4_A_32_P48198  UKv4_A_32_P48198

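So, the E component contains the normalised expression values (one row per probe, one column per array), and the genes component contains the probe annotation for those same rows, including ProbeName. The two have the same number of rows: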

nrow(yfilt$genes)
[1] 38311

nrow(yfilt$E)
[1] 38311
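Since E and genes have the same number of rows in the same order, one way to get the data frame asked for is to bind ProbeName to the columns of E. The sketch below is only an illustration: it uses a small plain list as a stand-in for yfilt (three probes and two arrays, with values copied from the output above) so that it runs without the downloaded data. The name expr_df is my own choice, not something limma provides.

```r
# Toy stand-in for yfilt: a plain list with the same E and genes components
# as the EList shown above (values copied from the first rows of the output).
yfilt <- list(
  E = matrix(
    c(8.637449, 10.442242, 8.696243,
      8.700608, 10.184019, 8.865778),
    nrow = 3,
    dimnames = list(NULL, c("US22502540_252038210041_1_1",
                            "US22502540_252038210165_1_4"))
  ),
  genes = data.frame(
    ProbeName = c("UKv4_A_23_P314216", "UKv4_A_24_P126851", "UKv4_A_32_P77762"),
    stringsAsFactors = FALSE
  )
)

# Rows of E and genes must line up before they are bound column-wise.
stopifnot(nrow(yfilt$E) == nrow(yfilt$genes))

# One column of probe names followed by one expression column per array.
# check.names = FALSE keeps the array names exactly as they appear in E.
expr_df <- data.frame(
  ProbeName = yfilt$genes$ProbeName,
  yfilt$E,
  check.names = FALSE,
  stringsAsFactors = FALSE
)

head(expr_df)
```

The same data.frame() call should work on the real yfilt, because its components are accessed with $ in the same way. Note that ProbeName is not necessarily unique on this kind of array, so it is safer to keep it as an ordinary column than to use it as row names.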

See you later,

Kevin


Thank you very much for your help,

See you soon,

Salvo
