How to subset ExpressionSet based on vector of sample names
5.5 years ago
nkabo ▴ 80

I have an ExpressionSet object with 37 samples covering HCC and cirrhosis, and I would like to subset it by the sample names I have specified in two vectors. The result should be two expression sets, one for HCC and the other for cirrhosis. I have tried several methods to subset the ExpressionSet, but I could not keep the samples and the features at the same time.

expset_forall is the ExpressionSet, and names_HCC and names_cirr are character vectors containing the sample names.

This is expset_forall:

ExpressionSet (storageMode: lockedEnvironment)

assayData: 20962 features, 37 samples 

protocolData
  sampleNames: GSM437457.CEL.gz GSM437458.CEL.gz ... GSM437493.CEL.gz (37 total)...

I have tried:

eset_forHCC = expset_forall[, sampleNames(expset_forall) %in% names_HCC]

This gives the error "incorrect number of dimensions".

Then I tried:

eset_forHCC = exprs(expset_forall[expset_forall@protocolData$sampleNames == names_HCC, ])
dim(eset_forHCC)
[1]  0 37

Finally, I tried to reach the sample names through pData:

levels(pData(expset_forall)$sampleNames)

This returns NULL.

For eset_forHCC, I expect this output:

ExpressionSet (storageMode: lockedEnvironment)

assayData: 20962 features, 17 samples

element names: exprs, se.exprs

protocolData

sampleNames: GSM437458.CEL.gz GSM437459.CEL.gz ... GSM437493.CEL.gz (17 total)

For eset_forcirr, I expect this output:

ExpressionSet (storageMode: lockedEnvironment)

assayData: 20962 features, 17 samples

element names: exprs, se.exprs

protocolData

sampleNames: GSM437460.CEL.gz GSM437459.CEL.gz ... GSM437491.CEL.gz (17 total)
R ExpressionSet subset Bioconductor • 2.4k views

Once you have the ExpressionSet as a data frame, you can subset it with subset() from base R or filter() from the dplyr package.


Does:

eset_forHCC= exprs(expset_forall)[,sampleNames(expset_forall) %in% names_HCC]

not work?


Thank you for your reply. It works, but it returns a matrix, and I need an ExpressionSet.

eset_forHCC = expset_forall[, sampleNames(expset_forall) %in% names_HCC]

This is the correct and recommended way to get the subset. Could you recheck it?

Also, check that the output of sampleNames(expset_forall) %in% names_HCC is as intended.
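
To make that check concrete, here is a minimal sketch on toy data. The matrix values, the shortened sample names, and the deliberately mismatched entry in names_HCC are all made up for illustration; only expset_forall, names_HCC, and eset_forHCC are names from the question. It assumes Biobase is installed.

```r
library(Biobase)

# Toy stand-in for expset_forall: 5 features x 4 samples (made-up values).
set.seed(1)
mat <- matrix(rnorm(20), nrow = 5,
              dimnames = list(paste0("gene", 1:5),
                              paste0("GSM", 1:4, ".CEL.gz")))
expset_forall <- ExpressionSet(assayData = mat)

# Hypothetical name vector; the last entry does not exist in the object.
names_HCC <- c("GSM2.CEL.gz", "GSM4.CEL.gz", "GSM9.CEL.gz")

# Inspect the logical index before using it.
keep <- sampleNames(expset_forall) %in% names_HCC
table(keep)

# %in% silently ignores names that do not match, so list them explicitly.
missing_names <- setdiff(names_HCC, sampleNames(expset_forall))
missing_names

# Subset the ExpressionSet itself (not exprs() of it) to keep the class.
eset_forHCC <- expset_forall[, keep]
class(eset_forHCC)
dim(eset_forHCC)
```

If missing_names is not empty on the real data, the names in names_HCC probably differ from sampleNames(expset_forall) in some small way, for example a missing ".CEL.gz" suffix.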


Thank you for your reply. I checked it again and it works fine, but it returns an expression matrix; I would like to have it as an ExpressionSet.
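
One possible explanation, offered as a guess since the exact call is not shown: the subset may still be wrapped in exprs(), which always returns a plain matrix. The sketch below contrasts the two forms on toy data. The object name eset, the vector wanted, and all values are hypothetical, and Biobase is assumed to be installed.

```r
library(Biobase)

# Toy ExpressionSet: 3 features x 4 samples (made-up values).
mat <- matrix(as.numeric(1:12), nrow = 3,
              dimnames = list(paste0("gene", 1:3),
                              paste0("GSM", 1:4, ".CEL.gz")))
eset <- ExpressionSet(assayData = mat)

# Hypothetical vector of sample names to keep.
wanted <- c("GSM1.CEL.gz", "GSM3.CEL.gz")
keep <- sampleNames(eset) %in% wanted

# Extracting exprs() first and then indexing yields a plain matrix.
as_matrix <- exprs(eset)[, keep]

# Indexing the ExpressionSet directly keeps the ExpressionSet class.
as_eset <- eset[, keep]

is.matrix(as_matrix)
class(as_eset)

# The expression values are the same either way; only the container differs.
identical(exprs(as_eset), as_matrix)
```

If the call really is expset_forall[, sampleNames(expset_forall) %in% names_HCC] with no exprs() around it, checking class(expset_forall) would show whether the starting object is an ExpressionSet at all.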
