How to select columns from the dataframe based on variables from another dataframe
7.9 years ago
akhattri ▴ 50

I have a dataframe (df1) in which each column holds the microarray data of one sample. I want to select columns from it based on the sample names listed in another dataframe (df2). I tried to do this with the select function of dplyr, but it does not work.

df1

> head(df1)
probeID     WT.GS1   WT.GS2   WT.GS3   KO.GS1   KO.GS2   KO.GS3
100001_at     11.5      5.6     69.1     15.7     36.0     42.0
100002_at     20.5     32.4     93.3     31.8     14.4     22.9
100003_at     72.4     89.0     79.2     80.5    130.1     86.7
......

df2

> head(df2)
  Header SampleType
1 KO.GS1       AR.R
2 KO.GS2       AR.R
3 WT.GS1       AR.R
4 WT.GS2      BL.PD
5 WT.GS3      BL.PD

Let's say I want to select the columns of df1 whose names match the values in df2$Header. I tried:

>df1 %>% select(dput(as.character(df2$Header)))
R dplyr • 23k views

Have a look at the examples section of the documentation of the select function; it shows how to use one_of to solve your problem. By the way, you may be interested in the SummarizedExperiment objects from Bioconductor, which let you bind together the data (df1 in your example) and the metadata (df2 in your example).
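For readers on a recent dplyr: one_of is superseded there by all_of and any_of, so the same selection might look like the sketch below. It assumes dplyr 1.0.0 or later and rebuilds df1 and df2 from the values shown in the question; the names wanted, strict, and lenient, and the column "not.a.sample", are made up for illustration.

```r
library(dplyr)

# Toy versions of df1 and df2, using the values shown in the question
df1 <- data.frame(
  probeID = c("100001_at", "100002_at", "100003_at"),
  WT.GS1 = c(11.5, 20.5, 72.4),
  WT.GS2 = c(5.6, 32.4, 89.0),
  WT.GS3 = c(69.1, 93.3, 79.2),
  KO.GS1 = c(15.7, 31.8, 80.5),
  KO.GS2 = c(36.0, 14.4, 130.1),
  KO.GS3 = c(42.0, 22.9, 86.7)
)
df2 <- data.frame(
  Header = c("KO.GS1", "KO.GS2", "WT.GS1", "WT.GS2", "WT.GS3"),
  SampleType = c("AR.R", "AR.R", "AR.R", "BL.PD", "BL.PD")
)

# Column names to keep, as a plain character vector
wanted <- as.character(df2$Header)

# all_of() is strict: it errors if any name in `wanted` is not a column of df1
strict <- df1 %>% select(probeID, all_of(wanted))

# any_of() is lenient: names that are absent from df1 are skipped
lenient <- df1 %>% select(any_of(c(wanted, "not.a.sample")))
```

all_of is probably the safer default here, since a sample name in df2 that is missing from df1 usually points to a typo or a mismatched sample sheet.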

7.9 years ago
akhattri ▴ 50

Thanks!! I am looking into SummarizedExperiment. Using dplyr, both of the following worked:

>df1 %>% select(one_of(dput(as.character(df2$Header))))

or

>list.df2 <- dput(as.character(df2$Header))
>df1 %>% select(one_of(list.df2))
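A side note on the calls above: dput only prints its argument and returns it invisibly, so it can be dropped without changing the result. As a rough sketch, the same selection also works in base R with a plain character vector. The trimmed-down df1 and df2 and the names cols, selected, ar_cols, and ar_only below are illustrative, not from the original data.

```r
# Trimmed-down toy data based on the values shown in the question
df1 <- data.frame(
  probeID = c("100001_at", "100002_at", "100003_at"),
  WT.GS1 = c(11.5, 20.5, 72.4),
  KO.GS1 = c(15.7, 31.8, 80.5),
  KO.GS3 = c(42.0, 22.9, 86.7)
)
df2 <- data.frame(
  Header = c("KO.GS1", "WT.GS1", "WT.GS2"),
  SampleType = c("AR.R", "AR.R", "BL.PD")
)

# Keep only the names that actually exist in df1 (WT.GS2 is absent here)
cols <- intersect(as.character(df2$Header), colnames(df1))

# drop = FALSE keeps the result a data frame even if one column is selected
selected <- df1[, cols, drop = FALSE]

# Restrict to one sample type, keeping the probe identifiers alongside
ar_cols <- as.character(df2$Header[df2$SampleType == "AR.R"])
ar_only <- df1[, c("probeID", ar_cols), drop = FALSE]
```

Filtering df2 first, as in the last two lines, is one way to pull out only the samples of a given SampleType.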
