[R] - subset two dataframes
6.9 years ago
Spacebio ▴ 200

I have two columns inside of a dataframe called df1 looking like this:

  V1        V2
GENE A     GENE E
GENE B     GENE D
GENE C     GENE A
GENE D     GENE B
GENE E     GENE C

and another dataframe called df2 like this:

Name       ID     Symbol
GENE A    1254    AKT
GENE B    1879    POU5F1
GENE C    5689    EGR1
GENE D    2385    JUN
GENE E    5687    MYC

The output I would like to have is the following:

NameSource       SourceID        NameTarget       TargetID
AKT                1254             MYC             5687
POU5F1             1879             JUN             2385
EGR1               5689             AKT             1254
JUN                2385             POU5F1          1879
MYC                5687             EGR1            5689

I tried with the following syntax:

genes <- df1[which(df1$V1, df2$Symbol), ]

and with:

genes <- df1$V1 %in% df2$Symbol

But for some reason I cannot get the output I am expecting. Any ideas?

6.9 years ago
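# Keep strings as character vectors rather than factors
options(stringsAsFactors = FALSE)
# Read the two tab-separated input files
df1 <- read.csv("df1", header = TRUE, sep = "\t")
df2 <- read.csv("df2", header = TRUE, sep = "\t")
# For each gene name in V1 (source) and V2 (target), pull the matching row of df2
# and keep its Symbol and ID columns (columns 3 and 2)
df3 <- as.data.frame(t(sapply(df1$V1, function(x) df2[grepl(x, df2$Name, fixed = TRUE), ])))[, c(3, 2)]
df4 <- as.data.frame(t(sapply(df1$V2, function(x) df2[grepl(x, df2$Name, fixed = TRUE), ])))[, c(3, 2)]
# Put source and target side by side and label the columns
df5 <- cbind(df3, df4)
colnames(df5) <- c("NameSource", "SourceID", "NameTarget", "TargetID")
row.names(df5) <- seq_len(nrow(df5))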
> df5
  NameSource SourceID NameTarget TargetID
1        AKT     1254        MYC     5687
2     POU5F1     1879        JUN     2385
3       EGR1     5689        AKT     1254
4        JUN     2385     POU5F1     1879
5        MYC     5687       EGR1     5689
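
For reference, the same lookup can also be sketched with base R's match(), which returns the row of df2 for each gene name and only accepts exact matches (grepl also accepts partial ones, so "GENE A" would hit "GENE AB"). This is a minimal sketch, not the code from the answer above: the data frames are retyped from the question, and the names src, tgt and result are made up for illustration.

```r
# Toy data retyped from the question (assumption: the real files look like this)
df1 <- data.frame(
  V1 = c("GENE A", "GENE B", "GENE C", "GENE D", "GENE E"),
  V2 = c("GENE E", "GENE D", "GENE A", "GENE B", "GENE C"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Name   = c("GENE A", "GENE B", "GENE C", "GENE D", "GENE E"),
  ID     = c(1254, 1879, 5689, 2385, 5687),
  Symbol = c("AKT", "POU5F1", "EGR1", "JUN", "MYC"),
  stringsAsFactors = FALSE
)

# Row positions in df2 for every source and target gene name (exact match)
src <- match(df1$V1, df2$Name)
tgt <- match(df1$V2, df2$Name)

# Assemble the requested table; names missing from df2 would show up as NA
result <- data.frame(
  NameSource = df2$Symbol[src],
  SourceID   = df2$ID[src],
  NameTarget = df2$Symbol[tgt],
  TargetID   = df2$ID[tgt],
  stringsAsFactors = FALSE
)
print(result)
```

Because match() keeps the order of df1, no re-sorting is needed afterwards.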

Exactly like that! Thank you so much!

Please post a link to the SO solution here as well, for future reference.

6.9 years ago
Ram 44k

This is pure R, not much to do with bioinformatics. Check out merge or dplyr's join functions.
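
A rough sketch of the merge route, assuming the data look exactly as in the question: merge df1 against df2 twice, once on V1 and once on V2. The helper column ord and the names m and result are made up here; ord is needed because merge() sorts on the key and so does not keep the row order of df1.

```r
# Toy data retyped from the question (assumption: the real files look like this)
df1 <- data.frame(
  V1 = c("GENE A", "GENE B", "GENE C", "GENE D", "GENE E"),
  V2 = c("GENE E", "GENE D", "GENE A", "GENE B", "GENE C"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Name   = c("GENE A", "GENE B", "GENE C", "GENE D", "GENE E"),
  ID     = c(1254, 1879, 5689, 2385, 5687),
  Symbol = c("AKT", "POU5F1", "EGR1", "JUN", "MYC"),
  stringsAsFactors = FALSE
)

# Remember the original row order of df1
df1$ord <- seq_len(nrow(df1))

# First merge: annotate the source gene (V1)
m <- merge(df1, df2, by.x = "V1", by.y = "Name")

# Second merge: annotate the target gene (V2); the clashing ID/Symbol columns
# get the suffixes "Source" (from the first merge) and "Target" (from df2)
m <- merge(m, df2, by.x = "V2", by.y = "Name", suffixes = c("Source", "Target"))

# Restore the df1 order and keep only the requested columns
m <- m[order(m$ord), ]
result <- data.frame(
  NameSource = m$SymbolSource,
  SourceID   = m$IDSource,
  NameTarget = m$SymbolTarget,
  TargetID   = m$IDTarget,
  stringsAsFactors = FALSE
)
print(result)
```

By default merge() does an inner join, so rows of df1 whose gene name is absent from df2 are silently dropped; all.x = TRUE would keep them with NA values.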

I did - not working.

Generally, "not working" needs to be accompanied by the code that does not work, so others can help diagnose.

You did what? How do you figure it did not work?

Because I am not obtaining the output I wanted. I asked on Stack Overflow and now the issue is solved. Closing this issue. Thanks for your tips.

Then post the solution you received here so this thread is not left hanging. It would help someone else out in the future as well.