Hi guys,
I am having trouble figuring out how to do this in R. I have two data frames with genotyping data, df1 and df2. The tables are really big, with hundreds of samples, so I have included only a small subset of the data below. I have also included the pieces of code I tried, which build a key from the columns of each data frame so that the two can be matched against each other.
df1:
Gene MAPK1 MAPK1 MAPK1 MAPK2 MAPK2 MAPK2
TYPE At GT AD At GT AD
Sample R23C R23C R23C T34Y T34Y T34Y
1 A G
2 A G
3 A G
rownames(df1)[c(1,2,3)] <- c("Gene","Type","Sample")
key.df1 <- paste(unlist(df1["Gene",]), unlist(df1["Sample",]), sep=":")
Now, we have df2:
df2:
Genes Names MAPK1 MAPK1 MAPK2 MAPK3 MAPK4 MAPK4
Protein Mutation R23C R33Y T34Y R45C T44S S33D
1.GT 0/0 0/0 0/0 0/0 0/0 0/0
1.AD 34,2 23,4 33,33 33,2 44,44 34,0
2.GT 0/1 0/1 0/1 0/0 0/1 0/1
2.AD 22,3 33,2 44,22 34,22 34,3 91,91
3.GT 1/1 1/1 1/1 1/1 1/1 1/1
3.AD 33,2 3,2 112,0 22,3 34,0 33,2
key.df2 <- paste(unlist(df2["Gene Names",]), unlist(df2["Protein Mutation",]), sep=":")
I would like to match these two keys (key.df1 and key.df2) against each other and, wherever they match, paste the corresponding GT and AD values from df2 into the respective columns of df1. There are 100 samples (1:100), and all 100 samples have GT and AD values. Could you please help me fill in the table below? I would really appreciate it. Thank you.
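One way to line the two keys up is base R's match(), which returns, for each df1 key, the position of the first identical key in df2 (or NA when there is none). The sketch below retypes only the header rows from the example as plain character vectors; the names gene1, mut1, gene2 and mut2 are made up for illustration and are not part of the original data.

```r
# Header rows of df1 (one entry per column), retyped from the example above.
gene1 <- c("MAPK1", "MAPK1", "MAPK1", "MAPK2", "MAPK2", "MAPK2")
mut1  <- c("R23C", "R23C", "R23C", "T34Y", "T34Y", "T34Y")

# Header rows of df2.
gene2 <- c("MAPK1", "MAPK1", "MAPK2", "MAPK3", "MAPK4", "MAPK4")
mut2  <- c("R23C", "R33Y", "T34Y", "R45C", "T44S", "S33D")

# Gene:mutation keys, one per column.
key.df1 <- paste(gene1, mut1, sep = ":")
key.df2 <- paste(gene2, mut2, sep = ":")

# For every df1 column, the index of the matching df2 column (NA if absent).
idx <- match(key.df1, key.df2)
print(idx)
```

Because match() keeps only the first hit, this assumes each gene:mutation key occurs at most once in df2.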
Result:
Gene MAPK1 MAPK1 MAPK1 MAPK2 MAPK2 MAPK2
TYPE At GT AD At GT AD
Sample R23C R23C R23C T34Y T34Y T34Y
1 A 0/0 34,2 G 0/0 33,33
2 A 0/1 22,3 G 0/1 44,22
3 A 1/1 33,2 G 1/1 112,0
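A minimal sketch of how such a table could be filled, under some assumptions that may not hold for the real data: both tables are held as character matrices (as.matrix() could convert a data frame), the header rows are named "Gene", "Type", "Sample" in df1 and "Gene Names", "Protein Mutation" in df2, and the value rows of df2 are named sample.TYPE (for example "1.GT"). The toy matrices are retyped from the example; blank cells are stored as NA.

```r
# Toy copy of df1; GT and AD cells are still empty (NA).
df1 <- rbind(
  Gene   = c("MAPK1", "MAPK1", "MAPK1", "MAPK2", "MAPK2", "MAPK2"),
  Type   = c("At", "GT", "AD", "At", "GT", "AD"),
  Sample = c("R23C", "R23C", "R23C", "T34Y", "T34Y", "T34Y"),
  "1"    = c("A", NA, NA, "G", NA, NA),
  "2"    = c("A", NA, NA, "G", NA, NA),
  "3"    = c("A", NA, NA, "G", NA, NA)
)

# Toy copy of df2 with one GT row and one AD row per sample.
df2 <- rbind(
  "Gene Names"       = c("MAPK1", "MAPK1", "MAPK2", "MAPK3", "MAPK4", "MAPK4"),
  "Protein Mutation" = c("R23C", "R33Y", "T34Y", "R45C", "T44S", "S33D"),
  "1.GT" = c("0/0", "0/0", "0/0", "0/0", "0/0", "0/0"),
  "1.AD" = c("34,2", "23,4", "33,33", "33,2", "44,44", "34,0"),
  "2.GT" = c("0/1", "0/1", "0/1", "0/0", "0/1", "0/1"),
  "2.AD" = c("22,3", "33,2", "44,22", "34,22", "34,3", "91,91"),
  "3.GT" = c("1/1", "1/1", "1/1", "1/1", "1/1", "1/1"),
  "3.AD" = c("33,2", "3,2", "112,0", "22,3", "34,0", "33,2")
)

# Gene:mutation keys for the columns of each table.
key.df1 <- paste(df1["Gene", ], df1["Sample", ], sep = ":")
key.df2 <- paste(df2["Gene Names", ], df2["Protein Mutation", ], sep = ":")
idx <- match(key.df1, key.df2)  # df2 column for each df1 column, NA if none

# Sample rows are everything below the three header rows.
samples <- setdiff(rownames(df1), c("Gene", "Type", "Sample"))

result <- df1
# Fill only GT/AD columns whose key was found in df2.
for (j in which(df1["Type", ] %in% c("GT", "AD") & !is.na(idx))) {
  src.rows <- paste(samples, df1["Type", j], sep = ".")  # e.g. "1.GT", "2.GT"
  result[samples, j] <- df2[src.rows, idx[j]]
}
print(result, quote = FALSE)
```

The loop runs over columns rather than samples, so going from 3 to 100 samples only changes the length of samples. Columns of type At are left untouched, and a df1 key with no partner in df2 simply stays NA.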
I found the tables difficult to read. Do your tables follow this format?
df1:
df2:
And you want to merge the two tables by gene name and protein mutation?
Sorry, yes, you are right. Thank you for replying to my question.
Sorry, you are right. Thanks