Merge two data frames with different lengths
9.7 years ago
yasjas ▴ 70

Hello everyone,

I have two data frames of different lengths, and one contains values that the other lacks. I would like to merge them and get NA wherever a value is missing.

Does anyone know how to do that?

head(healthy_cell)
  rep_names    TE_classes
1      HAL1          LINE
2      HAL1          LINE
3      HAL1          LINE
4      L1M4          LINE
5  (CACCC)n Simple_repeat
6      L1M4          LINE

x <- healthy_cell$rep_names
healthy_count <- tapply(x, x, length)
df1 <- data.frame(healthy_count)

          healthy_count
(A)n                 92
(AATAG)n              1
(AATTG)n              1
(AGGGGG)n             5
(AGGTG)n              5
(AGTAG)n              1

# same for the second file (cancer)

y <- cancer_cell$rep_names
cancer_count <- tapply(y, y, length)
df2 <- data.frame(cancer_count)

         cancer_count
(A)n               89
(AAATG)n            1
(AATAG)n            1
(ACCG)n             1
(ACTG)n             2
(AGCTG)n            2

When I tried to merge the two data frames, I got:

merge(df1,df2,by=intersect(names(df1),names(df2)),all.df1=by,all.df2=by,all=T)
healthy_count cancer_count
1            92           89
2             1           89
3             1           89
4             5           89
5             5           89
6             1           89

However, I would like to keep all the rep_names in the result as well:

(AAATG)n, (ACTG)n, (AGCTG)n, ....
R • 35k views

Thanks, everyone, for all the ideas. It worked!

9.7 years ago

What about

merge(df1, df2, by = "row.names", all = TRUE)

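That matches the rows of the two data frames on their row names and adds a new column, Row.names, to the result holding those names. With all = TRUE, names present in only one data frame are kept, and the missing count is filled with NA.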
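As a minimal sketch of this suggestion, the example below rebuilds two small count tables from a few of the rows shown in the question (the subset is chosen only for illustration) and merges them on their row names. Renaming the key column to rep_names is an optional extra step, not part of the original answer.

```r
# Toy counts taken from a few rows of the tables in the question
df1 <- data.frame(healthy_count = c(92, 1, 5),
                  row.names = c("(A)n", "(AATAG)n", "(AGGTG)n"))
df2 <- data.frame(cancer_count = c(89, 1, 1, 2),
                  row.names = c("(A)n", "(AAATG)n", "(AATAG)n", "(ACTG)n"))

# by = "row.names" matches rows on their row names;
# all = TRUE keeps names that occur in only one table and fills the gap with NA
merged <- merge(df1, df2, by = "row.names", all = TRUE)

# merge() stores the key in a column called "Row.names"; rename it (optional)
names(merged)[names(merged) == "Row.names"] <- "rep_names"

print(merged)
```

The result has one row per repeat name found in either table, so names such as (AAATG)n and (ACTG)n appear with NA in healthy_count.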

9.7 years ago

Have you tried full_join() from dplyr?
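A rough sketch of how that could look, assuming the dplyr package is installed. dplyr joins work on columns and ignore row names, so the row names are first copied into a column; the column name rep_names and the helper names_to_column are illustrative choices, and the toy counts are a subset of the tables in the question.

```r
library(dplyr)

# Toy counts taken from a few rows of the tables in the question
df1 <- data.frame(healthy_count = c(92, 1, 5),
                  row.names = c("(A)n", "(AATAG)n", "(AGGTG)n"))
df2 <- data.frame(cancer_count = c(89, 1, 1, 2),
                  row.names = c("(A)n", "(AAATG)n", "(AATAG)n", "(ACTG)n"))

# Hypothetical helper: copy the row names into a regular column
names_to_column <- function(d) {
  data.frame(rep_names = rownames(d), d, row.names = NULL)
}

# full_join() keeps every rep_names value from both tables;
# counts missing from one side become NA
joined <- full_join(names_to_column(df1), names_to_column(df2),
                    by = "rep_names")

print(joined)
```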

9.7 years ago
Jimbou ▴ 960

Use cbind to add the row names as an "id" column, then merge on that column.

df1 <- cbind(id = rownames(df1), df1)
df2 <- cbind(id = rownames(df2), df2)
merge(df1, df2, by = "id", all = TRUE)
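A variation on the same idea, offered only as a sketch: if the counts are built with table() and as.data.frame() instead of tapply(), the repeat names are an ordinary column from the start and no row-name handling is needed. The healthy_cell rows are the six shown by head() in the question; the cancer_cell rows are made up for illustration, and count_reps is a hypothetical helper.

```r
# Six rows from head(healthy_cell) in the question
healthy_cell <- data.frame(
  rep_names  = c("HAL1", "HAL1", "HAL1", "L1M4", "(CACCC)n", "L1M4"),
  TE_classes = c("LINE", "LINE", "LINE", "LINE", "Simple_repeat", "LINE")
)
# Made-up rows standing in for the cancer file (not from the question)
cancer_cell <- data.frame(rep_names = c("HAL1", "L1M4", "L1M4", "(A)n"))

# Hypothetical helper: count each repeat name and return a two-column data frame
count_reps <- function(x, count_name) {
  as.data.frame(table(rep_names = x),
                responseName = count_name,
                stringsAsFactors = FALSE)
}

df1 <- count_reps(healthy_cell$rep_names, "healthy_count")
df2 <- count_reps(cancer_cell$rep_names, "cancer_count")

# Outer merge on the shared name column; unmatched names get NA
counts <- merge(df1, df2, by = "rep_names", all = TRUE)
print(counts)
```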