Find the combinations of a particular element with its other column elements
5.4 years ago
xxxxxxxx ▴ 20

My file looks like this:

    Pcol    Mcol
    P1      M1,M2,M5,M6
    P2      M1,M2,M3,M5
    P3      M4,M5,M7,M6

I want to find all pairwise combinations of the Mcol elements, each listed alongside its Pcol value.

Expected output:

Pcol Mcol        
P1  M1,M2        
P2  M1,M2        
P1  M1,M5        
P2  M1,M5        
P1  M1,M6        
P1  M2,M5        
P2  M2,M5        
P1  M2,M6        
P1  M5,M6        
P3  M5,M6        
P2  M1,M3        
P2  M2,M3        
P3  M4,M5        
P3  M4,M7        
P3  M4,M6        
P3  M7,M6

I have tried this:

x <- read.csv("file.csv" ,header = TRUE, stringsAsFactors = FALSE)
xx <- do.call(rbind.data.frame, 
              lapply(x$Mcol, function(i){
                n <- sort(unlist(strsplit(i, ",")))
                t(combn(n, 2))
              }))

But it only outputs the combinations, not the Pcol elements.

R • 1.1k views

xxxxxxxx

It is bad etiquette to ask the same question multiple times.

5.4 years ago
zx8754 12k

Your code was very close; we just need to add Pcol to the output, see below:

# example data
x <- read.table(text = " Pcol       Mcol
    P1      M1,M2,M5,M6
    P2      M1,M2,M3,M5
    P3      M4,M5,M7,M6", header = TRUE, stringsAsFactors = FALSE)


res <- do.call(rbind,
        apply(x, 1, function(i){
          n <- sort(unlist(strsplit(i[ 2 ], ",")))
          mycomb <- t(combn(n, 2))
          data.frame(Pcol = unname(i[ 1 ]),
                     Mcol = paste(mycomb[, 1], mycomb[, 2], sep = ","))
          })
        )

#    Pcol  Mcol
# 1    P1 M1,M2
# 2    P1 M1,M5
# 3    P1 M1,M6
# 4    P1 M2,M5
# 5    P1 M2,M6
# 6    P1 M5,M6
# 7    P2 M1,M2
# 8    P2 M1,M3
# 9    P2 M1,M5
# 10   P2 M2,M3
# 11   P2 M2,M5
# 12   P2 M3,M5
# 13   P3 M4,M5
# 14   P3 M4,M6
# 15   P3 M4,M7
# 16   P3 M5,M6
# 17   P3 M5,M7
# 18   P3 M6,M7
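As a side note, the attempt in the question loses Pcol because lapply() only ever sees the Mcol strings. Below is a minimal alternative sketch that walks over both columns in parallel with Map() instead of apply(), which avoids coercing the data frame to a character matrix. The helper name pair_rows and the result name res2 are made up for illustration, and the guard for rows with fewer than two elements is an assumption about how such rows should be handled (here they are simply skipped).

```r
# Example data, same values as in the question
x <- data.frame(
  Pcol = c("P1", "P2", "P3"),
  Mcol = c("M1,M2,M5,M6", "M1,M2,M3,M5", "M4,M5,M7,M6"),
  stringsAsFactors = FALSE
)

# pair_rows() is a hypothetical helper: one Pcol value plus one
# comma-separated Mcol string in, one row per pair out
pair_rows <- function(p, m) {
  n <- sort(unlist(strsplit(m, ",", fixed = TRUE)))
  if (length(n) < 2) return(NULL)   # combn(n, 2) fails with fewer than 2 items
  pairs <- combn(n, 2)              # 2-row matrix, one column per pair
  data.frame(Pcol = p,
             Mcol = paste(pairs[1, ], pairs[2, ], sep = ","),
             stringsAsFactors = FALSE)
}

# Map() passes matching elements of Pcol and Mcol to pair_rows()
res2 <- do.call(rbind, Map(pair_rows, x$Pcol, x$Mcol))
rownames(res2) <- NULL
```

With the example data this should give the same 18 rows as above, since each row has four elements and therefore six pairs.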

Then, to address your other question about adding the frequency of each pair, try:

res$Freq <- table(res$Mcol)[ res$Mcol ]
#    Pcol  Mcol Freq
# 1    P1 M1,M2    2
# 2    P1 M1,M5    2
# 3    P1 M1,M6    1
# 4    P1 M2,M5    2
# 5    P1 M2,M6    1
# 6    P1 M5,M6    2
# 7    P2 M1,M2    2
# 8    P2 M1,M3    1
# 9    P2 M1,M5    2
# 10   P2 M2,M3    1
# 11   P2 M2,M5    2
# 12   P2 M3,M5    1
# 13   P3 M4,M5    1
# 14   P3 M4,M6    1
# 15   P3 M4,M7    1
# 16   P3 M5,M6    2
# 17   P3 M5,M7    1
# 18   P3 M6,M7    1
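
One thing to be aware of: indexing a table by name returns another table object, so Freq ends up as a table-class column rather than a plain vector. If that causes trouble downstream, a small variation is to wrap the lookup in as.integer(). The sketch below uses a five-row stand-in for res (a subset of the rows above, chosen so the counts match the full output) and assumes Mcol is a character column.

```r
# Five-row stand-in for the `res` data frame (subset of the rows above)
res <- data.frame(
  Pcol = c("P1", "P1", "P2", "P2", "P3"),
  Mcol = c("M1,M2", "M5,M6", "M1,M2", "M1,M3", "M5,M6"),
  stringsAsFactors = FALSE
)

# Count how often each pair occurs across all Pcol values
counts <- table(res$Mcol)

# Look up each row's count by pair name and drop the table class
res$Freq <- as.integer(counts[res$Mcol])
```

Here Freq is an ordinary integer vector, which tends to behave more predictably when sorting, filtering, or writing the result to a file.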