Question

A function for having a multiway intersection

7.1 years ago

jivarajivaraj ▴ 50

Hi,

I have time points, and I have differentially expressed genes for each possible combination of these time points too. So that etc. How can I write code to extract the common genes from each possible pairwise combination of these lists?

R Venn diagram intersection • 2.3k views

getting intersect between two lists of genes in R
venn diagram for gene lists using R

ADD REPLY • link 7.1 years ago by GenoMax 152k

Hello jivarajivaraj!

It appears that your post has been cross-posted to another site: https://bioinformatics.stackexchange.com/questions/4533

This is typically not recommended as it runs the risk of annoying people in both communities.

ADD REPLY • link 7.1 years ago by Pierre Lindenbaum 166k

Ram · Answer 1 · 2018-06-20

7.1 years ago

Jean-Karim Heriche 27k

In R, just use the intersect() function as in

intersect(2h, c(4h, 6h, ...))

where xh is replaced by the corresponding gene set.

Edit: I realized that you may actually mean 2h vs 4h then 2h vs 6h and so on, in which case, just iterate intersect() over all relevant combinations.
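One way to iterate over all pairs is sketched below. It assumes the gene sets are character vectors stored in a named list; the name `gene_sets` and the toy gene IDs are made up for illustration and would be replaced by the real lists.

```r
# Toy gene sets; list name, time-point names and gene IDs are hypothetical
gene_sets <- list(
  h2 = c("geneA", "geneB", "geneC", "geneD"),
  h4 = c("geneB", "geneC", "geneE"),
  h6 = c("geneC", "geneD", "geneE", "geneF")
)

# Label for every unordered pair of time points, e.g. "h2_vs_h4"
pair_names <- combn(names(gene_sets), 2, FUN = paste, collapse = "_vs_")

# Intersect each pair; simplify = FALSE keeps one character vector per pair
common <- combn(
  names(gene_sets), 2,
  FUN = function(p) intersect(gene_sets[[p[1]]], gene_sets[[p[2]]]),
  simplify = FALSE
)
names(common) <- pair_names

# Number of shared genes per pair
lengths(common)
```

`common[["h2_vs_h4"]]` then holds the genes shared by the two lists, and `lengths(common)` gives the overlap size for every pair.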

Thank you. The numbers of genes in these lists are not equal. For example, there are 630 common genes between h2 and h4, but I have 1143 genes in the h2 list and 768 genes in the h4 list, so h2 vs h4 would be 630/1143 and h4 vs h2 would be 630/768. It would be great to have a matrix or plot of all possible pairwise combinations of my 8 time points.

intersect(2h, c(4h, 6h, ...))

list()

ADD REPLY • link updated 7.1 years ago by Ram 45k • written 7.1 years ago by jivarajivaraj ▴ 50

What you're looking for is essentially a matrix of similarity between sets. A typical measure is the Jaccard index which can be easily computed like this:

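jaccard_index <- function(x, y) {
    # size of the intersection (named n_common to avoid shadowing intersect())
    n_common <- length(intersect(x, y))
    # intersection over union; assumes no duplicated genes within x or y
    similarity <- n_common / (length(x) + length(y) - n_common)
    return(similarity)
}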
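As a quick check of the formula, here is a small worked example. The function is repeated so the snippet runs on its own, and the toy vectors `x` and `y` are made up for illustration.

```r
# Jaccard index: size of the intersection divided by size of the union
jaccard_index <- function(x, y) {
    n_common <- length(intersect(x, y))
    n_common / (length(x) + length(y) - n_common)
}

# Hypothetical toy gene vectors
x <- c("geneA", "geneB", "geneC", "geneD")
y <- c("geneB", "geneC", "geneE")

# 2 shared genes out of 5 distinct genes overall
j <- jaccard_index(x, y)

# With the counts quoted above (630 shared, 1143 and 768 genes per list):
j_h2_h4 <- 630 / (1143 + 768 - 630)  # about 0.49
```

Unlike the two directional fractions 630/1143 and 630/768, this gives a single value per pair.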

You can easily compute the similarity matrix with for loops:

# h2 ... h16 stand for the character vectors of genes at each time point
sets <- list(h2, h4, h6, h8, h10, h12, h14, h16)
S <- matrix(NA, nrow = length(sets), ncol = length(sets))
for (i in seq_along(sets)) {
    for (j in i:length(sets)) { # matrix is symmetric so only compute the upper triangular part
        S[i, j] <- jaccard_index(sets[[i]], sets[[j]]) # [[ ]] extracts the vector; [ ] would return a list
        S[j, i] <- S[i, j]
    }
}
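The Jaccard index is symmetric, whereas the fractions asked about (630/1143 for h2 vs h4, 630/768 for h4 vs h2) depend on direction. A minimal sketch of such a directional matrix follows, assuming the gene sets are character vectors in a named list; `gene_sets`, `overlap_fraction`, `frac` and the toy gene IDs are hypothetical names used only for illustration.

```r
# Toy gene sets; names and gene IDs are hypothetical
gene_sets <- list(
  h2 = c("geneA", "geneB", "geneC", "geneD"),
  h4 = c("geneB", "geneC", "geneE"),
  h6 = c("geneC", "geneD", "geneE", "geneF")
)

# Fraction of the genes in x that are also found in y
overlap_fraction <- function(x, y) {
  length(intersect(x, y)) / length(unique(x))
}

# frac[i, j] = fraction of set i shared with set j (rows: x, columns: y)
frac <- sapply(gene_sets, function(y) {
  sapply(gene_sets, function(x) overlap_fraction(x, y))
})

# Rows and columns are labelled by time point; the matrix is not symmetric
print(round(frac, 2))
```

Here `frac["h2", "h4"]` and `frac["h4", "h2"]` differ because the two lists have different sizes. The matrix could then be passed to a plotting function such as `heatmap()` if a picture is preferred.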

ADD REPLY • link 7.1 years ago by Jean-Karim Heriche 27k

Sorry, it says:

Error: object 'i' not found

I have a character vector for each time point; I put them in a list and ran your function.

> str(sets)
List of 9

>