Hi,
I have time points, and I have differentially expressed genes for each possible combination of these time points too. How can I write code to extract the common genes from each possible pairwise combination of these lists?
In R, just use the intersect() function, as in
intersect(h2, c(h4, h6, ...))
where each hx is replaced by the corresponding gene set (R object names cannot start with a digit, so the 2h list is written h2 here).
Edit: I realized that you may actually mean 2h vs 4h, then 2h vs 6h, and so on, in which case just iterate intersect() over all relevant combinations.
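Iterating over all pairs can be done with combn(). The following is a minimal sketch, assuming the gene lists are character vectors stored in a named list; the list name `sets` and the gene IDs are made up for illustration.

```r
# Toy gene lists (hypothetical IDs); replace with the real DE gene vectors.
sets <- list(
  h2 = c("geneA", "geneB", "geneC", "geneD"),
  h4 = c("geneB", "geneC", "geneE"),
  h6 = c("geneC", "geneD", "geneE", "geneF")
)

# All unordered pairs of time point names, returned as a list of length-2 vectors.
pairs <- combn(names(sets), 2, simplify = FALSE)

# Intersect the two gene lists of each pair.
common <- lapply(pairs, function(p) intersect(sets[[p[1]]], sets[[p[2]]]))

# Label each result with the pair it came from, e.g. "h2_vs_h4".
names(common) <- sapply(pairs, paste, collapse = "_vs_")
```

With this layout, `common[["h2_vs_h4"]]` holds the genes shared by the two lists and `lengths(common)` gives the number of common genes per pair.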
Thank you. The numbers of genes in these lists are not equal. For example, there are 630 common genes between h2 and h4, but I have 1143 genes in the h2 list and 768 genes in the h4 list, so h2 vs h4 would be 630/1143 and h4 vs h2 would be 630/768. It would be great to have a matrix or plot of all possible pairwise combinations of my 8 time points.
What you're looking for is essentially a matrix of similarity between sets. A typical measure is the Jaccard index, which can be easily computed like this:
jaccard_index <- function(x, y) {
  # number of genes shared by both sets
  n_common <- length(intersect(x, y))
  # size of the intersection divided by size of the union
  similarity <- n_common / (length(x) + length(y) - n_common)
  return(similarity)
}
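As a quick check of the formula, here is a small sketch that reproduces the counts mentioned earlier in the thread (1143 genes, 768 genes, 630 in common). The gene IDs are synthetic and chosen only so that the set sizes match; the function is repeated so the snippet runs on its own.

```r
jaccard_index <- function(x, y) {
  n_common <- length(intersect(x, y))
  n_common / (length(x) + length(y) - n_common)
}

# Synthetic IDs, chosen only to reproduce the counts from the thread.
x <- paste0("g", 1:1143)     # 1143 genes
y <- paste0("g", 514:1281)   # 768 genes, of which g514..g1143 (630) are also in x

length(intersect(x, y))      # 630
jaccard_index(x, y)          # 630 / (1143 + 768 - 630) = 630 / 1281, about 0.49
```

Note that the Jaccard index gives a single value per pair, so it is the same for h2 vs h4 and h4 vs h2.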
You can easily compute the similarity matrix with for loops:
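# R object names cannot start with a digit, so the 2h list is called h2, and so on
sets <- list(h2, h4, h6, h8, h10, h12, h14, h16)
S <- matrix(NA, nrow = length(sets), ncol = length(sets))
for (i in seq_along(sets)) {
  for (j in i:length(sets)) { # matrix is symmetric so only compute the upper triangular part
    # use [[ ]] to get the gene vector itself rather than a one-element list
    S[i, j] <- jaccard_index(sets[[i]], sets[[j]])
    S[j, i] <- S[i, j]
  }
}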
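The Jaccard matrix above is symmetric, whereas the fractions asked about earlier (630/1143 for h2 vs h4, 630/768 for h4 vs h2) depend on direction. One possible way to get those is sketched below, assuming a named list of character vectors; the gene IDs and the names `sets` and `frac` are made up for illustration, and heatmap() is just one plotting option among several.

```r
# Toy gene lists (hypothetical IDs); replace with the real DE gene vectors.
sets <- list(
  h2 = c("geneA", "geneB", "geneC", "geneD"),
  h4 = c("geneB", "geneC", "geneE"),
  h6 = c("geneC", "geneD", "geneE", "geneF")
)

# frac[i, j] = (number of genes common to set i and set j) / (size of set i),
# i.e. the share of the row set that is also found in the column set.
frac <- sapply(sets, function(col_set) {
  sapply(sets, function(row_set) {
    length(intersect(row_set, col_set)) / length(row_set)
  })
})

# Plot the matrix without clustering or rescaling.
heatmap(frac, Rowv = NA, Colv = NA, scale = "none")
```

Here `frac["h2", "h4"]` is the h2 vs h4 fraction and `frac["h4", "h2"]` is the reverse; the diagonal is always 1.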
Hello jivarajivaraj!
It appears that your post has been cross-posted to another site: https://bioinformatics.stackexchange.com/questions/4533
This is typically not recommended as it runs the risk of annoying people in both communities.