Hi All, I intersected two data sets and I have a tabulated file with two columns like:
AA BB
BB CC
CC DD
BB AA
EE FF
FF GG
GG HH
II JJ
JJ II
II KK
... ...
and I would like to convert these "one-to-one interactions" into clusters, considering that AA interacts with BB, BB with CC and CC with DD (so AA, BB, CC and DD form a cluster). Similarly, EE, FF, GG and HH form another cluster, but none of these elements interacts with elements of the first cluster, and so on.
I would like to obtain something like
The data represents a graph in edge list format, i.e. each line is an edge of the graph specifying the two nodes that are connected. What you call clusters seems to be the connected components of this graph. So read the data into a graph structure then extract the connected components, e.g. in R with the igraph package, something like this (untested):
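library(igraph)
edge.list <- as.matrix(read.table("edge_list.txt",...)) # read the file as appropriate, turn data into a two-column matrix for use by igraph
G <- graph_from_edgelist(edge.list, directed = FALSE) # build an undirected graph from the edge list
clusters <- components(G) # clusters$membership holds the component id of each node, clusters$no the number of components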
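If you would rather not depend on igraph, the same connected-components idea can be sketched with a small union-find (disjoint-set) structure. The version below is only an illustration in plain Python, not the method used above: the function name connected_components is made up for this example, and the edges are copied by hand from the sample lines in the question rather than read from the real file.

```python
from collections import defaultdict


def connected_components(edges):
    """Group nodes into clusters (connected components) with union-find.

    `edges` is an iterable of (node_a, node_b) pairs, one per line of the
    edge list. Returns a sorted list of sorted clusters.
    """
    parent = {}

    def find(node):
        # Register unseen nodes as their own root.
        parent.setdefault(node, node)
        # Walk up to the root, shortening the path along the way.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Merge the two clusters; duplicate or reversed edges are no-ops.
            parent[root_b] = root_a

    clusters = defaultdict(set)
    for node in parent:
        clusters[find(node)].add(node)
    return sorted(sorted(members) for members in clusters.values())


# Sample edges copied from the question (assumption: whitespace-separated pairs).
edges = [
    ("AA", "BB"), ("BB", "CC"), ("CC", "DD"), ("BB", "AA"),
    ("EE", "FF"), ("FF", "GG"), ("GG", "HH"),
    ("II", "JJ"), ("JJ", "II"), ("II", "KK"),
]

for cluster in connected_components(edges):
    print(" ".join(cluster))
```

For a real file, the pairs could be built with something like tuple(line.split()) for each non-empty line. On the sample lines this should print three clusters, one per line.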
Question: if somewhere in the file you have BB GG, do you want to get a single cluster (AA, BB, CC, DD, EE, FF, GG and HH)?
I don't understand this line
Why is this tagged as a software error question?