Hello,
I'm very new to R, and I have a large data set of enriched genes, stored as lists of gene names for 10 different cell lines. I am trying to display the overlap in significantly enriched genes across the different cell lines using the UpSetR package.
Currently my data looks like this, except that each list is between 200 and 900 genes long:
[Image: current data format]
What I want instead is a binary matrix that indicates the presence/absence of each unique gene from the overall set in each individual cell line list:
[Image: desired data format]
I have been able to compile a reference column of all the unique genes across the three lists. However, I am now stuck on how to use that reference list to convert the individual gene lists into a binary format. Could someone point me in the right direction?
Many thanks in advance for any help!
Hi there, have you found a solution for this task? I am in the same situation.
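One possible approach to the conversion described in the question, sketched in base R. It assumes the gene lists are held as a named list of character vectors, one per cell line; the gene names and the names `gene_lists`, `all_genes`, `binary_matrix`, and `binary_df` are made up for illustration. If your lists are instead columns of a data frame padded with NA or empty strings, you would need to turn them into a list and drop the padding first.

```r
# Hypothetical example data: one character vector of gene names per cell line.
gene_lists <- list(
  cell_line_A = c("TP53", "MYC", "EGFR"),
  cell_line_B = c("MYC", "KRAS"),
  cell_line_C = c("TP53", "KRAS", "BRCA1")
)

# Reference vector of every unique gene across all lists.
all_genes <- sort(unique(unlist(gene_lists)))

# For each cell line, test which reference genes appear in its list
# (TRUE/FALSE) and convert to 1/0. sapply() binds the results into a
# matrix with one row per gene and one column per cell line.
binary_matrix <- sapply(
  gene_lists,
  function(genes) as.integer(all_genes %in% genes)
)
rownames(binary_matrix) <- all_genes

# UpSetR works on data frames, so convert the matrix.
binary_df <- as.data.frame(binary_matrix)
print(binary_df)

# Plotting (requires the UpSetR package to be installed):
# library(UpSetR)
# upset(binary_df, nsets = ncol(binary_df))
```

UpSetR also ships a helper, `fromList()`, that takes a named list like `gene_lists` and returns a binary data frame directly, so `upset(fromList(gene_lists))` may be all you need for the plot. As far as I know it does not keep the gene names as row names, so the manual version above is the safer choice if you want to look up individual genes afterwards.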