I have a spreadsheet with columns representing genes and rows being samples, populated with binary values to indicate the presence of a mutation in a gene for a specific sample. It looks like this.
I've been asked to visualise the intersections between genes (e.g. IGHV3-23 and TBL1XR1 are both mutated in 3 samples, and CCR6 shares a sample with both of these genes) as a proportionally scaled Venn diagram. Having taken the time to verify that this is not realistic for a publication figure, I'm now looking for other ways to visualise the data.
Heatmaps have been suggested, but given the number of samples and the sparsity of the matrix I'm not sure how well that approach will work. I've also seen people suggesting UpSet on similar forum posts, but as far as I can see UpSet is a tool as opposed to a method for generating static plots.
Any suggestions on ways to visually summarise this data would be appreciated.
For this, it sounds like you could build some sort of oncoPrint. Although they were originally developed to view mutation frequencies across cancer samples, oncoPrints are a nice way to visualise the presence or absence of any feature across any set of samples.
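As a rough sketch of the layout an oncoPrint relies on, the snippet below orders genes by mutation count and then sorts samples by their mutation pattern, so that co-mutated samples sit next to each other. The sample names, the toy values and the `oncoprint_order` helper are all made up for illustration, and this is only one common ordering, not the method of any particular oncoPrint package.

```python
# Toy data (hypothetical): each sample maps to 0/1 flags, one per gene.
genes = ["IGHV3-23", "TBL1XR1", "CCR6"]
samples = {
    "S1": [1, 1, 0],
    "S2": [1, 1, 0],
    "S3": [1, 1, 1],
    "S4": [0, 0, 1],
    "S5": [0, 0, 0],
}

def oncoprint_order(genes, samples):
    """Return (gene_order, sample_order) for an oncoPrint-style grid."""
    # Number of mutated samples per gene.
    counts = [sum(row[i] for row in samples.values()) for i in range(len(genes))]
    # Most frequently mutated genes first (ties keep their input order).
    gene_idx = sorted(range(len(genes)), key=lambda i: -counts[i])

    def pattern(name):
        # Mutation flags of one sample, read in the sorted gene order.
        return tuple(samples[name][i] for i in gene_idx)

    # Samples mutated in the top genes come first.
    sample_order = sorted(samples, key=pattern, reverse=True)
    return [genes[i] for i in gene_idx], sample_order

gene_order, sample_order = oncoprint_order(genes, samples)

# Plain-text preview of the grid: '#' = mutated, '.' = wild type.
for gene in gene_order:
    col = genes.index(gene)
    cells = " ".join("#" if samples[s][col] else "." for s in sample_order)
    print(gene.ljust(10), cells)
```

The two orderings can then be used to arrange the rows and columns of whatever plotting tool draws the final figure.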
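I guess a heatmap would indeed provide the easiest visualization: a gene-by-gene matrix rather than the raw sample-by-gene one. The colour of each row / column pair can be proportional to the number of samples in which both genes have the same status, i.e. both mutated or both wild type (wt).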
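A minimal sketch of the matrix behind such a heatmap, assuming the spreadsheet has been read into a list of 0/1 rows (one row per sample, one value per gene). It counts only the samples in which both genes are mutated, which is one reading of the suggestion above; the toy values and the `cooccurrence` helper are hypothetical.

```python
# Toy data (hypothetical): rows are samples, columns are genes, 1 = mutated.
genes = ["IGHV3-23", "TBL1XR1", "CCR6"]
rows = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
    [0, 0, 0],
]

def cooccurrence(rows, n_genes):
    """Count, for every gene pair, the samples in which both are mutated."""
    matrix = [[0] * n_genes for _ in range(n_genes)]
    for row in rows:
        mutated = [i for i, value in enumerate(row) if value]
        for i in mutated:
            for j in mutated:
                matrix[i][j] += 1
    return matrix

matrix = cooccurrence(rows, len(genes))

# The diagonal holds the number of mutated samples per gene.
for gene, counts in zip(genes, matrix):
    print(gene.ljust(10), counts)
```

The resulting gene-by-gene matrix is much smaller and denser than the original sample-by-gene table, and it can be passed to any heatmap function (for example matplotlib's `imshow`) to produce a static figure.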