I have multiple dataframes, each representing a different experimental condition.
What I want is a list of:
a) the names of the genes that differ between the dataframes, and b) the names of the genes that are common to the dataframes.
I know I can do this using the approach described here "http://www.cookbook-r.com/Manipulating_data/Comparing_data_frames/", but I am already using the VennDiagram package in R to illustrate how many genes are
a) common to and b) differ between the dataframes.
It occurred to me that the VennDiagram package probably has to calculate this in order to draw the Venn diagram.
Does anybody know whether it does, and whether the result can be extracted?
Thanks
There are a ton of ways to do this in base R, but the dplyr syntax for this is a bit simpler. You can use dplyr::anti_join to return the rows in one data.frame that don't have matches in a column of another data.frame. If you want the rows in a data.frame that do have matches in a column of another data.frame, you can use dplyr::semi_join.
@rpolicastro. Thanks for the reply. That's exactly what I ended up doing, and it worked perfectly. However, unless I am missing something, anti_join only allows me to compare two dataframes. When I started analysing the full dataset I ended up comparing 4 dataframes to each other, which I didn't anticipate; it involved quite a bit of code and is messy.
Hence the question regarding the VennDiagram package, as it must be performing these types of comparisons in order to draw the Venn diagram.
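For what it's worth, here is a minimal sketch of how the four-way comparison could be done without chaining pairwise joins. It assumes each dataframe has a column called `gene` (a hypothetical name, as are `cond_a` to `cond_d` and the example gene lists) and uses only base R set operations. The last few lines are optional and only run if VennDiagram is installed: the package provides `get.venn.partitions()`, which takes the same named list you pass to `venn.diagram()` and returns the elements in each region of the diagram.

```r
# Hypothetical example data: one data frame per condition, each with a `gene` column.
cond_a <- data.frame(gene = c("TP53", "BRCA1", "MYC", "EGFR"))
cond_b <- data.frame(gene = c("TP53", "MYC", "KRAS"))
cond_c <- data.frame(gene = c("TP53", "MYC", "EGFR", "PTEN"))
cond_d <- data.frame(gene = c("TP53", "MYC", "BRCA1", "KRAS"))

# Named list of gene vectors (the same kind of input venn.diagram() takes).
gene_sets <- lapply(
  list(A = cond_a, B = cond_b, C = cond_c, D = cond_d),
  function(df) unique(as.character(df$gene))
)

# Genes present in every condition.
common_genes <- Reduce(intersect, gene_sets)

# Genes present in at least one condition but not in all of them.
all_genes <- Reduce(union, gene_sets)
differing_genes <- setdiff(all_genes, common_genes)

# Genes found in exactly one condition, reported per condition.
unique_genes <- lapply(names(gene_sets), function(nm) {
  others <- unlist(gene_sets[names(gene_sets) != nm], use.names = FALSE)
  setdiff(gene_sets[[nm]], others)
})
names(unique_genes) <- names(gene_sets)

# Optional: ask VennDiagram for the contents of every region of the diagram.
# Skipped silently if the package is not installed.
if (requireNamespace("VennDiagram", quietly = TRUE)) {
  partitions <- VennDiagram::get.venn.partitions(gene_sets)
  print(partitions)
}
```

`Reduce()` just applies `intersect()` or `union()` pairwise along the list, so the same code works for any number of dataframes, which avoids writing one join per pair.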
You can also check this very nice solution implemented in R:
https://github.com/hms-dbmi/UpSetR
which enables complex comparisons and nice visual representations.
This is a visualization tool that comes after OP understands the basics of their own dataset. As such, I don't think this qualifies as an answer.