I have partitioned the gene complement of 21 strains of a particular species of Strep into core, dispensable, and unique sets, but I'm at a loss as to how best to represent these data. I originally thought of a Venn diagram illustrating the total size of each partition, but wasn't satisfied that it portrayed my data accurately. I did some searching and found a more appropriate representation (Fig. A, B and C), something the authors called a "flowerplot" 1 (a new one on me). I've been trying to re-create this sort of visualization manually using matplotlib in Python 3.3, as there doesn't seem to be any package that produces comparable output more automatically, but I haven't had much success.
What I've tried:
- Adding Ellipse patches to Cartesian axes. Not satisfactory because the Ellipse patch's `xy` argument centers the ellipse at (x, y), where I'd need some way to rotate the ellipse about the origin to achieve the desired effect.
- Adding Ellipse patches to polar axes. This was a complete mess; I can get one good ellipse but can't place any others reliably (most likely due to my lack of understanding of using polar coordinates!).
Additionally, using Ellipse patches might not end up being the best move, since I'll need to annotate each ellipse with, at the very least, strain ID and count information.
Is anyone familiar with a way to either effectively visualize these data, or perhaps duplicate the plots I linked?
1 Sugawara et al. (2013). Comparative genomics of the core and accessory genomes of 48 Sinorhizobium strains comprising five genospecies. Genome Biology 14:R17. doi:10.1186/gb-2013-14-2-r17 or http://genomebiology.com/2013/14/2/R17.
The flowerplots you link to would be better represented as a simple table with a "sum" or "total" column for the genus/species. Then you could sort them in a meaningful order.
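If you still want the flower layout, the rotation problem from the question has a fairly direct workaround on Cartesian axes: matplotlib's `Ellipse` accepts an `angle` argument (in degrees), so each petal can be centered half a petal length out from the origin and rotated to point outward. Below is a minimal sketch along those lines, not a reproduction of the paper's figures. `petal_layout` and `draw_flowerplot` are hypothetical helper names, and the sizes, colors, and label offsets are arbitrary choices you would need to tune.

```python
import math


def petal_layout(n_petals, petal_length=1.0):
    """Return (center_x, center_y, angle_deg) for each petal.

    Each petal is an ellipse whose long axis points away from the origin,
    so its center sits half a petal length out along its own direction.
    """
    layout = []
    for i in range(n_petals):
        theta = 2 * math.pi * i / n_petals
        cx = 0.5 * petal_length * math.cos(theta)
        cy = 0.5 * petal_length * math.sin(theta)
        layout.append((cx, cy, math.degrees(theta)))
    return layout


def draw_flowerplot(unique_counts, core_count, petal_length=1.0,
                    petal_width=0.25, core_radius=0.3):
    """Draw one petal per strain plus a central disc for the core genome.

    unique_counts maps strain ID -> number of strain-specific genes.
    """
    # Imported here so petal_layout() can be used without matplotlib.
    import matplotlib.pyplot as plt
    from matplotlib.patches import Circle, Ellipse

    fig, ax = plt.subplots(figsize=(6, 6))
    strains = list(unique_counts)
    layout = petal_layout(len(strains), petal_length)

    for strain, (cx, cy, angle) in zip(strains, layout):
        # The ellipse is rotated about its own center, which already lies
        # on the petal's direction, so one tip ends up at the origin.
        ax.add_patch(Ellipse((cx, cy), petal_length, petal_width, angle=angle,
                             facecolor="tab:green", edgecolor="black",
                             alpha=0.5))
        ux, uy = math.cos(math.radians(angle)), math.sin(math.radians(angle))
        # Unique-gene count inside the petal, strain ID just beyond the tip.
        ax.text(0.7 * petal_length * ux, 0.7 * petal_length * uy,
                str(unique_counts[strain]), ha="center", va="center",
                fontsize=8)
        ax.text(1.15 * petal_length * ux, 1.15 * petal_length * uy,
                strain, ha="center", va="center", fontsize=8)

    # Central disc drawn on top of the petals to hold the core-genome count.
    ax.add_patch(Circle((0, 0), core_radius, facecolor="white",
                        edgecolor="black", zorder=3))
    ax.text(0, 0, "Core\n{}".format(core_count), ha="center", va="center",
            zorder=4)

    lim = 1.4 * petal_length
    ax.set_xlim(-lim, lim)
    ax.set_ylim(-lim, lim)
    ax.set_aspect("equal")
    ax.axis("off")
    return fig, ax
```

You would call it with your own counts, for example `fig, ax = draw_flowerplot({"strain_A": 12, "strain_B": 30}, core_count=1500)` followed by `fig.savefig("flowerplot.png")`; the strain names and numbers there are placeholders, not real data. With 21 strains the petals will overlap near the center, which is why the core disc is drawn last and on top.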