Hi!
There are actually two similar situations behind the same question, so it might be better to explain the design of each first:
First, with a small cohort of 16 patient samples, we would like to compare the output of our standard variant calling pipeline (fully automated, by IonTorrent) with that of another pipeline (i.e. GATK Best Practices).
Second, in the other experiment, we have established cell lines from tissue samples of different patients. Additionally, we collected fresh frozen tissue and FFPE samples from those same patients. By sequencing a panel of genes, we would like to observe the similarities and differences between these sample types.
All samples have already been sequenced, aligned, and annotated (.vcf files). Thus, the question is:
What would be the best way to analyze/output such comparisons?
Maybe using a Venn diagram? Or is there a more robust, better-looking way of representing the data? Is there a package that provides a nice output?
Thanks!
Venn diagrams get clumsy with many groups; check out so-called UpSet plots.
Thanks, @ATpoint. I see there are even a handful of comparisons between these two plot types. I will definitely try to apply it to the analysis. I think it will be useful for showing how many elements (variants) are shared between the groups. I was also thinking about showing the elements that are unique to each group: if there are just a few dozen, a table will do, but I am worried about what to do if there are hundreds (or thousands). Any other ideas on how to show that? In any case, thanks a lot for the suggestion!
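
For reference, the counting behind an UpSet plot can be sketched in a few lines of plain Python. This is only a minimal sketch under stated assumptions: a variant is identified by (chrom, pos, ref, alt), multi-allelic records are split into one key per ALT allele, and the VCFs are assumed to have been normalized the same way beforehand. The function names and the toy records are hypothetical and stand in for the real files.

```python
from collections import Counter

def variant_keys(vcf_lines):
    """Collect (chrom, pos, ref, alt) keys from VCF text lines, one key per ALT allele."""
    keys = set()
    for line in vcf_lines:
        # Skip blank lines and all header lines (## meta lines and #CHROM).
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        for allele in alt.split(","):
            keys.add((chrom, int(pos), ref, allele))
    return keys

def exclusive_intersections(groups):
    """Count variants by the exact combination of groups that contain them."""
    membership = {}
    for name, keys in groups.items():
        for key in keys:
            membership.setdefault(key, set()).add(name)
    return Counter(frozenset(names) for names in membership.values())

# Toy records standing in for real VCF files (hypothetical data).
cell_line = ["#CHROM\tPOS\tID\tREF\tALT", "chr1\t100\t.\tA\tG", "chr2\t200\t.\tC\tT"]
frozen = ["chr1\t100\t.\tA\tG", "chr3\t300\t.\tG\tA,C"]
ffpe = ["chr1\t100\t.\tA\tG", "chr2\t200\t.\tC\tT", "chr4\t400\t.\tT\tC"]

groups = {
    "cell_line": variant_keys(cell_line),
    "fresh_frozen": variant_keys(frozen),
    "FFPE": variant_keys(ffpe),
}
counts = exclusive_intersections(groups)

# One line per combination of groups, largest first.
for combo, n in sorted(counts.items(), key=lambda kv: -kv[1]):
    print(" & ".join(sorted(combo)), n)
```

Each combination corresponds to one bar of an UpSet plot, and the single-group combinations are the variants unique to that group. When those run into the hundreds or thousands, one option is to report only the counts in the figure and write the full lists to a separate file per group.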