What is currently the best visual and interactive genotype matrix exploration tool for large genotype matrices, say the 1000 Genomes Project VCF?
That means 100M+ variants, 1000+ samples, and a raw uncompressed VCF file size of 1 TB+.
One requirement is that it should support all the kinds of filtering that bcftools view does:
http://www.htslib.org/doc/bcftools.html
But bcftools does not meet the interactive and visual requirements. It is only interactive for small VCF files, or when you use the tabix index to look up a small region.
Another requirement is that the filtering is visual and interactive, as it would be with a small genotype matrix in Excel, for example. (I know, bad idea, but at least Excel is interactive, visual and biologist-friendly.)
By interactive I mean that a filter criterion can be adjusted and you get your updated genotype matrix back in real time. The tool should stay interactive even for complex queries where all 100M+ variants have to be scanned for all 1000+ samples.
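To make the kind of query concrete, here is a minimal sketch of one adjustable filter on a toy genotype matrix. Everything in it is an assumption for illustration: the variant positions, the genotype values, and the names `GENOTYPES`, `alt_allele_frequency` and `filter_matrix` are made up, and an allele-frequency threshold is just one example of a filter. It scans every variant on each change of the threshold, which is exactly the step that stops being instant at 100M+ variants by 1000+ samples.

```python
# Hypothetical toy data: rows are variants, columns are samples.
# Genotypes are alternate-allele counts (0, 1, 2); None marks a missing call.
GENOTYPES = {
    "chr17:32232": [0, 1, 2, 0],
    "chr17:32500": [0, 0, 0, 1],
    "chr17:32932": [2, 2, None, 1],
}


def alt_allele_frequency(calls):
    """Alternate-allele frequency over the non-missing diploid calls."""
    observed = [g for g in calls if g is not None]
    if not observed:
        return 0.0
    return sum(observed) / (2 * len(observed))


def filter_matrix(genotypes, min_af):
    """Return the sub-matrix of variants with alt-allele frequency >= min_af."""
    return {
        variant: calls
        for variant, calls in genotypes.items()
        if alt_allele_frequency(calls) >= min_af
    }


if __name__ == "__main__":
    # Adjusting the threshold and re-running the full scan is the
    # "interactive" step; here it is simulated with three settings.
    for threshold in (0.1, 0.3, 0.8):
        kept = filter_matrix(GENOTYPES, threshold)
        print(threshold, sorted(kept))
```

On three variants this is instant; the question is which tool keeps that feel when the same full scan runs over the whole 1000 Genomes matrix.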
Does something like this already exist? If so, which tools? If not, why not?
I tried installing on Ubuntu 16.04 and get:
home/wouter/bin/jvarkit/src/main/java/com/github/lindenb/jvarkit/tools/vcfviewgui/VcfStage.java:74: error: package javafx.beans.property does not exist
, followed by many more similar errors, each pointing to a different package that doesn't exist. Before that error the following was printed:
I'm not at all familiar with Java, so if you could point me in the right direction, that would be great :p
@WouterDeCoster don't use OpenJDK (it is incomplete: the javafx packages are missing); use the official Oracle Java instead: http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
Right, that did the trick ;-)
I'm going to play a bit with it, for sure looks great. Thanks!
Thanks, I'm still working on it. I'll be happy to get any feedback :-)
I was a bit confused about what the 'main screen' option "set location of all frames to" would do, having gone through the manual too quickly. In hindsight it's clear, but I wouldn't have wondered about it if the text had been something like "change genomic location of all frames to", maybe with an example such as "chr17:32232-32932" already entered, or a gene name. It's not immediately obvious which input is expected.