Let's say that you have whole-genome variant calls for a number of individuals from a population within a species.
What are the standard analyses / tools that you can always run out of the box once you have a multi-sample VCF file?
- SnpEff -> Effect prediction, Ts/Tv ratio, codon changes, amino acid changes, changes by chromosome
- SnpSift -> Filtering the SnpEff-annotated data
- VCFtools vcf-stats -> Number of private SNPs per individual
- SNPRelate -> PCA plot, phylogenetic tree
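To make two of the statistics above concrete, here is a minimal pure-Python sketch of what a Ts/Tv ratio and a private-SNP count amount to. It is not how SnpEff or vcf-stats compute them internally; the function name `summarize_vcf` is made up for illustration, and the sketch assumes an uncompressed, tab-separated VCF with biallelic SNPs and `GT` as the first FORMAT subfield.

```python
from collections import Counter

# Purine<->purine and pyrimidine<->pyrimidine substitutions are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}

def summarize_vcf(lines):
    """Return (ts, tv, private) for biallelic SNPs in an iterable of VCF lines.

    Hypothetical helper: `private` maps sample name -> number of SNPs whose
    alternate allele is carried by that sample only.
    """
    ts = tv = 0
    private = Counter()
    samples = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]          # sample names follow FORMAT
            continue
        ref, alt = fields[3].upper(), fields[4].upper()
        # Skip indels, multi-allelic sites, and missing/symbolic alleles.
        if len(ref) != 1 or len(alt) != 1 or ref not in "ACGT" or alt not in "ACGT":
            continue
        if (ref, alt) in TRANSITIONS:
            ts += 1
        else:
            tv += 1
        # Assumption: GT is the first colon-separated subfield of each sample.
        carriers = []
        for name, sample in zip(samples, fields[9:]):
            alleles = sample.split(":")[0].replace("|", "/").split("/")
            if "1" in alleles:
                carriers.append(name)
        if len(carriers) == 1:
            private[carriers[0]] += 1
    return ts, tv, private
```

Calling `summarize_vcf(open("calls.vcf"))` (the file name is a placeholder) gives the raw counts; the Ts/Tv ratio is then `ts / tv` when `tv` is non-zero.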
I guess a lot of people are nowadays sequencing multiple individuals from a population, and they all want to explore the variant calls and the relationships and contrasts between individuals. Of course experiments, questions and populations differ, but it is always good to avoid reinventing the wheel and to make the most of existing tools.
For example, things I am looking for are:
- plotting variant densities
- plotting LD plots
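As a rough sketch of the numbers behind those two plots (not a replacement for a dedicated tool): variant density is just a count of records per fixed-size window, and a simple LD measure for unphased data is the squared Pearson correlation of alternate-allele dosages (0/1/2) between two sites. The function names and the 100 kb default window are my own choices, and the input is assumed to be uncompressed, tab-separated VCF lines.

```python
from collections import Counter

def variant_density(lines, window=100_000):
    """Count variant records per (chromosome, window start) bin.

    Window starts are 0-based; VCF positions are 1-based, hence the `- 1`.
    """
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos = line.split("\t", 2)[:2]
        start = (int(pos) - 1) // window * window
        counts[(chrom, start)] += 1
    return counts

def genotype_r2(x, y):
    """Squared Pearson correlation between two lists of allele dosages.

    Returns None when either site is monomorphic in the sample, because
    the correlation is undefined there.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return None
    return cov * cov / (var_x * var_y)
```

The `Counter` from `variant_density` can be handed to any plotting library as a bar chart or heatmap per chromosome, and a matrix of pairwise `genotype_r2` values over neighbouring sites is what an LD heatmap displays.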
To be honest, I don't really trust anything written in Perl; things easily become a big, unreadable, unmaintainable mess. If there is a Java (snpEff) or C++ alternative, I'll go with that one.
So, you select your tools by the language they're written in? Frankly, this is the most ridiculous thing I've heard in a long time. When it comes to writing your own code, I fully understand that people have their preferences, but judging a tool by the language it's written in instead of by how well it works? Really? Especially if you say you're interested in tools that work out of the box, the language in which the tool was written should be of no concern to you. How good a tool is doesn't depend on the language it was written in, but on how good a job the people who wrote the code did. Maybe not every developer is capable of writing good Perl code, but our developers certainly are.
You know what they say about bad workmen...
Oh good grief, and I don't trust anything written in Whitespace; that becomes unreadable even quicker :P The web-based VEP is a great tool, I use it every day.