I guess most people here are using open-source/command-line tools for NGS analysis. Nonetheless, I am wondering whether anyone has experience with any of the major commercial analysis suites, and what your opinions of them are.
Undoubtedly there is a loss of flexibility and transparency when we move to an analysis suite, but such suites have their obvious advantages as well.
I am not so much concerned with proprietary algorithms as with usability, stability, flexibility, and so on.
I have been very impressed by the CLC Genomics Workbench, and I have been recommending it to most life scientists regardless of their computational background.
It is particularly handy for studying small genomes (10-100 megabases) that can be processed easily on a more powerful desktop. For many tasks I find it an amazingly potent way to save time - ruling out certain hypotheses, exploring potential alternatives.
For example, right now I have a dataset of 9 million 70 bp reads from a bacterium with a 4-megabase genome. I can assemble that with three or four clicks, align it to one reference or another, or BLAST the largest contigs - each task takes just a few clicks and about five minutes of my time to set up, and then I can do something else while it runs. If something promising comes up, I can move on to a more thorough process. Most impressively, it does both the assembly and the mapping faster, and with fewer resources, than most alternatives.
To use it I don't need to remember how to run Velvet or some other assembler, how to index the genome for mapping, how to pick the right parameters, and so on. Compared to the overall cost of sequencing, or that of my time, a CLC license is very cheap.
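To make the comparison concrete, here is a rough sketch of the command-line steps those few clicks replace, using Velvet, BWA and BLAST+ as stand-ins (CLC runs its own algorithms internally) and hypothetical file names. Note that 9 million 70 bp reads is about 630 Mb of sequence, roughly 150x coverage of a 4 Mb genome:

    # De novo assembly with Velvet (k-mer size 31 is an arbitrary example)
    velveth asm_k31 31 -short -fastq reads.fastq
    velvetg asm_k31 -exp_cov auto

    # Map the reads to a chosen reference with BWA
    bwa index reference.fasta
    bwa mem reference.fasta reads.fastq > mapped.sam

    # BLAST the resulting contigs against NCBI nt to see what they hit
    blastn -query asm_k31/contigs.fa -db nt -remote -outfmt 6 -max_target_seqs 5 -out contigs_vs_nt.tsv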
If everyone could just use a tool like CLC and only ask about questions that it cannot solve, most people's research would be a lot farther ahead; it would let people spend more of their brain power on complex tasks rather than simple ones.
I am surprised you don't recommend they use Galaxy for some of these tasks.
I do recommend it - but Galaxy has a different use case: it is not realistic to expect that all scientists would be able to use the central Galaxy server for their compute-intensive tasks, and deploying their own instance would require system administration skills.
Two other options for Galaxy are using it on the cloud (http://galaxyproject.org/wiki/Admin/Cloud) and using one of the increasing number of Galaxy servers at other institutions (http://galaxyproject.org/wiki/PublicGalaxyServers). However, the cloud does require some system administration, and public servers may not have what you want (e.g., I don't know of any public Galaxy server with Velvet on it).
"pick the right parameters" - that is an important point. If you don't need to pick the right parameters, you end up running all the analyses with defaults and have no clue what your result actually is; you cannot do everything with default parameters. "CLC is very cheap" - I don't agree: >3k per single-user licence per year doesn't seem cheap to me. As for functionality, e.g. assembly: CLC builds contigs, but so far it is unable to scaffold them into supercontigs using PE information. For me, Velvet or SOAPdenovo perform much better, plus you have control over the parameters used.
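To illustrate the kind of control I mean (the file names and insert size below are just placeholders), a paired-end Velvet run lets you choose the k-mer size, declare the insert length and coverage cutoff explicitly, and scaffold contigs using the PE information:

    # velveth expects paired reads interleaved in one FASTQ file; k=31 is only an example
    velveth asm_k31 31 -shortPaired -fastq reads_interleaved.fastq
    # velvetg builds contigs and scaffolds them using the paired-end information
    velvetg asm_k31 -ins_length 300 -exp_cov auto -cov_cutoff auto -scaffolding yes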
What I really meant is that I don't need to remember a lot of boilerplate information - did I index in colorspace or not, should the index go first or second in the query, do I need to transform this or that for it to work, and so on. (Also note that CLC licensing is not yearly.) At the same time, I did make the point that it may well not be appropriate for many use cases - that's why we have so many tools.
It depends on what you want to do. If you are interested in a targeted sequencing suite, the best one is NextGENe from SoftGenetics. It uses a rigorous alignment algorithm, similar to BLAT, that picks up larger indels than any other commercial tool. It is extremely user-friendly, and it has powerful reporting tools with some flexibility if you can integrate them externally.