Hi All,
Has anyone here used Trowel (http://sourceforge.net/projects/trowel-ec/)? If so, what were your results and experiences?
Thanks a lot!
I'm the author of Trowel. I will briefly describe its strengths.
The results
Accuracy
The accuracy of sequencing error correction varies with read coverage, genome size, quality values, read length, and so on. In our evaluation on different datasets (see the supplementary data of the original paper), Trowel shows relatively high accuracy on the standard metrics (specificity, sensitivity, precision, and, derived from these, F-score and gain). Papers on sequencing error correction normally do not evaluate accuracy with paired-end alignments, which reveal mis-corrections in a longer sequence context, but we have done so. In addition, other tools have not reported how alignment states change before and after correction. For example, a tool's mis-corrections can leave reads that were previously mappable without any alignment. Trowel preserved the alignment state for almost all reads.
You should keep an eye on the sensitivity of error correction, which indicates how many errors the tool corrected and how many it missed. Sensitivity depends strongly on the coverage depth of the dataset and on the k-mer length. Trowel relies only on the quality values of reads, so it can still correct sequencing errors in low-coverage datasets. However, because the true sequences are observed fewer times, sensitivity is lower for low-coverage datasets. For those datasets, an alignment-based method (overlap-consensus method) is the better option, or you could reduce the k-mer size.
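For readers who want to compute the metrics named above on their own data, here is a minimal sketch. It assumes per-base confusion counts and the definitions commonly used in error-correction benchmarks (in particular gain = (TP - FP) / (TP + FN)); the exact definitions in the Trowel paper may differ, and the function name and example counts are hypothetical.

```python
def correction_metrics(tp, fp, tn, fn):
    """Compute common error-correction metrics from per-base counts.

    Assumed meaning of the counts (not taken from the Trowel paper):
      tp: erroneous bases that were corrected properly
      fp: correct bases that were wrongly changed (mis-corrections)
      tn: correct bases that were left untouched
      fn: erroneous bases that were left uncorrected
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f_score = 2 * precision * sensitivity / denom if denom else 0.0
    # Gain penalizes mis-corrections: it drops below zero when a tool
    # introduces more errors than it removes.
    gain = (tp - fp) / (tp + fn) if (tp + fn) else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_score": f_score,
        "gain": gain,
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in correction_metrics(tp=80, fp=5, tn=9900, fn=20).items():
        print(f"{name}: {value:.4f}")
```

Sensitivity and gain are the two numbers to watch when comparing runs with different k or coverage, since a tool can reach high precision simply by correcting very little.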
Runtime & memory
Trowel is highly parallelized but supports only the shared-memory model. It is therefore best run on a single high-performance machine with a large amount of memory installed.
A future version of Trowel should reduce memory consumption (I am currently working on this).
In our experience, Trowel works well for genome sizes up to 500 Mb. We are still working on support for human-sized genomes.
If you have more questions, you can contact me by email: euncheon.lim at tue.mpg.de.
I hope this answer is useful to you.
Euncheon
Hello Euncheon
Trowel is pretty useful, and I see that you have made some excellent updates. You said "The sensitivity is highly dependent on the coverage-depth of a dataset and k-mer length." What coverage do you consider high or low, and what genome size do you consider large, for Trowel? It would be nice to have a range (coverage and genome size) to make it easier to choose an appropriate k between 11 and 15.
Thank you
By low coverage I originally meant datasets with an average coverage of less than 10.
In fact, the original Trowel can work with a coverage of 1 as long as there are high-quality template sequences. This distinguishes it from conventional methods.
Large genomes are currently not supported because the algorithm does not yet include parallel I/O or a memory-efficient index. I have not tested Trowel 1 with genomes larger than 500 Mb.
For the k-mer size, I recommend odd values of k in the range 15-31; even values should be avoided in order to deal with palindromic k-mers.
If you are planning to evaluate Trowel, you should use Trowel 1. Trowel 2 contains many experimental algorithms, and I can confirm that it is much less accurate than Trowel 1. I removed some rigorous algorithms from Trowel 2, which led to very bad sensitivity. I have no time to improve Trowel 1 or 2 right now because of my study plan. When everything has settled, I will come back to improve Trowel.
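To check which side of that threshold a dataset falls on, a rough estimate is usually enough. The sketch below assumes the usual approximation (number of reads times read length, divided by genome size); the function names and the example numbers are hypothetical, and only the cutoff of 10 comes from this thread.

```python
LOW_COVERAGE_CUTOFF = 10  # "average coverage less than 10", as stated in this thread


def average_coverage(num_reads, read_length, genome_size):
    """Approximate average coverage: total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return num_reads * read_length / genome_size


def is_low_coverage(num_reads, read_length, genome_size):
    """True if the dataset falls below the cutoff mentioned in the thread."""
    return average_coverage(num_reads, read_length, genome_size) < LOW_COVERAGE_CUTOFF


if __name__ == "__main__":
    # Hypothetical dataset: 2 million 100 bp reads on a 100 Mb genome.
    cov = average_coverage(2_000_000, 100, 100_000_000)
    print(f"coverage ~ {cov:.1f}x, low coverage: {cov < LOW_COVERAGE_CUTOFF}")
```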
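The reason for avoiding even k can be shown with a short sketch. A k-mer is palindromic when it equals its own reverse complement, which is only possible for even lengths: with an odd length, the middle base would have to be its own complement. The helper names below are hypothetical and are not part of Trowel.

```python
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Return the reverse complement of a DNA k-mer (A, C, G, T only)."""
    return kmer.translate(_COMPLEMENT)[::-1]


def is_palindromic(kmer):
    """A k-mer is palindromic if it equals its own reverse complement."""
    return kmer == reverse_complement(kmer)


def count_palindromic(k):
    """Brute-force count of palindromic k-mers (only feasible for small k)."""
    return sum(is_palindromic("".join(p)) for p in product("ACGT", repeat=k))


def candidate_k_values(lo=15, hi=31):
    """Odd k values in the range recommended in this thread."""
    return [k for k in range(lo, hi + 1) if k % 2 == 1]


if __name__ == "__main__":
    print("ACGT palindromic:", is_palindromic("ACGT"))
    for k in (3, 4, 5, 6):
        print(f"k={k}: {count_palindromic(k)} palindromic k-mers")
    print("candidate k:", candidate_k_values())
```

For even k, exactly 4^(k/2) k-mers are palindromic, because the first half determines the second; for odd k the count is always zero.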
Sorry for my late reply. Your answer was useful to me. Thank you very much!
Szandra
You're welcome!
I've just finished implementing a new version of trowel. The feature highlights are as follows:
The source code is not publicly available because it contains unpublished ideas. We provide only a binary for 64-bit Linux.