7.9 years ago
Matteo Schiavinato
Hi all,
It has been almost a month since I started five pindel2vcf runs to convert the output of Pindel, which itself took more than a month to finish. I am running it on whole-genome data, so the amount of data is considerable. However, I have not found any statement that the program is not meant for whole-genome analyses.
My command was:
time { pindel -f reference.fa --config-file filename.config --output-prefix whatever --chromosome ALL --number_of_threads 12 --max_range_index 4 --report_inversions --report_duplications --report_long_insertions --report_breakpoints --report_close_mapped_reads --min_inversion_size 50 &> STDERR/pindel.stderr; } &> TIME/pindel.time & disown
Apart from the 12 threads (couldn't help it), is there an option I am not aware of that would speed it up? Am I wrong to use it for whole-genome analyses?
My pindel2vcf command was:
time { pindel2vcf -p pindel_output_file -r reference.fa -R name_and_version -d date -v FINAL/deletions.vcf -mc 10 -he 0.2 -ho 0.8 --both_strands_supported --min_supporting_reads 4 --max_supporting_reads 50 &> STDERR/deletions.vcf.stderr; } &> TIME/deletions.vcf.time & disown
Can this also be improved?
How large is the pindel output file? Does your CPU support at least 12 parallel threads? Did you have free RAM at all times? If you had processed the chromosomes individually, you could have run pindel2vcf on the output files in parallel. I've never used pindel, so I can't really comment on your command-line arguments.
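The per-chromosome idea can be sketched roughly as follows. This is only a minimal sketch under assumptions: `per_chrom_pindel_cmds` is a hypothetical helper name, the FASTA index (`reference.fa.fai`) is assumed to exist already (for example from `samtools faidx`), and as far as I know `--chromosome` accepts a single sequence name in place of `ALL`. The function only prints one command per sequence; it does not run pindel, and the remaining options from the original command would still need to be appended.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one pindel command per sequence in a FASTA index.
# Column 1 of a .fai file holds the sequence name; the other columns are ignored.
# Assumes sequence names contain no spaces or shell metacharacters.
per_chrom_pindel_cmds() {
    local fai="$1" ref="$2" config="$3" prefix="$4"
    local name
    while IFS=$'\t' read -r name _; do
        # Skip empty lines, if any.
        [ -n "$name" ] || continue
        printf 'pindel -f %s --config-file %s --chromosome %s --output-prefix %s.%s\n' \
            "$ref" "$config" "$name" "$prefix" "$name"
    done < "$fai"
}
```

Something like `per_chrom_pindel_cmds reference.fa.fai reference.fa filename.config whatever > jobs.txt` would then give one line per sequence, which could be handed to the cluster scheduler as separate jobs. Each job writes its own output prefix, so the pindel2vcf conversions can start as soon as the corresponding pindel job finishes.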
Resources are not a problem: I'm working on a fairly large cluster with many cores and plenty of memory always available. I think the problem lies with Pindel itself.
I am having the same problem with Pindel. How did you sort it out?
I pre-selected the reads that could be contributing to a call. For example, I knew I was looking for a rearrangement on one scaffold, so I pre-selected the reads mapping to that scaffold, plus the pairs with one mate on it and the other mate on a different scaffold.
However, since they claim that you can do whole-genome rearrangement studies, what I did was only a workaround. You can't always know a priori what you're looking for and where.
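For anyone wanting to try the same pre-selection, here is a minimal sketch of one way to do it. It is not necessarily how it was done in the workaround above. `select_scaffold_pairs` is a hypothetical helper that filters SAM text on standard input, relying only on the SAM column layout: column 3 (RNAME) is the scaffold the read maps to and column 7 (RNEXT) is the scaffold of its mate, with `=` meaning the same scaffold as the read.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep SAM header lines plus every read that either maps
# to the given scaffold (column 3) or has its mate on that scaffold (column 7).
# A pair lying entirely on the scaffold has RNEXT "=", so it is caught by the
# column 3 test.
select_scaffold_pairs() {
    local scaffold="$1"
    awk -F'\t' -v s="$scaffold" '
        /^@/ { print; next }            # header lines pass through unchanged
        $3 == s || $7 == s { print }    # read or its mate is on the scaffold
    '
}
```

A possible invocation, with `input.bam`, `subset.bam` and `scaffold_12` as placeholder names, would be `samtools view -h input.bam | select_scaffold_pairs scaffold_12 | samtools view -b -o subset.bam -`. The header is passed through as-is, so the subset still lists every scaffold of the reference.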