Is Pindel slow for everyone, or should I review my command?
7.9 years ago

Hi all,

It has been almost a month since I started five pindel2vcf runs to convert the output of Pindel, which itself took more than a month to finish. I am running it on whole-genome data, so the amount of data is considerable. However, I have not found it written anywhere that the program is not meant for whole-genome analyses.

My command was:

time { pindel -f reference.fa --config-file filename.config --output-prefix whatever --chromosome ALL --number_of_threads 12 --max_range_index 4 --report_inversions --report_duplications --report_long_insertions --report_breakpoints --report_close_mapped_reads --min_inversion_size 50 &> STDERR/pindel.stderr; } &> TIME/pindel.time & disown

Apart from the 12 threads (couldn't help it), can the speed be improved by adding some option that I am not aware of? Am I wrong to use it for whole-genome analyses?

My pindel2vcf command was:

time { pindel2vcf -p pindel_output_file -r reference.fa -R name_and_version -d date -v FINAL/deletions.vcf -mc 10 -he 0.2 -ho 0.8 --both_strands_supported --min_supporting_reads 4 --max_supporting_reads 50 &> STDERR/deletions.vcf.stderr; } &> TIME/deletions.vcf.time & disown

Can this be improved as well?

Pindel Variant Structural Program Detection • 2.5k views

How large is the Pindel output file? Does your CPU support at least 12 parallel threads? Did you have free RAM at all times? If you had processed the chromosomes individually, you could have run pindel2vcf on the output files in parallel. I've never used Pindel, so I can't really comment on your command-line arguments.
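The per-chromosome idea above could be sketched roughly as follows. This is only a dry run that prints one pindel command per chromosome name; it reuses the `-f`, `--config-file`, `--chromosome` and `--output-prefix` options from the question, while the `emit_pindel_jobs` helper, the `REF`/`CONFIG`/`OUT_PREFIX` variables and the `<prefix>_<chromosome>` naming are assumptions made for illustration.

```shell
#!/bin/sh
# Dry-run sketch: print one pindel command per chromosome name read from stdin.
# REF, CONFIG and OUT_PREFIX are placeholders (assumptions); adjust to your data.
REF=${REF:-reference.fa}
CONFIG=${CONFIG:-filename.config}
OUT_PREFIX=${OUT_PREFIX:-whatever}

emit_pindel_jobs() {
    while IFS= read -r chrom; do
        # Skip blank lines in the input list
        [ -n "$chrom" ] || continue
        # One command per chromosome, each with its own output prefix
        printf 'pindel -f %s --config-file %s --chromosome %s --output-prefix %s_%s\n' \
            "$REF" "$CONFIG" "$chrom" "$OUT_PREFIX" "$chrom"
    done
}

# If a faidx index sits next to the reference, take the names from its first column.
if [ -r "$REF.fai" ]; then
    cut -f1 "$REF.fai" | emit_pindel_jobs
fi
```

The printed lines can then be submitted as separate cluster jobs, and each job's output converted with its own pindel2vcf run, so the conversions proceed in parallel rather than as one long run.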


Resources are not a problem: I'm working on a fairly large cluster with many cores and plenty of memory always available. I think the problem is more related to Pindel itself.


I am having the same problem with Pindel. How did you sort it out?


I pre-selected the reads that could plausibly support a rearrangement. For example, I knew I was looking for a rearrangement on one scaffold, so I kept the reads mapping to that scaffold, plus the pairs with one mate on it and the other mate on a different scaffold.

However, since the authors claim that Pindel can be used for whole-genome rearrangement studies, what I did was only a workaround. You can't always know a priori what you're looking for and where.
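A minimal sketch of that kind of pre-selection, assuming the alignments are available as SAM text: keep the header, every read aligned to the target scaffold (column 3, RNAME), and every read whose mate is aligned there (column 7, RNEXT). The function name `select_scaffold_reads` and the scaffold name in the usage note are hypothetical, and this is not necessarily how the filtering was actually done in this thread.

```shell
#!/bin/sh
# Sketch: filter SAM text on stdin, keeping
#   - header lines (start with '@'),
#   - reads aligned to the target scaffold (column 3, RNAME),
#   - reads whose mate is aligned to the target scaffold (column 7, RNEXT).
# Pairs lying entirely on the target are kept via RNAME (their RNEXT is '=').
select_scaffold_reads() {
    awk -v target="$1" 'BEGIN { FS = OFS = "\t" }
        /^@/         { print; next }
        $3 == target { print; next }
        $7 == target { print }'
}
```

With samtools installed, one possible use is `samtools view -h input.bam | select_scaffold_reads scaffold_7 | samtools view -b -o subset.bam -`, where `input.bam`, `scaffold_7` and `subset.bam` are placeholder names; the reduced BAM would then be listed in the Pindel config file in place of the full one.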
