HaplotypeCaller run time?
3.5 years ago
quentin54520 ▴ 120

Hello all,

I'm sorry if my question is a bit naive, but I am trying to run HaplotypeCaller on 30X human WGS data.

I use GATK 4.2.0.0, and I would like to get an idea of the "normal" run time for such data.

I run HaplotypeCaller in GVCF mode, scattered by interval (see the command below).

For the intervals, I took the "good" sequence from the GATK bundle and split it into 50 sub-intervals with the tool gatk SplitIntervals (default parameters). I then run HaplotypeCaller in parallel on the 50 sub-intervals. The problem is that some intervals finish in less than 2 hours, while others take 12 hours. How could I improve the run time? For precision: each sub-interval gets 1 CPU and 5 GB of memory.

Thanks in advance :-)

The command used:

    for i in {0000..0049}
    do
        srun --ntasks=1 gatk --java-options "-Xmx${SLURM_MEM_PER_CPU}M" HaplotypeCaller \
            -R ${REF_Genome} \
            -L ${Interval_DIR}/${i}.scattered.interval_list \
            -I ${BAM_INPUT_DIR}/${BAM_INPUT} \
            -O ${GVCF_OUTPUT_DIR}/${GVCF_OUTPUT}.${i} \
            -G StandardAnnotation -G AS_StandardAnnotation -G StandardHCAnnotation \
            -GQB 10 -GQB 20 -GQB 30 -GQB 40 -GQB 50 -GQB 60 -GQB 70 -GQB 80 -GQB 90 \
            -ERC GVCF \
            --pcr-indel-model NONE \
            --tmp-dir ${TMP_DIR} &
    done

    wait
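
To put the 2-hour versus 12-hour difference in context, one quick check is whether the 50 sub-intervals cover a comparable number of bases. The following is only a minimal sketch: it assumes the scattered files are Picard-style interval_list files (header lines starting with `@`, then tab-separated contig, start, end with 1-based inclusive coordinates), and the helper names `interval_bases` and `report_interval_sizes` are hypothetical, not part of GATK.

```shell
#!/usr/bin/env bash

# Sum the bases covered by one interval_list file.
# Assumption: header lines start with "@"; data lines are tab-separated
# contig, start, end, ... with 1-based inclusive coordinates.
interval_bases() {
    awk -F'\t' '
        !/^@/ && NF >= 3 { total += $3 - $2 + 1 }
        END { printf "%.0f\n", total }
    ' "$1"
}

# Print "<bases><TAB><file>" for every file given, largest first.
report_interval_sizes() {
    local f
    for f in "$@"; do
        printf '%s\t%s\n' "$(interval_bases "$f")" "$f"
    done | sort -k1,1nr
}

# Example call (paths taken from the command above):
# report_interval_sizes "${Interval_DIR}"/*.scattered.interval_list
```

If the reported sizes are close to each other, the run-time difference would more likely come from the content of the regions than from their length.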
