2.5 years ago
oğuzhan
I am trying to align two paired-read files, each 20 GB in fastq.gz format, with 2 CPUs and 16 GB RAM. It has taken 16 hours so far and I think it will not finish today. Do you have any suggestions to speed up the bwa mem alignment?
Are they 20 GB each in zipped size? If so, those are gigantic files, and yes, that will take some time indeed, especially given the resources you assigned to the process.
The author of BWA-MEM recently published chromap [REF] which is much faster than BWA-MEM with similar accuracy. You may want to consider trying it depending on your data type.
Context (quotes copy/pasted): Chromap is "an ultrafast method for aligning and preprocessing high throughput chromatin profiles" and "is over 10 times faster than traditional workflows on bulk ChIP-seq/Hi-C profiles and than 10x Genomics’ CellRanger v2.0.0 pipeline on single-cell ATAC-seq profiles." As I understand from the manuscript, Chromap is good for chromatin studies; it is not a general-purpose faster aligner to replace BWA-MEM or Bowtie2.
Since the output BAM file keeps growing (it is now more than 200 GB), the process is probably still running and has not crashed. Thanks for the chromap suggestion, but I gathered from the chromap manual that it is not suitable for WGS (my sequences are WGS-derived). So it seems the only option for me is to wait. I hope it will not take the whole week.
If the BAM file is growing, that's a sure sign your process is humming along and simply taking time. Next time, if you're curious, you might try timing the alignment on the first million lines of your fastq files to get an estimate on how long it might take to complete given your hardware etc.
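The timing suggestion above could be sketched roughly as follows in Python. This is a minimal illustration, not a tested benchmark recipe: the helper names `head_fastq` and `estimate_total_hours` are made up for this example, and the estimate assumes run time scales linearly with the number of reads.

```python
import gzip
import itertools


def head_fastq(src, dst, n_lines=1_000_000):
    """Copy the first n_lines lines of a gzipped FASTQ into a new gzipped file."""
    if n_lines % 4:
        # A FASTQ record is 4 lines; cutting mid-record would corrupt the last read.
        raise ValueError("n_lines must be a multiple of 4")
    with gzip.open(src, "rt") as fin, gzip.open(dst, "wt") as fout:
        fout.writelines(itertools.islice(fin, n_lines))


def estimate_total_hours(sample_seconds, sample_reads, total_reads):
    """Linearly extrapolate the run time of a small sample to the full data set."""
    return sample_seconds * (total_reads / sample_reads) / 3600
```

To use it, subset both mates with the same `n_lines` so the pairs stay in sync, time something like `bwa mem -t 2 ref.fa sub_R1.fastq.gz sub_R2.fastq.gz` on the subsets (the file names here are placeholders), and pass the elapsed seconds to `estimate_total_hours`. The total read count has to come from elsewhere, for example the sequencing report. Because loading the index is a fixed cost, a small sample will tend to overestimate the full run somewhat.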