24 months ago
hazirliver
Hi!
I'm trying to convert .SRA files to .fastq with the fasterq-dump tool, but every file fails with the same error: fasterq-dump was killed (signal 9 SIGKILL)
For example, the processing logs for the files SRX2481673/SRR5164647.sra and SRX2481676/SRR5164650.sra are shown below. I get exactly the same error on the other files.
INFO - SRX2481673, SRX2481676 will be processed
INFO - Processing SRX2481673/SRR5164647.sra
INFO - cursor-cache : 1,073,741,824 bytes
INFO - buf-size : 1,073,741,824 bytes
INFO - mem-limit : 13,958,643,712 bytes
INFO - threads : 3
INFO - scratch-path : '/home/asokolov/osrp/tmp/tmp_files/fasterq.tmp.fasterq-dump-instance-1-1544c93d1a6045778cd95824bb59b763.74/'
INFO - total ram : 810,184,228,864 bytes
INFO - output-format: FASTQ split file
INFO - check-mode : on
INFO - output-file : '/home/asokolov/osrp/tmp/FASTQs/SRX2481673/SRR5164647.fastq'
INFO - output-dir : '/home/asokolov/osrp/tmp/FASTQs/SRX2481673/'
INFO - output : '/home/asokolov/osrp/tmp/FASTQs/SRX2481673/SRR5164647.fastq'
INFO - append-mode : 'NO'
INFO - stdout-mode : 'NO'
INFO - seq-defline : '@$ac.$si $sn length=$rl'
INFO - qual-defline : '+$ac.$si $sn length=$rl'
INFO - only-unaligned : 'NO'
INFO - only-aligned : 'NO'
INFO - accession : 'SRR5164647'
INFO - accession-path: '/home/asokolov/osrp/tmp/SRAs/SRX2481673/SRR5164647.sra'
INFO - est. output : 21,238,500,400 bytes
INFO - disk-limit-tmp input : 128,849,018,880 bytes
INFO - disk-limit (OS) : 1,462,599,024,640 bytes
INFO - disk-limit-tmp (OS) : 1,462,599,024,640 bytes
INFO - out/tmp on same fs : 'NO'
INFO -
INFO - SRR5164647 is local
INFO - ... has a size of 3,611,444,133 bytes
INFO - ... is cSRA with alignments
INFO - ... SEQ has NAME column = YES
INFO - ... SEQ has SPOT_GROUP column = YES
INFO - ... uses 'SEQUENCE' as sequence-table
INFO - SEQ.first_row = 1
INFO - SEQ.row_count = 39,822,941
INFO - SEQ.spot_count = 39,822,941
INFO - SEQ.total_base_count = 7,831,644,330
INFO - SEQ.bio_base_count = 7,831,644,330
INFO - SEQ.avg_name_len = 1
INFO - SEQ.avg_spot_group_len = 7
INFO - SEQ.avg_bio_reads_per_spot = 2
INFO - SEQ.avg_tech_reads_per_spot = 0
INFO - ALIGN.first_row = 1
INFO - ALIGN.row_count = 77,821,594
INFO - ALIGN.spot_count = 77,821,594
INFO - ALIGN.total_base_count = 7,651,115,820
INFO - ALIGN.bio_base_count = 7,651,115,820
INFO -
INFO - disk-limit(s) not exeeded!
INFO - fasterq-dump was killed (signal 9 SIGKILL)
INFO - Processing SRX2481676/SRR5164650.sra
INFO - cursor-cache : 1,073,741,824 bytes
INFO - buf-size : 1,073,741,824 bytes
INFO - mem-limit : 13,958,643,712 bytes
INFO - threads : 3
INFO - scratch-path : '/home/asokolov/osrp/tmp/tmp_files/fasterq.tmp.fasterq-dump-instance-1-1544c93d1a6045778cd95824bb59b763.130/'
INFO - total ram : 810,184,228,864 bytes
INFO - output-format: FASTQ split file
INFO - check-mode : on
INFO - output-file : '/home/asokolov/osrp/tmp/FASTQs/SRX2481676/SRR5164650.fastq'
INFO - output-dir : '/home/asokolov/osrp/tmp/FASTQs/SRX2481676/'
INFO - output : '/home/asokolov/osrp/tmp/FASTQs/SRX2481676/SRR5164650.fastq'
INFO - append-mode : 'NO'
INFO - stdout-mode : 'NO'
INFO - seq-defline : '@$ac.$si $sn length=$rl'
INFO - qual-defline : '+$ac.$si $sn length=$rl'
INFO - only-unaligned : 'NO'
INFO - only-aligned : 'NO'
INFO - accession : 'SRR5164650'
INFO - accession-path: '/home/asokolov/osrp/tmp/SRAs/SRX2481676/SRR5164650.sra'
INFO - est. output : 16,726,708,244 bytes
INFO - disk-limit-tmp input : 128,849,018,880 bytes
INFO - disk-limit (OS) : 1,462,598,959,104 bytes
INFO - disk-limit-tmp (OS) : 1,462,598,959,104 bytes
INFO - out/tmp on same fs : 'NO'
INFO -
INFO - SRR5164650 is local
INFO - ... has a size of 2,857,778,130 bytes
INFO - ... is cSRA with alignments
INFO - ... SEQ has NAME column = YES
INFO - ... SEQ has SPOT_GROUP column = YES
INFO - ... uses 'SEQUENCE' as sequence-table
INFO - SEQ.first_row = 1
INFO - SEQ.row_count = 31,376,520
INFO - SEQ.spot_count = 31,376,520
INFO - SEQ.total_base_count = 6,166,997,722
INFO - SEQ.bio_base_count = 6,166,997,722
INFO - SEQ.avg_name_len = 1
INFO - SEQ.avg_spot_group_len = 7
INFO - SEQ.avg_bio_reads_per_spot = 2
INFO - SEQ.avg_tech_reads_per_spot = 0
INFO - ALIGN.first_row = 1
INFO - ALIGN.row_count = 61,510,763
INFO - ALIGN.spot_count = 61,510,763
INFO - ALIGN.total_base_count = 6,043,540,736
INFO - ALIGN.bio_base_count = 6,043,540,736
INFO -
INFO - disk-limit(s) not exeeded!
INFO - fasterq-dump was killed (signal 9 SIGKILL)
I run fasterq-dump with the following arguments:
bufsize='1G',
curcache='1G',
mem='13G',
threads=3,
disk-limit-tmp = '120G'
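For readers who want to reproduce the call, the arguments above would map onto a fasterq-dump command line roughly as in the sketch below. This is a minimal sketch, not the poster's actual wrapper: the function name build_fasterq_dump_cmd and its parameter names are hypothetical, and the paths in the usage line are shortened placeholders. The long options are the ones fasterq-dump itself reports in the log (bufsize, curcache, mem, threads, disk-limit-tmp).

```python
from typing import List


def build_fasterq_dump_cmd(
    sra_path: str,
    outdir: str,
    *,
    bufsize: str = "1G",
    curcache: str = "1G",
    mem: str = "13G",
    threads: int = 3,
    disk_limit_tmp: str = "120G",
) -> List[str]:
    """Build the argv list for one fasterq-dump run (hypothetical helper)."""
    return [
        "fasterq-dump", sra_path,
        "--outdir", outdir,
        "--bufsize", bufsize,
        "--curcache", curcache,
        "--mem", mem,
        "--threads", str(threads),
        "--disk-limit-tmp", disk_limit_tmp,
    ]


# Placeholder paths; pass the list to subprocess.run() to actually execute it.
cmd = build_fasterq_dump_cmd("SRR5164647.sra", "FASTQs/SRX2481673")
print(" ".join(cmd))
```

Keeping the values in one place makes it easy to retry a failing accession with smaller mem or threads values and see whether the kill still happens.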
What could be the problem and how can I fix it?
Save yourself the trouble and download the fastq's using:
This script was generated by https://sra-explorer.info. Information on how to use: sra-explorer : find SRA and FastQ download URLs in a couple of clicks
I don't think this is a good solution, because I pass the list of SRXs that I need to process dynamically in the script.
You can use ffq - https://github.com/pachterlab/ffq - to dynamically obtain the download URLs from SRXs.
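A rough sketch of how that could look from inside a script: call the ffq command-line tool for one accession and pull the FASTQ URLs out of its JSON output. The assumptions here are that ffq is installed and on PATH, that `ffq --ftp <accession>` prints a JSON list of file records, and that each record carries "filetype" and "url" keys; check this against the ffq version you have. The helper names are hypothetical.

```python
import json
import subprocess
from typing import List


def urls_from_ffq_json(text: str) -> List[str]:
    """Extract FASTQ URLs from ffq's JSON output (assumed record layout)."""
    records = json.loads(text)
    return [
        rec["url"]
        for rec in records
        if rec.get("filetype") == "fastq" and "url" in rec
    ]


def fetch_fastq_urls(accession: str) -> List[str]:
    """Run `ffq --ftp <accession>` and return the FASTQ URLs it reports."""
    result = subprocess.run(
        ["ffq", "--ftp", accession],
        check=True,            # raise CalledProcessError if ffq fails
        capture_output=True,
        text=True,
    )
    return urls_from_ffq_json(result.stdout)
```

With this, a dynamic list of SRX accessions can be looped over and each returned URL handed to whatever downloader the pipeline already uses.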
Thank you! I think this is the best solution for my task
Wanted to mention it in case you were able to use it.
SIGKILL (signal 9) indicates immediate process termination. Are you exceeding any other resource allocations for your account (e.g. RAM)? The message above says that the disk limits were not exceeded.
I don't think the RAM limits were exceeded. This task runs on a separate machine with lots of RAM. The memory limit for fasterq-dump is set to 13 GB. There are no other tasks running on the machine, and there is a lot of free memory left.
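To confirm from the calling script that the process really died from a signal rather than exiting with an error code, the return code can be inspected: on POSIX systems Python's subprocess module reports a child killed by signal N as return code -N. The sketch below assumes the tool is launched through subprocess.run; the helper names are hypothetical. A SIGKILL that the user did not send often comes from the kernel's out-of-memory killer or from a container or scheduler memory limit, so the system log (for example dmesg) is worth checking alongside this.

```python
import signal
import subprocess
from typing import List


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a readable description (POSIX)."""
    if returncode < 0:
        # Negative codes mean the child was terminated by that signal.
        sig = signal.Signals(-returncode)
        return f"killed by signal {sig.value} ({sig.name})"
    return f"exited with status {returncode}"


def run_and_describe(cmd: List[str]) -> str:
    """Run a command and describe how it ended."""
    proc = subprocess.run(cmd)
    return describe_exit(proc.returncode)
```

Logging this string per accession makes it clear whether every file fails the same way or whether some runs exit normally with a non-zero status.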
The docs for fasterq-dump say it can use up to three times as much RAM as claimed.
In general, both fastq-dump and fasterq-dump are badly written programs that commonly cause all manner of weird errors and problems.
Hi, if I've understood the log correctly, the scratch space (tmp) is in your home dir. There is a parameter (I think -t) that can be used to explicitly specify a different location for the scratch space. Have you tried that already?
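Before pointing the scratch space somewhere else, it can help to check which candidate directory actually has room. The sketch below is only an illustration: pick_scratch_dir and the candidate paths are hypothetical, and the byte count is the "est. output" figure from the first log, used here as a rough lower bound rather than a documented requirement. The chosen directory would then be passed to fasterq-dump with its temp-directory option (-t / --temp).

```python
import shutil
from typing import Iterable, Optional


def free_bytes(path: str) -> int:
    """Free space, in bytes, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


def pick_scratch_dir(candidates: Iterable[str], needed_bytes: int) -> Optional[str]:
    """Return the first existing candidate with at least `needed_bytes` free."""
    for path in candidates:
        try:
            if free_bytes(path) >= needed_bytes:
                return path
        except OSError:
            # Missing or unreadable directory: try the next candidate.
            continue
    return None


# Hypothetical candidates; 21,238,500,400 is the est. output from the log.
scratch = pick_scratch_dir(["/scratch", "/tmp"], 21_238_500_400)
print(scratch)
```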