Hi guys, I have recently been dealing with a batch of silkworm (Bombyx mori) RNA-seq data. An error arose that I cannot debug. Below is my workflow.
1. The genome sequence of the silkworm (SilkDB 3.0) is about 468.3 Mb across 28 chromosomes.
2. The Linux server I am using has 288 cores and 1 TB of memory.
3. No problem arose when I created the index files with hisat2-build.
4. An error always occurs during hisat2 alignment. The following is an example command. The memory usage (%MEM) continued to increase after the job was submitted.
hisat2 -t -p 30 --dta -x /home/RNAseq_2/source/silkworm/index/silkworm_tran -1 /data/storage04/RNAseq_2/silkworm/majorbio/data4antivirus/cleandata/306D3D1a_R1-clean.fastq.gz -2 /data/storage04/RNAseq_2/silkworm/majorbio/data4antivirus/cleandata/306D3D1a_R2-clean.fastq.gz -S /data/storage04/RNAseq_2/silkworm/majorbio/data4antivirus/alignedFromHisat2Results/306D3D1a.sam
5. The final SAM file is expected to be about 22 GB, but %MEM is already at 55 when the SAM file has only reached 7.8 GB.
6. I also tried running 5 similar jobs with 8 cores/job, which resulted in the following error message:
(ERR): hisat2-align died with signal 9 (KILL)
I have googled a lot without any progress. Could you please help me figure out the issue and speed up the job?
Thanks in advance,
The node should be more than capable of handling this task based on the specs. Did you use a scheduler such as SLURM? If so, please post the header lines of the submission script. You probably did not allocate enough memory, and the scheduler may have killed the job.
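For reference, if SLURM were in use, the relevant header lines would look something like this (the values here are only illustrative):
#!/bin/bash
#SBATCH --job-name=hisat2_align    # illustrative job name
#SBATCH --cpus-per-task=30         # should match the -p value passed to hisat2
#SBATCH --mem=64G                  # requesting too little here gets the job killed by the scheduler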
Thanks for the quick reply. No scheduler is installed on the server, so I submit jobs with nohup. Below is an example:
nohup bash ${id}_hisat2.sh > ${outDir}/shell/logerr/${id}_hisat2.nohup-logerr 2>&1 &
Please run it on a single file with a plain bash command, outside of that script, without nohup, without sending it to the background, and without redirecting any streams. This will show you where to start debugging. You can also run it on just a subset of the file for testing purposes.
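For example, HISAT2's -u option stops after the first N reads/pairs, so a quick subset test could look like this (a sketch reusing the paths from your command; the output name is a placeholder):
hisat2 -t -p 8 --dta -u 100000 -x /home/RNAseq_2/source/silkworm/index/silkworm_tran -1 /data/storage04/RNAseq_2/silkworm/majorbio/data4antivirus/cleandata/306D3D1a_R1-clean.fastq.gz -2 /data/storage04/RNAseq_2/silkworm/majorbio/data4antivirus/cleandata/306D3D1a_R2-clean.fastq.gz -S test_subset.sam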
I re-ran it on a single file with a plain bash command, without nohup, on my laptop (8 cores, 16 GB memory), and got the same error. Then I checked the original FASTQ files. The adapter section of the fastp QC report shows a big difference between samples that succeed (~4 GB) and samples that fail (~6 GB) at the alignment step.
INFO of a successful sample:
Adapter or bad ligation of read1: "The input has little adapter percentage (~0.247438%), probably it's trimmed before."
Sequence                  Occurrences
A                         3051
G                         2336
T                         4015
other adapter sequences   206857
Adapter or bad ligation of read2: "The input has little adapter percentage (~0.246621%), probably it's trimmed before."
Sequence                  Occurrences
A                         3086
G                         2277
T                         4063
other adapter sequences   206838
INFO of a failed sample:
Adapter or bad ligation of read1
Sequence                  Occurrences
(no entries shown)
Adapter or bad ligation of read2
Sequence                  Occurrences
(no entries shown)
What I can tell is that these two samples came from different batches, one of which had been adapter-trimmed before I received it. But I still do not know how to debug this. I would appreciate any advice.
Did you verify that the index you created is good? Is this HISAT2 install otherwise known to work well? You have more than adequate hardware capacity for this to work (assuming nothing else is consuming that capacity while these jobs run).
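One quick way to sanity-check the index is hisat2-inspect, which reads it back and prints a summary; a sketch using the index basename from your command (you should see your 28 chromosomes and roughly 468 Mb in total):
hisat2-inspect -s /home/RNAseq_2/source/silkworm/index/silkworm_tran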
Yes, I have successfully run three samples from the same batch. PS: they have similar file sizes and were pre-processed by fastp.
Do you get anything else printed after the "(ERR): hisat2-align died with signal 9 (KILL)" line?
Signal 9 (SIGKILL) means the process was killed from outside, often by the kernel's out-of-memory killer when the machine runs out of RAM. If other samples have worked well with HISAT2 on this machine, then I would suggest you investigate whether the FASTQ files for this particular sample are corrupt. It may be best to re-process the originals and see if you have better luck with freshly made files. I hope you are trimming the paired-end files together.
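A sketch of some quick checks on the failed pair (the cleaned filenames come from your command; the "original" and re-trimmed names are placeholders):
# a truncated or corrupted transfer usually fails the gzip integrity test
gzip -t 306D3D1a_R1-clean.fastq.gz
gzip -t 306D3D1a_R2-clean.fastq.gz
# mates must stay in sync: both files should report the same line count (FASTQ = 4 lines per read)
zcat 306D3D1a_R1-clean.fastq.gz | wc -l
zcat 306D3D1a_R2-clean.fastq.gz | wc -l
# if re-processing, trim both mates together, e.g. with fastp
fastp -i 306D3D1a_R1.fastq.gz -I 306D3D1a_R2.fastq.gz -o 306D3D1a_R1.trim.fastq.gz -O 306D3D1a_R2.trim.fastq.gz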
Good advice. I am checking the original FASTQ files now. Maybe the remote transfer from my laptop in the USA to the Linux server in China corrupted them.
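If the transfer is the culprit, comparing checksums on both ends should confirm it (a minimal sketch, assuming md5sum is available on both machines):
# on the laptop (the source), record checksums
md5sum 306D3D1a_R1-clean.fastq.gz 306D3D1a_R2-clean.fastq.gz > fastq.md5
# on the server (the destination), after copying fastq.md5 across, verify
md5sum -c fastq.md5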