I am working on Hi-C data analysis using the 4DN pipeline. One of its steps marks PCR duplicates. When I run that step it works for a while and then gets terminated at the end with this message:
/home/ubuntu/Pipeline/docker-4dn-hic/scripts/run-pairsam-markasdup.sh: line 35: 16998 Killed pairtools dedup --mark-dups --nproc-in ${inThreads} --nproc-out ${outThreads} --output-dups - --output-unmapped - --output ${MARKED_PAIRSAM} ${PAIRSAM}
I don't know what the issue is. I read somewhere that the kernel may be killing the script and that I should check dmesg, but there is no information about the process there, so I don't know what is happening. I also read that the input file might be the issue, but the input file was generated without any error.
I am running this on AWS:
- Ubuntu
- 256 GB RAM
- 64 threads
Size of input file: 35 GB
The problem is that this script runs for around 3 hours before the kernel kills it. The first time I ran this command I checked dmesg as suggested and it reported Out Of Memory, so I switched to an instance with more memory. Since then I have run the script 7-8 times and hit the same error each time, but with no information in dmesg.
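For reference, the dmesg check mentioned above can be sketched roughly as follows. The function name `find_oom_events` is hypothetical, and the search patterns are an assumption about how OOM-killer messages are usually worded, since the exact text differs between kernel versions.

```shell
# Hypothetical helper: filter kernel log text (read from stdin) for lines
# that typically accompany an OOM kill. The patterns are assumptions;
# kernel message wording differs between versions.
find_oom_events() {
    grep -i -E 'out of memory|oom-kill|killed process'
}

# Typical use (reading the kernel log may need sudo on some systems):
#   dmesg -T | find_oom_events
```

The kernel log is a ring buffer, so on a long-running instance older messages can be overwritten. An empty result does not by itself rule out an OOM kill.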
Why do you think the kernel is killing the job? What is the exact error message? As I said above, try reducing the number of threads. Memory usage should go down with fewer threads.
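A minimal sketch of what reducing the threads could look like, reusing the flags from the error message above. The wrapper name `run_dedup`, the `DRY_RUN` switch, the file names, and the thread count of 2 are all assumptions for illustration. In the actual pipeline the counts come from the `inThreads` and `outThreads` variables in run-pairsam-markasdup.sh.

```shell
# Hypothetical wrapper around the pairtools dedup call from the error message.
# Arguments: <in_threads> <out_threads> <input_pairsam> <marked_pairsam>
# With DRY_RUN=1 the command is only printed, not executed.
run_dedup() {
    # Rebuild the positional parameters as the full command line.
    set -- pairtools dedup --mark-dups \
        --nproc-in "$1" --nproc-out "$2" \
        --output-dups - --output-unmapped - \
        --output "$4" "$3"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Example with fewer threads (file names are placeholders):
#   run_dedup 2 2 input.pairsam.gz marked.pairsam.gz
```

Running it once with `DRY_RUN=1` shows the exact command before committing to a multi-hour run.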
Yes, thank you, it worked.