Hello all:),
I am trying to get started with ChIP-Seq as described in the Biostars Handbook. I find the book very well and pragmatically written, yet I run into problems from time to time.
Today I ran into the problem of the WSL2 Ubuntu instance quickly outgrowing the available disk space.
In particular:
- I made a list of files to download, "sra_list.txt"
- downloaded the YAP1 and IgG files for ChIP-Seq using this command:
cat sra_list.txt | parallel -j 50 'prefetch {}'
- Each sra file is in its own subfolder
- Thus I try to convert them into fastq files using this command:
cat sra_list.txt | parallel -j 4 'fasterq-dump {}/{}.sra --split-files --threads 6'
However, at this step I run out of space on this more powerful computer we have.
What is the correct way to restrict the WSL instance on Windows? I guess many people must have encountered this problem(?)
Or is it because I am too greedy, and running in parallel makes the problem worse, so that going one by one would be better?
Thank you very much for your help
Best Regards Vladimir Vinarsky The Curious Mechanobiologist
How many cores, how much RAM and disk space do you have on this computer? WSL2 allocates a TB of disk space so as long as you have the resources you should not be running out of "space". See --> https://learn.microsoft.com/en-us/windows/wsl/disk-space
What is the exact error you are getting? Even if you start more processes you probably have one GB ethernet (hopefully you are not trying to download over WiFi) and that is likely getting saturated.
Thank you,
I have 32 GB RAM, 16 cores, 32 threads, and some 120 GB on the C:\ drive (which is the SSD).
I think the download and processing are okay; the only problem is the limited SSD storage, which is around 120 GB.
I have since also tried to mount either the folders alone or the whole WSL2 instance on HDDs, which worked well for the download, but the following steps (fastqc, alignment etc.) get extremely slow on HDD (like 1min30 vs 30+ min), so I am sticking with the SSD and doing it "per partes" (in parts).
It got a bit confusing because the WSL2 storage has the strange property of only expanding, without releasing the space after files get deleted.
But now I understand it is explained by the initial allotment of 1 TB, so I am okay with it.
I am keeping an eye on the storage space and then compressing the WSL2 virtual drive, which I placed in
C:\WSL\Ubuntu_01
using these handy commands in PowerShell as administrator
Thanks, Vladimir
It depends on how much data you are trying to download and the storage space left on your PC's C:\ drive, as it's the default location for WSL to store files. You can change the download location to a more appropriate drive by browsing the /mnt directory.

I think there is a certain requirement for temporary disk space during prefetch, and you are likely blowing through it by running this in parallel. My guess is you either won't have a problem by running the commands individually, or it will happen later in the process.
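
To make the "one at a time" suggestion concrete, here is a minimal sketch and not a tested pipeline. It processes each accession in sra_list.txt sequentially, checks free space before each one, and deletes the .sra file once the FASTQ files are written, so that only one run's intermediate files are on disk at a time. The function names `free_gb` and `process_accessions`, the 20 GB default threshold, and the gzip step are my own assumptions, not something from the Handbook. Note that fasterq-dump also writes temporary files while it runs, so the real peak usage per run is larger than the final FASTQ size.

```shell
#!/usr/bin/env bash

# Hypothetical helper: free space, in whole GB, on the filesystem holding a directory.
free_gb() {
    df -Pk "${1:-.}" | awk 'NR==2 { print int($4 / 1048576) }'
}

# Hypothetical helper: download and convert accessions one by one.
#   $1 = file with one accession per line
#   $2 = minimum free space in GB required before starting a run (assumed default: 20)
process_accessions() {
    local list="$1" min_free="${2:-20}" acc
    # Read the list on file descriptor 3 so the tools cannot consume it via stdin.
    while read -r acc <&3; do
        [ -z "$acc" ] && continue
        if [ "$(free_gb .)" -lt "$min_free" ]; then
            echo "Less than ${min_free} GB free, stopping before ${acc}" >&2
            return 1
        fi
        # Same commands as in the question, run sequentially, followed by cleanup.
        prefetch "$acc" \
            && fasterq-dump "${acc}/${acc}.sra" --split-files --threads 6 \
            && gzip "${acc}"*.fastq \
            && rm -r "${acc}" \
            || { echo "Failed on ${acc}" >&2; return 1; }
    done 3< "$list"
}
```

After sourcing the file, it would be called as `process_accessions sra_list.txt 20`. This trades speed for a predictable disk footprint; if there turns out to be enough headroom, the same per-accession steps could be handed back to parallel with a small `-j` value.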