Running out of disk space with WSL2
7 weeks ago

Hello all :),
I am trying to get started with ChIP-Seq as described in the Biostars Handbook. I find the book very well and pragmatically written, yet I run into problems from time to time.

Today I ran into the problem of the WSL2 Ubuntu instance quickly outgrowing the available disk space.

In particular:

  1. I made a list of files to download, "sra_list.txt".
  2. I downloaded the YAP1 and IgG files for ChIP-Seq using this command: cat sra_list.txt | parallel -j 50 'prefetch {}'
  3. Each .sra file ends up in its own subfolder.
  4. I then try to convert them into FASTQ files using this command: cat sra_list.txt | parallel -j 4 'fasterq-dump {}/{}.sra --split-files --threads 6'

However, at this step I run out of space on this more powerful computer we have.

What is the correct way to restrict the WSL instance on Windows? I guess many people must have encountered this problem(?)

Or is it because I am too greedy, and running in parallel makes the problem worse, so that converting the files one by one would be better?
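For what it is worth, the "one by one" variant could be sketched roughly as below. This is only an illustration, not a tested recipe: MIN_FREE_GB and DRY_RUN are hypothetical variables made up for this sketch (they are not sra-tools options), and the fasterq-dump call simply reuses the flags from the command above.

```shell
# Sketch: convert accessions one at a time and stop early when free space is low.
# MIN_FREE_GB and DRY_RUN are hypothetical knobs of this sketch, not sra-tools options.

# Print the free space (whole GiB) on the filesystem that holds directory $1.
free_gb() {
  df -Pk "$1" | awk 'NR==2 {print int($4 / 1048576)}'
}

# Convert every accession listed in file $1, one after another.
convert_sequentially() {
  local list="$1" min_free="${MIN_FREE_GB:-20}" acc
  while read -r acc; do
    [ -z "$acc" ] && continue            # skip blank lines
    if [ "$(free_gb .)" -lt "$min_free" ]; then
      echo "Stopping before $acc: less than ${min_free} GiB free" >&2
      return 1
    fi
    # With DRY_RUN=1 the command is only printed, not executed.
    ${DRY_RUN:+echo} fasterq-dump "$acc/$acc.sra" --split-files --threads 6
  done < "$list"
}

# Usage: convert_sequentially sra_list.txt
```

Running with DRY_RUN=1 first shows which commands would be executed, so the list can be checked before any disk space is consumed.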

Thank you very much for your help

Best Regards Vladimir Vinarsky The Curious Mechanobiologist

Memory WSL2 capacity Disc • 722 views

How many cores, how much RAM, and how much disk space do you have on this computer? WSL2 allocates a virtual disk with a maximum size of 1 TB by default, so as long as you have the resources you should not be running out of "space". See https://learn.microsoft.com/en-us/windows/wsl/disk-space

What is the exact error you are getting? Even if you start more processes, you probably have 1 Gb Ethernet (hopefully you are not trying to download over WiFi), and that is likely getting saturated.


Thank you,
I have 32 GB RAM, 16 cores, 32 threads, and some 120 GB on the C:\ drive (which is the SSD).

I think the download and processing are okay; the only problem is the limited SSD storage, which is around 120 GB.

Since then I have also tried mounting either individual folders or the whole WSL2 instance on HDDs. That worked well for the download, but the following steps (FastQC, alignment, etc.) get extremely slow on an HDD (roughly 1 min 30 s vs. 30+ min), so I am sticking with the SSD and doing it "per partes" (piece by piece).

It was a bit confusing that the WSL2 storage has the strange property of only expanding, without releasing space after files are deleted.
But now I understand that this is explained by the initial allotment of 1 TB, so I am okay with it.
I am keeping an eye on the storage space and then compacting the WSL2 virtual disk, which I placed in C:\WSL\Ubuntu_01, using these handy commands in PowerShell as administrator:

wsl --shutdown
Optimize-VHD -Path "C:\WSL\Ubuntu_01\ext4.vhdx" -Mode Full
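
Before shutting WSL down, a rough check from inside Ubuntu can suggest whether compacting is worth it. The sketch below compares the size of the .vhdx file with the space actually used in the Linux root filesystem. It assumes the C:\ drive is mounted at /mnt/c (the usual WSL default) and that GNU stat is available; the function names are made up for this example, and the result is only an approximate upper bound.

```shell
# Sketch: estimate how much space compacting the WSL2 virtual disk might free.
# VHDX is the path from the post as seen from inside WSL (assumes C:\ is mounted at /mnt/c).
VHDX="${VHDX:-/mnt/c/WSL/Ubuntu_01/ext4.vhdx}"

# Size of the file $1 in MiB (uses GNU stat).
vhdx_mib() { echo $(( $(stat -c %s "$1") / 1048576 )); }

# Space currently used inside the Linux root filesystem, in MiB.
used_mib() { df -Pk / | awk 'NR==2 {print int($3 / 1024)}'; }

# Rough upper bound (MiB) on what compaction could hand back to Windows.
reclaimable_mib() {
  local diff=$(( $(vhdx_mib "$1") - $(used_mib) ))
  [ "$diff" -lt 0 ] && diff=0
  echo "$diff"
}

# Usage: reclaimable_mib "$VHDX"
```

If the number is small, the virtual disk is already about as compact as it can get and deleting more files is the only way to free space.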

Thanks Vladimir


It depends on how much data you are trying to download and the storage space left on your PC's C:\ drive, as it's the default location for WSL to store files. You can change the download location to a more appropriate drive by browsing the /mnt directory.


I think there is a certain requirement for temporary disk space during prefetch, and you are likely blowing through it by running this in parallel. My guess is you either won't have a problem by running the commands individually, or it will happen later in the process.
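
On the temporary-space point: fasterq-dump also writes scratch files while converting, and it has --temp and --outdir options to control where scratch files and results go. A minimal sketch, assuming a roomier drive is mounted at the hypothetical path /mnt/d and using a made-up DRY_RUN variable to preview the command:

```shell
# Sketch: keep fasterq-dump scratch files off the nearly full SSD.
# Both paths are hypothetical; --temp and --outdir are fasterq-dump options.
SCRATCH="${SCRATCH:-/mnt/d/sra_tmp}"   # assumed: a roomier drive mounted under /mnt
OUTDIR="${OUTDIR:-$PWD/fastq}"         # where the FASTQ files should end up

# Convert a single accession ($1), writing scratch files to $SCRATCH.
dump_one() {
  local acc="$1"
  mkdir -p "$SCRATCH" "$OUTDIR" || return 1
  # With DRY_RUN=1 the command is only printed, not executed.
  ${DRY_RUN:+echo} fasterq-dump "$acc/$acc.sra" --split-files --threads 6 \
      --temp "$SCRATCH" --outdir "$OUTDIR"
}

# Usage: dump_one <accession>
```

Scratch files on a slow HDD will make the conversion slower, so this trades speed for space; the final FASTQ files can still live on the SSD.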
