4.5 years ago
mahejabeen.nidhi
$ conda install -c bioconda fastqc
Collecting package metadata (current_repodata.json): failed
CondaHTTPError: HTTP 000 CONNECTION FAILED for url <https://conda.anaconda.org/bioconda/linux-64/current_repodata.json>
Elapsed: -
An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.
ConnectionError(MaxRetryError("HTTPSConnectionPool(host='conda.anaconda.org', port=443): Max retries exceeded with url: /bioconda/linux-64/current_repodata.json (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fe37c3ff050>: Failed to establish a new connection: [Errno 101] Network is unreachable'))"))
I am using my university's HPC system (details below) for RNA-Seq analysis. However, as you can see above, I cannot install packages because of the CondaHTTPError. How can I resolve this?
$ conda info
     active environment : None
        user config file : /home/mhnidhi2/.condarc
  populated config files : /home/mhnidhi2/.condarc
           conda version : 4.7.12
     conda-build version : 3.18.9
          python version : 3.7.4.final.0
        virtual packages :
        base environment : /opt/ohpc/pub/anaconda3  (read only)
            channel URLs : https://conda.anaconda.org/bioconda/linux-64
                           https://conda.anaconda.org/bioconda/noarch
                           https://conda.anaconda.org/conda-forge/linux-64
                           https://conda.anaconda.org/conda-forge/noarch
                           https://repo.anaconda.com/pkgs/main/linux-64
                           https://repo.anaconda.com/pkgs/main/noarch
                           https://repo.anaconda.com/pkgs/r/linux-64
                           https://repo.anaconda.com/pkgs/r/noarch
           package cache : /opt/ohpc/pub/anaconda3/pkgs
                           /home/mhnidhi2/.conda/pkgs
        envs directories : /home/mhnidhi2/.conda/envs
                           /opt/ohpc/pub/anaconda3/envs
                platform : linux-64
              user-agent : conda/4.7.12 requests/2.22.0 CPython/3.7.4 Linux/3.10.0-957.el7.x86_64 centos/7.6.1810 glibc/2.17
                 UID:GID : 1311:1001
              netrc file : None
            offline mode : False
The "[Errno 101] Network is unreachable" part of the error suggests the node has no outbound internet access, so you should talk to your cluster admin about the connection error.
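Before emailing, it can help to confirm that the node really has no outbound access. A minimal check (assuming curl is available on your cluster, using the same URL from the error message) would be:

$ curl -sI https://conda.anaconda.org/bioconda/linux-64/current_repodata.json | head -n 1
$ curl -sI https://repo.anaconda.com | head -n 1

If these fail on a compute node but succeed on the login node, only the login node has internet access, which is a common HPC setup and something the admin can confirm.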
Another way to resolve these conda-related connection issues (if it isn't just an intermittent problem) is to build the environment in a Singularity container on your local machine (laptop/desktop), where you have control over the connection, and then move that container to the HPC.
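As a rough sketch of what that looks like (the file names fastqc.def / fastqc.sif and the continuumio/miniconda3 base image are just one common choice, not something specific to your cluster):

fastqc.def:
Bootstrap: docker
From: continuumio/miniconda3

%post
    # install the tool from bioconda inside the container
    conda install -y -c conda-forge -c bioconda fastqc
    conda clean -a -y

%runscript
    exec fastqc "$@"

Build it locally and copy it over (the username and HPC hostname below are placeholders):

$ sudo singularity build fastqc.sif fastqc.def
$ scp fastqc.sif <your-user>@<hpc-login-node>:~/containers/

On the HPC you would then run something like:

$ singularity exec ~/containers/fastqc.sif fastqc --version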
That is an amazing idea! I think I will try that. I am new to working with HPC clusters. Do you have a git repo with examples of building containers like that, or of other HPC tasks? Thank you again!
Do yourself a favor and solve the underlying problem with your cluster admin. If you are going back and forth between a local machine and an HPC, you have to push a new container to the HPC every time you want to install a new package, which quickly becomes laborious.
Very fair point. I already emailed the admin. Hopefully he can resolve this issue. Thank you!
If your cluster does not have direct/external internet access, then many things will not work. Perhaps you need to use a proxy; again, this is local information you would need to find out from your admin or the cluster documentation.
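If it turns out the cluster does route traffic through a proxy, conda can be pointed at it either via environment variables or in ~/.condarc; the host and port below are placeholders you would get from your admin:

$ export http_proxy=http://proxy.example.edu:3128
$ export https_proxy=http://proxy.example.edu:3128

or, equivalently, in ~/.condarc:

proxy_servers:
    http: http://proxy.example.edu:3128
    https: http://proxy.example.edu:3128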
Resolving it with the admin is the best course; my suggestion was an alternative in case that didn't work.
Re: laboriousness, I think it depends on your use case. If you have a pipeline you've already developed and you know the required packages, then containers are equally time-economical. For development a conda env is ideal, and it can be packaged inside a container once it's 'complete'. The benefits of that containerised version show up in a production/certification setting, and also specifically when using Nextflow, which in my experience takes good advantage of HPC (see the sketch below).
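As a rough sketch of that "package the finished env" step (the env name rnaseq and the file names environment.yml / rnaseq.def / rnaseq.sif are illustrative):

$ conda env export -n rnaseq --no-builds > environment.yml

rnaseq.def:
Bootstrap: docker
From: continuumio/miniconda3

%files
    environment.yml /environment.yml

%post
    # recreate the exported env inside the container
    conda env update -n base -f /environment.yml
    conda clean -a -y

Once built into rnaseq.sif, a Nextflow run can be pointed at it from nextflow.config:

singularity.enabled = true
process.container = '/path/to/rnaseq.sif'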
I haven't got any guides on how to containerise conda envs to hand; I'll write one up and post it if you like?