Forum: How do people usually perform scRNA-seq analysis on HPC?
16 days ago
Tundup • 0

I am a beginner and want to know how people usually perform single-cell RNA-seq analysis on clusters: through an RStudio GUI on the HPC, or by running Rscript from the HPC terminal?

I heard that one approach is to write the script section by section, executing each section and fixing errors as you go, and then finally combine everything into one script for the final run. The same script can then be applied to other samples or cells. Also, is this the way to develop a new pipeline?

Any comments would be appreciated.

hpc Single-cell cluster scRNA-Seq • 434 views

If you have an HPC, then there must be people who administer it. You can work with them to see whether any of this is already available on your cluster (there may already be RStudio, etc.). You could then ask them to install the necessary packages if they are not already present; you are unlikely to be the only person doing scRNA-seq at your institution. This may lead to a much more efficient and pleasant user experience.

16 days ago

You can submit a job on your HPC that starts an RStudio or JupyterLab instance inside a container environment such as Docker or Singularity. Once the instance is running, you will find its URL in the job's log file. You can then set up a tunnel (see below) between the HPC and your workstation and paste the instance URL into your favorite web browser.

ssh -t -t name@hpc.dot -L port:localhost:port ssh node -L port:localhost:port

Everything you do in this RStudio instance will be processed by the HPC with the resources you requested for the job.
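
The tunneling step above can be sketched as a pair of small shell helpers. This is only an illustration under stated assumptions: the function names, the login host `name@hpc.example`, the node name `node01`, and port 8787 are hypothetical placeholders, and the sketch presumes the job log contains the instance address as a plain http(s) URL.

```shell
#!/usr/bin/env bash
# Sketch only: all names, hosts, and ports used with these helpers
# are hypothetical placeholders, not values from any real cluster.

# Print the first http(s) URL found in a job log file.
find_instance_url() {
  local log_file="$1"
  grep -oE 'https?://[^[:space:]]+' "$log_file" | head -n 1
}

# Print (do not run) the two-hop tunnel command from the post above:
# workstation -> login host -> compute node, same port on every hop.
build_tunnel_cmd() {
  local login="$1" node="$2" port="$3"
  printf 'ssh -t -t %s -L %s:localhost:%s ssh %s -L %s:localhost:%s\n' \
    "$login" "$port" "$port" "$node" "$port" "$port"
}
```

For example, `build_tunnel_cmd name@hpc.example node01 8787` prints `ssh -t -t name@hpc.example -L 8787:localhost:8787 ssh node01 -L 8787:localhost:8787`, which you can inspect before running it yourself. Printing the command rather than executing it keeps the sketch safe to try on clusters where port forwarding may not be allowed.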


This assumes that the HPC is set up to allow port forwarding, etc. Many HPCs are tightly controlled. Internet connection speed, firewalls, and VPNs can further constrain the user experience of this method. Something to keep in mind.


That is why I prefer Singularity over Docker at the moment: sudo privileges are not required to run a Singularity instance, whereas Docker requires them.

16 days ago
Dave Carlson ★ 2.1k

Your HPC cluster may provide an instance of Open OnDemand, which lets you access the cluster and run applications such as RStudio directly from your browser.

Because every scRNA-seq experiment is so different, I find it hard to write a one-size-fits-all script or pipeline for the analysis. Instead, I end up doing a lot of experimentation and iteration in RStudio. If your cluster admins provide Open OnDemand for running RStudio, then I would recommend using it there. If not, I would recommend asking them about it. It's not really a lot of work to set up, and it's extremely useful.

16 days ago
gglim ▴ 220
  1. Set up an R environment and install the essential packages on the HPC;
  2. Write scripts and run some tests in a local RStudio (excluding the memory-intensive parts);
  3. Run the whole pipeline on the HPC, simply with Rscript --vanilla my_script.R

BTW sometimes the first step can be the most difficult :)
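
Step 3 above could be wrapped in a batch job along these lines. This is a minimal sketch that assumes a SLURM scheduler, which the thread does not specify; the function name, job name, resource values, and file names are all hypothetical and would need adjusting for your cluster.

```shell
#!/usr/bin/env bash
# Sketch only: assumes a SLURM scheduler. The resource values,
# job name, and file names are hypothetical placeholders.

# Write a batch script that runs the given R script non-interactively.
write_job_script() {
  local r_script="$1" out_file="$2"
  cat > "$out_file" <<EOF
#!/bin/bash
#SBATCH --job-name=scrna
#SBATCH --cpus-per-task=8
#SBATCH --mem=64G
#SBATCH --time=12:00:00
#SBATCH --output=scrna_%j.log

# Run the whole pipeline, as in step 3 of the list above.
Rscript --vanilla ${r_script}
EOF
}
```

For example, `write_job_script my_script.R run_scrna.sh` creates the job file, which on a SLURM cluster would then be submitted with `sbatch run_scrna.sh`. Other schedulers use different directives, so treat the `#SBATCH` lines as placeholders.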


I would strongly recommend a containerized solution to make the software environment reproducible and portable.
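
As a rough illustration of the containerized approach, the same Rscript call can be run inside a Singularity image. The helper below is a sketch only: the function name, the image file `scrna.sif`, and the script name are hypothetical, and it assumes the image already contains R and the required packages.

```shell
#!/usr/bin/env bash
# Sketch only: the image and script names passed to this helper
# are hypothetical placeholders.

# Print (do not run) the command that would execute an R script
# inside a Singularity image.
container_rscript_cmd() {
  local image="$1" r_script="$2"
  printf 'singularity exec %s Rscript --vanilla %s\n' "$image" "$r_script"
}
```

For example, `container_rscript_cmd scrna.sif my_script.R` prints `singularity exec scrna.sif Rscript --vanilla my_script.R`. Because the image pins the R version and packages, the same command should behave the same way on any machine where Singularity is available.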
