RNA-Sequencing on RStudio vs batch scripts on Terminal
2.7 years ago
j_eag ▴ 10

Hey there

I'm a beginner to all of this, so apologies if I'm slow with the terminology. I'm teaching myself RNA-seq, and so far the online lessons have been doing everything** on the macOS Terminal with batch scripts. While I'm fine with using the terminal, I was hoping I could do all of this in RStudio (or even Python), within ONE script/file. Of course, the script would contain several functions (one for downloading, one for trimming, etc.), but since I can't find any resources or sample R code for this, I'm worried I'm thinking about it all wrong.

What do you guys think? Is it silly to do everything on R? Is there no way I can connect RStudio to an HPC?

**downloading files, quality check, trimming, indexing, alignment - haven't gotten to visualization and counting yet

fastq Rstudio RNA-seq • 948 views

Have a look at rstudio-server.


HPC and interactive analysis are usually not really compatible: you first need to submit a job to book a node, and the node then needs to run RStudio, which in turn has to be available as a module (often not the case). Installing it manually is a pain without admin rights; I'm not sure it's even possible, as it has a ton of dependencies and afaik needs to copy some stuff to /var etc. That is all cumbersome. Is the job really so demanding that you need an HPC as the backend, or can you do it on a local machine?

The entire trimming and alignment part is usually not done in R; one uses pipelines that are triggered via the Unix command line. In R there is Rsubread, which wraps the subread aligner and featureCounts for count table creation, but this takes time to run, so it should be submitted as a batch job rather than having an RStudio window open for it.

As for RStudio on HPC, I would ask the HPC admin whether they have rstudio-server in place. You can also run it via Singularity (a container engine), but this is a bit more advanced than you probably are at this point.

Interactive analysis is usually done locally. So get the alignment etc. done on the HPC, then download the count table and analyse it locally. RNA-seq analysis like DEGs and clustering is not hardware-intensive unless you have thousands of samples. Hope this helps a bit.
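To make the "submit it as a batch job" part concrete, here is a minimal sketch in Python that only writes out the text of a job script. It assumes the cluster uses the Slurm scheduler (the thread does not say which one is in use), and the job name, resource numbers, and `align.sh` are hypothetical placeholders.

```python
def render_job_script(job_name, command, cpus=8, mem_gb=32, hours=12):
    """Build the text of a Slurm batch script (ASSUMPTION: the cluster runs Slurm)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={hours:02d}:00:00",
        "set -euo pipefail",  # stop the job at the first failing command
        command,
    ]
    return "\n".join(lines) + "\n"


# "align.sh" is a hypothetical script name standing in for the real alignment step.
script = render_job_script("align_sample1", "bash align.sh sample1")
print(script)
```

On a Slurm cluster the printed text would be saved to a file and handed to `sbatch`; other schedulers use different directives, so check with the HPC admin first.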


I typically use R Markdown for running my R analyses, and when necessary I use system() to run bash scripts for certain steps in a pipeline. I'm not saying it's the one and only way to integrate the two, but it works for me.
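Since the question also mentions Python, the same idea of calling out to the shell from one driver script might look roughly like this with the standard `subprocess` module. This is a sketch rather than a translation of the R workflow above; `run_step` is a hypothetical helper name, and the `echo` command stands in for a real bash script.

```python
import subprocess


def run_step(command):
    """Run one shell command, raise if it exits non-zero, and return its stdout."""
    result = subprocess.run(
        command,
        shell=True,           # let the shell parse the command, like R's system()
        check=True,           # raise CalledProcessError on a non-zero exit code
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode bytes to str
    )
    return result.stdout


# Stand-in command; a real step might be "bash trim_reads.sh" (hypothetical name).
output = run_step("echo trimming finished")
print(output.strip())
```

Using `check=True` means a failed step stops the script instead of silently feeding bad output into the next one.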

2.7 years ago
seidel 11k

Is it silly to do everything on R?

Processing of RNA-Seq data typically consists of several steps involving several different programs, and some of the steps can take hours to days. Bash scripts are a good way to manage running several different programs and piping things together in various steps. While you could use R as your scripting language, why would you?

On the other hand, RStudio is meant to function as an interactive interface where you basically have a conversation with your data. With RStudio, people are typically present, actively working on their data. Question, manipulate, visualize, repeat. This is a natural way to process RNA-Seq data after counts have been tallied on genes and you want to begin exploring it and applying stats. If you have a standard thing you do with R to process the data, you don't need RStudio; you can simply run R like any script and process your data non-interactively.

In summary, given that the initial steps in RNA-Seq processing are computation-heavy, time-consuming, and non-interactive, while the later steps are interaction-oriented, it doesn't make sense to me to use RStudio for the first part. I would perform the first part with bash or python (given their strength as scripting languages) to go from reads to BAM files or a count table, and focus on the second part with R or RStudio, depending on how you want to process your data.
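As a rough illustration of the "one script, one function per step" idea for the first part, a Python driver could build the command lines and run them in order. The tool choice here (FastQC, fastp, HISAT2, samtools, featureCounts), the single-end reads, and all file names are assumptions made for this sketch and not something prescribed in the thread; the tools must already be installed and the output directory must exist before a real run.

```python
import shlex
import subprocess
from pathlib import Path


def build_commands(sample, fastq, index, annotation, outdir):
    """Return the command lines for one single-end sample, in execution order."""
    out = Path(outdir)
    trimmed = out / f"{sample}.trimmed.fastq.gz"
    sam = out / f"{sample}.sam"
    bam = out / f"{sample}.sorted.bam"
    counts = out / f"{sample}.counts.txt"
    return [
        ["fastqc", "-o", str(out), str(fastq)],                    # quality check
        ["fastp", "-i", str(fastq), "-o", str(trimmed)],           # trimming
        ["hisat2", "-x", str(index), "-U", str(trimmed), "-S", str(sam)],  # alignment
        ["samtools", "sort", "-o", str(bam), str(sam)],            # sorted BAM
        ["featureCounts", "-a", str(annotation), "-o", str(counts), str(bam)],  # counting
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing step


# Hypothetical inputs; with dry_run=True nothing is executed, only printed.
commands = build_commands("s1", "s1.fastq.gz", "idx/genome", "genes.gtf", "results")
run_pipeline(commands, dry_run=True)
```

The dry-run default makes it easy to check the commands before committing hours of compute, and the same script can be called from a batch job on the HPC, leaving R or RStudio for the count table afterwards.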
