Dear all,
Has anyone worked with BD Rhapsody single cell data before?
I'm sure there are alternatives such as Dropseqpipe, but has anyone found a decent solution for these data? The library design is not that simple either, with variable bases in the V2 scheme: https://teichlab.github.io/scg_lib_structs/methods_html/BD_Rhapsody.html
Thanks
Edit: so local usage is possible, as LChart says below.
You basically have to install a Common Workflow Language (CWL) runner following their PDF instruction guide, then download their repository, which contains just yml and cwl files, from https://bitbucket.org/CRSwDev/cwl/src/master/. Then you edit the yaml files to point to your input files (should be easy).
The next step is to run it with, e.g.,
cwl-runner --outdir out1 rhapsody_wta_1.9.1.cwl template_wta_1.9.1.yml
and then experience cryptic CWL errors. I haven't been successful yet, and am not sure whether the problem is with the CWL from BD Rhapsody or with my yml syntax (which appears valid).
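One thing that can be ruled out before blaming the CWL is a wrong input path in the yml. Below is a minimal sketch of such a check. It assumes the input files are given on lines of the form `location: "..."` or `path: "..."`, which is the usual CWL input style but is an assumption about the BD template, and the function name `check_cwl_inputs` is made up for this example. Relative paths are resolved against the directory of the yml file.

```shell
# Check that every file referenced by "location:" or "path:" in a CWL
# input yml exists. Hypothetical helper, not part of the BD repository.
check_cwl_inputs() {
    yml=$1
    base=$(dirname "$yml")
    missing=0
    # Extract the value after location:/path:, without quotes or trailing comments.
    paths=$(sed -nE 's/^[[:space:]-]*(location|path):[[:space:]]*"?([^"#]*[^"#[:space:]])"?.*$/\2/p' "$yml")
    while IFS= read -r f; do
        [ -n "$f" ] || continue
        case $f in
            /*) p=$f ;;          # absolute path, use as is
            *)  p=$base/$f ;;    # relative to the yml file
        esac
        if [ ! -e "$p" ]; then
            echo "missing: $f" >&2
            missing=$((missing + 1))
        fi
    done <<EOF
$paths
EOF
    [ "$missing" -eq 0 ]
}

# Usage, with the file names from the post:
#   check_cwl_inputs template_wta_1.9.1.yml && \
#       cwl-runner --outdir out1 rhapsody_wta_1.9.1.cwl template_wta_1.9.1.yml
```

If `cwl-runner` is provided by cwltool, `cwltool --validate rhapsody_wta_1.9.1.cwl` is another quick check that the workflow file itself parses.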
Edit 2: I was successful with the cwl 2.0 version some time ago. Previous versions did not work for me.
Edit 3: Using a custom reference
Coming back to this in 2024 with new data: make sure you don't overlook the file make_rhap_reference_2.0.cwl in the bitbucket repo subfolder. If your GTF fails because its transcripts and exons do not have the expected biotype, turn GTF filtering off:
cwl-runner make_rhap_reference_2.0.cwl --Genome_fasta chromosomes.fasta --Gtf some.gtf --Filtering_off
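Before turning the filter off, it can help to see which biotype-like attributes the GTF actually carries. The sketch below only summarises column 9. It does not know which attribute name or values the BD filter expects, so treat it as a diagnostic, and note that the function name `gtf_type_keys` is made up for this example.

```shell
# Count biotype-like attributes (e.g. gene_biotype, gene_type,
# transcript_type) and their values in column 9 of a GTF.
gtf_type_keys() {
    grep -v '^#' "$1" | cut -f9 \
        | grep -oE '[A-Za-z_]*(biotype|_type) "[^"]*"' \
        | sort | uniq -c | sort -rn
}

# Usage, with the file name from the post:
#   gtf_type_keys some.gtf
```

Empty output means no such attribute was found at all, which would be consistent with the filter rejecting every record.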
Edit 4: TMPDIR
I had problems with TMPDIR because it was on a partition that Docker could not mount, and I got a permission denied error when the pipeline was extracting the reference.tar.gz, even though the fasta and gtf were now OK. Setting TMPDIR to a large local directory (non-NFS etc.) works.
Set TMPDIR to a large partition, since the pipeline tends to fill up /tmp:
export TMPDIR=/path/to/your/large/partition
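A quick sanity check on the chosen directory can save a failed run. The sketch below verifies that the directory is writable and reports free space. The 100 GB default threshold is an arbitrary assumption, the function name `check_tmpdir` is made up for this example, and the check cannot tell whether Docker is able to mount the partition.

```shell
# Check that a temp directory is writable and has enough free space.
# Arguments: directory (default: $TMPDIR or /tmp), minimum free GB (default: 100).
check_tmpdir() {
    dir=${1:-${TMPDIR:-/tmp}}
    min_gb=${2:-100}
    if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
        echo "not a writable directory: $dir" >&2
        return 1
    fi
    # df -Pk prints POSIX-format output in 1024-byte blocks; column 4 is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    avail_gb=$((avail_kb / 1024 / 1024))
    echo "$dir: ${avail_gb} GB free"
    [ "$avail_gb" -ge "$min_gb" ]
}

# Usage:
#   export TMPDIR=/path/to/your/large/partition
#   check_tmpdir "$TMPDIR" 100
```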
Hello!
Thank you for this great solution. I used the same approach to run the BD 2.2 version (multiome). In my case, there was no need to extract the PATH to node_alpine; I just loaded the module from the cluster. In case it is useful, I pulled the image using singularity:
singularity pull docker://bdgenomics/rhapsody:2.2
I also use a conda env for the CWL tools.
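For anyone wanting to reproduce this, a sketch of how such a run could look with cwltool, whose `--singularity` option runs the containers through Singularity instead of Docker. The wrapper name `run_rhapsody_singularity` is made up for this example, and the workflow and input file names are left as arguments because they depend on the pipeline version.

```shell
# Run a Rhapsody CWL workflow through Singularity instead of Docker.
# Arguments: workflow .cwl, input .yml, output directory (default: out1).
run_rhapsody_singularity() {
    workflow=$1
    inputs=$2
    outdir=${3:-out1}
    if ! command -v cwltool >/dev/null 2>&1; then
        echo "cwltool not found; activate the conda env that provides it" >&2
        return 127
    fi
    cwltool --singularity --outdir "$outdir" "$workflow" "$inputs"
}

# Usage (placeholder file names, substitute the ones for your version):
#   run_rhapsody_singularity your_workflow.cwl your_inputs.yml out1
```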
Cheers
Glad it worked for you with the updated version of the Rhapsody pipeline ;).
It would be great if at some point we could get it running in parallel using toil.
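As a starting point for that, Toil ships a CWL front end, `toil-cwl-runner`, which takes the same workflow and input files as `cwl-runner`. The sketch below is untested with the Rhapsody workflows. The wrapper name `run_rhapsody_toil` is made up, `slurm` is only an example batch system, and the job store location is an assumption (on a cluster it needs to sit on a filesystem shared by all nodes).

```shell
# Submit a CWL workflow through Toil so that steps can run as cluster jobs.
# Arguments: workflow .cwl, input .yml, output dir (default: out1),
#            batch system (default: slurm, as an example only).
run_rhapsody_toil() {
    workflow=$1
    inputs=$2
    outdir=${3:-out1}
    batch=${4:-slurm}
    if ! command -v toil-cwl-runner >/dev/null 2>&1; then
        echo "toil-cwl-runner not found; install Toil with its CWL extra" >&2
        return 127
    fi
    # --jobStore holds Toil's workflow state; the PID suffix keeps a fresh run
    # from colliding with a job store left over from an earlier one.
    toil-cwl-runner --singularity \
        --jobStore "./toil-jobstore-$$" \
        --batchSystem "$batch" \
        --outdir "$outdir" "$workflow" "$inputs"
}

# Usage (placeholder file names):
#   run_rhapsody_toil your_workflow.cwl your_inputs.yml out1 slurm
```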