3.0 years ago
Dhanusha
I tried RNA velocity with alevin on my smart-seq2 dataset. However, the log files show "Total 62.3886% of reads will be thrown away because of noisy Cellular barcodes". I also can't use kallisto | bustools, because the reads don't have UMIs. Any suggestions on how to proceed with RNA velocity for a smart-seq2 dataset?
That's the concern: since smart-seq2 doesn't have UMIs, I am not sure whether it is processing the reads correctly. So you are suggesting that STARsolo can detect spliced and unspliced mRNA from smart-seq2 data? I will try this. Thank you!
FYI, the devel version of kallisto | bustools can process smart-seq2 data (you'll need the devel version of kb-python and the devel version of kallisto). Happy to discuss this further.
I have tried kallisto | bustools for smart-seq2, and kb count seems to be working. How do I proceed from here? The following output files were generated: adata.h5ad, batch.txt, genes.txt, kb_info.json, matrix.abundance.mtx, matrix.cells, matrix.ec, matrix.fld.tsv, matrix.tcc.mtx, transcripts.txt, run_info.json
They have a tutorial, try that first: https://bustools.github.io/BUS_notebooks_R/velocity.html#preprocessing
By the way, the stable version of kallisto | bustools now includes smart-seq2 (no need to use the devel version).
In any case, it seems you have it working for the standard analysis, where introns aren't included in the reference. However, to produce an RNA velocity analysis, you must include introns. This means you must run both "kb ref" and "kb count" with the option --workflow lamanno. Note: including introns in the index makes kallisto use a lot of memory, so make sure you have plenty of RAM (~70 GB).
After that, you'll get both "spliced" and "unspliced" matrices. You get a loom file (be sure to specify --loom when running kb count) containing the two, which you can plug into a standard RNA velocity workflow.
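As a rough sketch of the two steps described above, the script below only prints a "kb ref" and a "kb count" command for the lamanno workflow so they can be checked before running. All file names (genome.fa, genes.gtf, index.idx, the FASTQ names) are placeholders, not taken from this thread. The technology string SMARTSEQ and the way FASTQ files are passed are assumptions that differ between kb-python versions, so check "kb --list" and "kb count --help" for your installation.

```shell
#!/usr/bin/env bash
# Sketch only: every file name here is a placeholder (an assumption).
GENOME=genome.fa     # genome FASTA
GTF=genes.gtf        # matching annotation
TECH=SMARTSEQ        # assumption: technology string depends on kb-python version

# Print the index-building command; introns are included via --workflow lamanno.
ref_cmd() {
  echo kb ref -i index.idx -g t2g.txt \
    -f1 cdna.fa -f2 intron.fa \
    -c1 cdna_t2c.txt -c2 intron_t2c.txt \
    --workflow lamanno "$GENOME" "$GTF"
}

# Print the counting command; --loom requests a loom file holding the
# spliced and unspliced matrices. FASTQ files are passed as arguments.
count_cmd() {
  echo kb count -i index.idx -g t2g.txt -x "$TECH" -o out \
    -c1 cdna_t2c.txt -c2 intron_t2c.txt \
    --workflow lamanno --loom "$@"
}

ref_cmd
count_cmd sample_R1.fastq.gz sample_R2.fastq.gz
```

Once the printed commands look right for your files, they can be run directly; the index step is the one that needs the large amount of RAM mentioned above.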
It gives this error message: "SMARTSEQ can not be used with workflow lamanno". Any suggestions?
Try installing the devel version of kb-python:
pip install git+https://github.com/pachterlab/kb-python@devel
And then try the lamanno workflow.
I was having the same issue as Dhanusha, and the devel version of kb-python works fine! Thank you, dsull.
OK, you're right, that option is currently not available. I'll try to get that fixed when I can.
In the meantime, you can run the workflow without kb-python; however, it won't be that straightforward.
Here's what you need to do to get spliced/unspliced matrices:
Try running a 10XV3 RNA velocity analysis in kb-python using the --dry-run option. It'll show you a list of commands to get an RNA velocity analysis going.
Use those exact commands for your smart-seq data EXCEPT add -m and --cm whenever running bustools count and ignore the "bustools whitelist" and "bustools correct" commands.
E.g. For the step: "bustools correct -o ./tmp/output.s.c.bus -w ./whitelist.txt ./tmp/output.s.bus", you'll want to instead run "mv ./tmp/output.s.bus ./tmp/output.s.c.bus" (effectively skipping that step).
I'm sorry this isn't that straightforward but hopefully kb-python will be updated to fix this soon.
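The manual edits described above could be applied mechanically with a small filter like the one below. It is only a sketch and assumes the commands from --dry-run have been saved one per line, and that the input BUS file is the last argument of "bustools correct" (as in the example above). The function name adapt_cmds is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: rewrite a saved list of dry-run commands for smart-seq data.
# Assumes one command per line on stdin; adapt_cmds is a made-up name.
adapt_cmds() {
  awk '
    # Skip whitelist generation entirely.
    $1 == "bustools" && $2 == "whitelist" { next }
    # Replace barcode correction with a plain rename: mv INPUT OUTPUT.
    $1 == "bustools" && $2 == "correct" {
      out = ""
      for (i = 3; i < NF; i++) if ($i == "-o") out = $(i + 1)
      print "mv " $NF " " out
      next
    }
    # Add -m and --cm to every bustools count call.
    $1 == "bustools" && $2 == "count" {
      sub(/bustools count/, "bustools count -m --cm")
    }
    # Everything else passes through unchanged.
    { print }
  '
}

# Usage with the example step from the post above:
echo 'bustools correct -o ./tmp/output.s.c.bus -w ./whitelist.txt ./tmp/output.s.bus' | adapt_cmds
# prints: mv ./tmp/output.s.bus ./tmp/output.s.c.bus
```

Review the rewritten commands by eye before executing them, since the real dry-run output may contain lines this filter does not anticipate.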