4.8 years ago
rleach
I've seen R and Python code for doing this conversion, but (for setting up a pipeline) is there an established command-line utility or Galaxy tool for converting the SPLiT-seq output (sparse Matrix Market format) into a format that can be supplied to the Seurat Galaxy wrapper (which appears to take a gene-by-cell TSV)?
I wrote a quick Perl one-liner to do the conversion myself, but when I supplied the result to Seurat, it complained about duplicates. What constitutes a duplicate? The gene and cell IDs are all unique. Perhaps I got some detail of the format wrong?
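For reference, here is a minimal sketch of the conversion in Python, using only the standard library. It is an illustration under stated assumptions, not an established tool: the function names (`read_mtx`, `mtx_to_tsv`, `duplicated`) are made up for this example, the matrix is assumed to be in the sparse "coordinate" layout with a value on every entry line, the gene and cell labels are assumed to be already loaded as lists, and the `gene` label in the header's first column is a guess at what the wrapper accepts. Whether rows are genes or cells depends on the tool that wrote the file, so that is left as a parameter. The `duplicated` helper just reports repeated labels, which may or may not be what Seurat is objecting to.

```python
import io
from collections import Counter


def read_mtx(handle):
    """Parse a Matrix Market coordinate file into (n_rows, n_cols, entries)."""
    header = handle.readline()
    if not header.startswith("%%MatrixMarket"):
        raise ValueError("missing %%MatrixMarket header")
    if "coordinate" not in header:
        raise ValueError("only the sparse 'coordinate' layout is handled")
    # Skip comment and blank lines until the size line is reached.
    line = handle.readline()
    while line.startswith("%") or (line and not line.strip()):
        line = handle.readline()
    n_rows, n_cols, _n_entries = (int(x) for x in line.split())
    entries = {}
    for line in handle:
        if not line.strip():
            continue
        i, j, value = line.split()
        # Matrix Market indices are 1-based; store them 0-based.
        entries[(int(i) - 1, int(j) - 1)] = value
    return n_rows, n_cols, entries


def mtx_to_tsv(mtx, genes, cells, out, rows_are_genes=True):
    """Write a dense gene-by-cell TSV; missing entries become 0."""
    n_rows, n_cols, entries = read_mtx(mtx)
    if not rows_are_genes:
        # Assumption: the file is cell-by-gene, so transpose it.
        entries = {(j, i): v for (i, j), v in entries.items()}
        n_rows, n_cols = n_cols, n_rows
    if (n_rows, n_cols) != (len(genes), len(cells)):
        raise ValueError("label counts do not match matrix dimensions")
    # Assumption: the first header field is a column label for the gene IDs.
    out.write("\t".join(["gene"] + list(cells)) + "\n")
    for g, gene in enumerate(genes):
        row = [entries.get((g, c), "0") for c in range(n_cols)]
        out.write("\t".join([gene] + row) + "\n")


def duplicated(labels):
    """Return the labels that occur more than once, sorted."""
    return sorted(label for label, n in Counter(labels).items() if n > 1)


# Tiny in-memory demo: 2 genes x 3 cells with 3 non-zero entries.
demo = io.StringIO(
    "%%MatrixMarket matrix coordinate integer general\n"
    "%\n"
    "2 3 3\n"
    "1 1 5\n"
    "2 3 7\n"
    "1 3 1\n"
)
result = io.StringIO()
mtx_to_tsv(demo, ["GeneA", "GeneB"], ["cell1", "cell2", "cell3"], result)
print(result.getvalue(), end="")
```

Running `duplicated` on the gene list and on the cell list before writing the table is a quick way to rule out repeated labels, for example gene symbols that map to more than one gene ID. Note that the dense output grows as genes times cells, so this sketch only suits modest matrices.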
Seurat supports mtx files. I'm not sure how Galaxy has it set up, but it's valid input.
Yes, I saw that it has a Read10X method that seems like it would work. Unfortunately, the current Galaxy wrapper doesn't have that incorporated. I just added a comment on a Galaxy issue asking for it to be added as a supported input type. In the meantime, I'm still hoping to find either a command-line utility or a Galaxy wrapper to accomplish this.
Basically, I want to either write a shell script, accompanied by a conda environment with the necessary tools, that users can install and run in their accounts on our cluster, or else provide a Galaxy workflow that does the same thing. I could include an R script I suppose, but that just doesn't feel as clean as I'd like it to be...