Hi All,
I'd like to quantify the proportions of spliced vs. unspliced transcripts at the transcript level, as described on this page (https://combine-lab.github.io/alevin-tutorial/2020/alevin-velocity/). That tutorial uses single-cell RNA sequencing data, so it runs salmon alevin. I followed the tutorial up to the indexing step and then used that index with salmon quant. I'm running this on an HPC with 200 GB of RAM, and after 24 hours the job was still running (6 FASTQ files), so I suspect something is wrong. Is it possible to run salmon alevin on total RNA-seq data for this quantification?
Thanks
If I understood correctly, I should make separate annotations for spliced and unspliced transcripts and index the genome separately for each, followed by salmon quant. Is that correct? The index I built previously has both sets of information together. It is taking much longer than I expected from salmon.
Make a transcript annotation file that contains the spliced transcripts, the unspliced transcripts, and the genome, with only the genome entries marked as decoys.
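To make the file layout concrete, here is a minimal Python sketch of one way to assemble such a combined FASTA plus a decoy list. The function names and all file names are hypothetical, chosen only for illustration. It relies on two things salmon expects: the decoy sequences come after the targets in the FASTA, and the decoy list is a plain text file with one sequence name per line.

```python
from pathlib import Path


def fasta_names(path):
    """Return the sequence names (first word of each '>' header) in a FASTA file."""
    names = []
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                names.append(line[1:].split()[0])
    return names


def build_gentrome(target_fastas, genome_fasta, out_fasta, out_decoys):
    """Write the target FASTAs first, then the genome, and list genome names as decoys.

    Hypothetical helper: target_fastas would be the spliced and unspliced
    transcript FASTAs, genome_fasta the genome used only as a decoy.
    """
    with open(out_fasta, "w") as out:
        # Targets go first; the genome (decoy) entries must be last.
        for path in list(target_fastas) + [genome_fasta]:
            with open(path) as handle:
                for line in handle:
                    # Guard against a file that lacks a trailing newline.
                    out.write(line if line.endswith("\n") else line + "\n")
    # Only the genome entries are marked as decoys.
    decoys = fasta_names(genome_fasta)
    Path(out_decoys).write_text("\n".join(decoys) + "\n")
    return decoys
```

Calling something like `build_gentrome(["spliced.fa", "unspliced.fa"], "genome.fa", "gentrome.fa", "decoys.txt")` would produce the two files to hand to the indexing step. For a full human genome it is probably simpler to do the same thing with `cat` and `grep` in the shell; the sketch is only meant to show the ordering and what the decoy list contains.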
I'm afraid it will probably take longer, because much more sequence is included in the annotation (exons make up only a small percentage of the total length of transcripts).
Oh, I think that's exactly what I did, but it is still running on my HPC.
First, you shouldn't hog 200 GB of memory on the HPC. You need less than one-tenth of that.
Second, increasing the number of threads will improve runtime.
Third, it appears you're indexing the genome. You should be indexing the targets (e.g. each unspliced transcript and each spliced transcript gets its own FASTA entry).
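As a rough illustration of the second and third points, here is a small Python sketch that builds the salmon command lines with an explicit thread count and with the targets FASTA (not the genome) as the thing being indexed. The helper names, file names and default thread count are made up for the example, and the quant command assumes paired-end reads, which the thread does not actually state. The flags used (`-t`, `-i`, `-d`, `-p`, `-l A`, `-1`, `-2`, `-o`) are standard salmon options.

```python
def salmon_index_cmd(targets_fasta, index_dir, threads=8, decoys=None):
    """Build a `salmon index` command for a FASTA of targets.

    targets_fasta should hold one entry per spliced and per unspliced
    transcript; decoys is an optional file of decoy sequence names.
    """
    cmd = ["salmon", "index",
           "-t", str(targets_fasta),
           "-i", str(index_dir),
           "-p", str(threads)]
    if decoys is not None:
        cmd += ["-d", str(decoys)]
    return cmd


def salmon_quant_cmd(index_dir, reads1, reads2, out_dir, threads=8):
    """Build a `salmon quant` command for one paired-end sample (assumed layout)."""
    return ["salmon", "quant",
            "-i", str(index_dir),
            "-l", "A",  # let salmon infer the library type
            "-1", str(reads1),
            "-2", str(reads2),
            "-p", str(threads),
            "-o", str(out_dir)]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)`, or joined with spaces and pasted into a job script. Whatever thread count is used should match the number of cores requested from the scheduler.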
I think the command I attached wasn't clear. I used grl <- eisaR::getFeatureRanges (intron,spliced)
Then I used GRCh38_expanded.fa and the actual GRCh38_primary_assembly.fa. This is what sudbery suggested, right? Or am I making a mistake here? But you are suggesting that I create spliced.fa and unspliced.fa, then index them separately, followed by quantification.
Yes, create them and put spliced.fa and unspliced.fa into one FASTA file. You shouldn't be indexing genome.fa at all (you're "mapping" against "targets", not "aligning" to the "genome").
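Once quantification against the combined targets has finished, the proportion asked about in the original question could be pulled out of salmon's `quant.sf` along these lines. This is only a sketch: it assumes the unspliced (or intronic) entries can be recognised by a suffix on their names, and the default `"-I"` used here is an assumption that has to be checked against the names actually written to the FASTA. The function name is also made up. The `Name` and `NumReads` columns are part of salmon's standard tab-separated `quant.sf` output.

```python
import csv


def unspliced_fraction(quant_sf, suffix="-I"):
    """Return the fraction of estimated reads assigned to unspliced targets.

    suffix is an ASSUMED naming convention: targets whose name ends with
    it are counted as unspliced, everything else as spliced.
    """
    spliced = 0.0
    unspliced = 0.0
    with open(quant_sf, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            reads = float(row["NumReads"])
            if row["Name"].endswith(suffix):
                unspliced += reads
            else:
                spliced += reads
    total = spliced + unspliced
    # Avoid dividing by zero when nothing was quantified.
    return unspliced / total if total > 0 else float("nan")
```

This gives one overall number per sample. A per-gene or per-transcript breakdown would need the mapping between each spliced entry and its unspliced counterpart, which depends on how the annotation was generated.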
Okay, I will do that. Thanks for the answer.