Hi,
I would like to deduplicate the transcriptome output BAM file (bulk RNA-seq) from STAR using umi_tools. I am using the dedup
function from umi_tools. Here is the command:
umi_tools dedup --paired --stdin=B1-Cond1_Aligned.toTranscriptome.out.bam --log=B1-Cond1_dedup.txt --umi-separator=":" --output-stats=B1-Cond1_ > .dedup.bam
However, I am getting the following error:
ValueError: fetch called on bamfile without index
I don't think we can index the transcriptome BAM file from STAR. I read the umi_tools documentation and didn't find many options for bulk RNA-seq libraries. Do you know the best way to dedup using umi_tools? Thanks!
How about deduping based on the genomic BAM, then filtering the transcriptome BAM to keep only the reads that are left over? It may be some extra work, but it might be faster if you can't figure out how to index/dedupe the transcriptome BAM directly.
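To illustrate the filtering step, here is a minimal sketch in plain Python. It assumes both alignments are available as SAM text lines (for example from `samtools view -h`) and that the read names are identical in the genomic and transcriptome files. The function names `read_names` and `filter_by_names` and the toy records are made up for this example; real BAM files would need a library such as pysam or a conversion through samtools.

```python
def read_names(sam_lines):
    """Collect read names (QNAME, first column) from SAM records, skipping '@' header lines."""
    names = set()
    for line in sam_lines:
        if line.startswith("@"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def filter_by_names(sam_lines, keep):
    """Yield header lines plus every record whose read name is in `keep`."""
    for line in sam_lines:
        if line.startswith("@") or line.split("\t", 1)[0] in keep:
            yield line


# Toy input (hypothetical): the genomic SAM after deduplication.
genome_dedup = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1:AAAA\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tFFFF",
    "r3:CCCC\t0\tchr1\t200\t255\t4M\t*\t0\t0\tACGT\tFFFF",
]

# Toy input (hypothetical): the transcriptome SAM before filtering.
# r2 is a duplicate that was dropped from the genomic file.
transcriptome = [
    "@SQ\tSN:tx1\tLN:1000",
    "r1:AAAA\t0\ttx1\t10\t255\t4M\t*\t0\t0\tACGT\tFFFF",
    "r2:AAAA\t0\ttx1\t10\t255\t4M\t*\t0\t0\tACGT\tFFFF",
    "r3:CCCC\t0\ttx2\t20\t255\t4M\t*\t0\t0\tACGT\tFFFF",
    "r3:CCCC\t256\ttx3\t30\t255\t4M\t*\t0\t0\tACGT\tFFFF",
]

kept = list(filter_by_names(transcriptome, read_names(genome_dedup)))
for record in kept:
    print(record.split("\t")[0:3])
```

Because the filter works on read names, both mates of a pair and all transcript alignments of a surviving read are kept together. Recent samtools versions also have a read-name-file option for `samtools view` (`-N`), which may be simpler in practice; check the documentation for your version.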
Yes, you are right. I plan to do this. Thanks!
Very helpful, thanks a lot!