Hi All,
I have paired-end bulk RNA-seq data generated with UMIs in order to reduce PCR duplicates. Until now, I have used my own pipeline with STAR + UMI_tools to handle the UMIs and generate a deduplicated BAM file, but I want to know whether kallisto is able to deal with this data. I have three FASTQs: left and right paired-end FASTQs and one FASTQ for the UMIs.
I ran kallisto pseudo, first generating my batch file:
#id umi file1 file2
sample UMI_001.fastq.gz B_L001_R1_001.fastq.gz B_R2_001.fastq.gz
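For reference, a batch file with this layout can be written from the shell. This is only a minimal sketch using the file names from my run; the column order (id, UMI file, read 1, read 2) is simply what I tried, not something I have confirmed kallisto accepts.

```shell
# Sketch: write a kallisto batch file with one sample per line.
# Column order mirrors the header above: id, UMI file, read 1, read 2
# (this layout is what was attempted here, not a confirmed kallisto format).
batch=batch.txt
printf '#id umi file1 file2\n' > "$batch"
printf '%s %s %s %s\n' \
    sample UMI_001.fastq.gz B_L001_R1_001.fastq.gz B_R2_001.fastq.gz >> "$batch"
cat "$batch"
```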
and then running kallisto like this:
kallisto pseudo --index=/home/Genomes/Transcriptome_g1k_v37_kallisto_index -o kallisto_output -b batch.txt --umi -t 20 2>&1 | tee output_log.txt
However, it seems that kallisto with the --umi option can only handle single-end reads.
Am I right, or did I miss something? Any ideas would be highly appreciated.
Thanks in advance
When using kallisto pseudo --umi, kallisto considers the first file as UMIs:
Exactly, that's what I read, and it is the same as what I put in batch.txt, as you can see in my post. The point (what I was asking) is whether kallisto can only deal with UMIs in single-end mode (i.e. first file UMIs and second file FASTQ).
kallisto bus can deal with several UMI schemes, but I don't know if it can deal with your particular case - I never used kallisto bus.
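One thing that may be worth checking, separately from the paired-end question: as far as I recall from the kallisto manual, the UMI file given with --umi is a plain text file with one UMI sequence per line (line i belonging to read i), not a FASTQ. If that is right, the UMI FASTQ would need to be flattened first. A minimal sketch, assuming standard 4-line FASTQ records; the function name and the output file name are made up for illustration:

```shell
# Sketch: flatten a gzipped UMI FASTQ into one UMI sequence per line.
# Assumes standard 4-line FASTQ records (sequence lines are not wrapped).
# fastq_to_umi_list is a hypothetical helper, not part of kallisto.
fastq_to_umi_list() {
    # $1 = gzipped UMI FASTQ, $2 = output text file
    gzip -dc "$1" | awk 'NR % 4 == 2' > "$2"
}

# Example call (input name from the post, output name is hypothetical):
# fastq_to_umi_list UMI_001.fastq.gz UMI_001.umi.txt
```

The order of the lines is preserved, so the i-th UMI still corresponds to the i-th read in the read FASTQ.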