Question

Kallisto alignment method for single nuclei

4.0 years ago

Shawn ▴ 20

Could someone help clarify how kallisto handles reads from intronic regions?

I have come across the following statements and am unsure which of them is correct:

"The pseudoalignment-to-transcriptome algorithms force intronic reads to map to spurious genes, resulting in hundreds false positive genes in each cell. "

"What you experience is an outcome of the way that kmer-based pseudoalignment works. A read is k-compatible with a target if all of the mappable k-mers from a read occur in that target. When you add the intergenic sequences then there might be k-mers that were not originally mappable, but now become mappable to the new intergenic sequences"

"In order to know which reads come from spliced as opposed to unspliced transcripts, we need to see whether the reads contain intronic sequences. Thus we need to include intronic sequences in the kallisto index"

I am trying to decide between STARsolo and kallisto. I will probably run both and compare the results. However, I am working with single-nucleus data and want to make sure I clearly understand how intronic reads are handled.

Additionally, I would appreciate any information on what kinds of transcripts to expect from nuclei versus, say, the endoplasmic reticulum, and how the two differ in antisense reads and lncRNA. More generally, an overview of what can and cannot be detected in single-nucleus versus single-cell samples, and why those differences matter, would be very helpful.

Kallisto • 1.5k views

updated 4.0 years ago by dsull ★ 7.6k • written 4.0 years ago by Shawn ▴ 20

Yes, these quotes are from the recent STARsolo preprint. Why don't you simply use either STARsolo itself, or Alevin with a combined exonic and intronic index and the full genome as a decoy? That should take much better care of these spurious mappings.

4.0 years ago by ATpoint 88k

score 2 · Answer 1 · 2021-06-09

With a standard transcriptome index (spliced transcripts only), kallisto doesn't pseudoalign to introns, because intronic sequences are not in the index. So yes, if you have intronic reads from unspliced transcripts, some of those reads may erroneously pseudoalign to an incorrect transcript in the transcriptome index.

To account for this, you may want to add introns to your kallisto index. A great and easy tutorial on how to do so, designed specifically for your type of data (single-nucleus RNA-seq), is here: https://www.kallistobus.tools/tutorials/kb_nucleus/python/kb_single_nucleus/
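
To make the effect concrete, here is a toy sketch in plain Python. It is not kallisto's actual implementation (kallisto uses a de Bruijn graph and k = 31 by default); the sequences, the names `geneA_mRNA`, `geneB_mRNA`, `geneA_intron`, and the tiny k = 5 are all made up for illustration. It only mimics the idea that a read is assigned to the targets shared by all of its mappable k-mers, and that k-mers missing from the index are skipped.

```python
K = 5  # toy value; real indices use much longer k-mers


def kmers(seq, k):
    """Return the set of all k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(targets, k):
    """Map each k-mer to the set of target names that contain it."""
    index = {}
    for name, seq in targets.items():
        for km in kmers(seq, k):
            index.setdefault(km, set()).add(name)
    return index


def pseudoalign(read, index, k):
    """Intersect the target sets of the read's mappable k-mers.

    K-mers that are absent from the index are skipped, which is
    what lets an intronic read land on an unrelated transcript.
    """
    hits = [index[km] for km in kmers(read, k) if km in index]
    return set.intersection(*hits) if hits else set()


# Hypothetical sequences: the geneA intron shares a short stretch
# (TTGCAAG) with the geneB mRNA.
transcriptome = {
    "geneA_mRNA": "GGATCCATGCGTA",
    "geneB_mRNA": "TTTGCAAGCCCGA",
}
introns = {"geneA_intron": "ACGTTGCAAGGCT"}

# A read that truly comes from the geneA intron (unspliced pre-mRNA).
read = "GTTGCAAGG"

without_introns = pseudoalign(read, build_index(transcriptome, K), K)
with_introns = pseudoalign(read, build_index({**transcriptome, **introns}, K), K)

print(sorted(without_introns))  # ['geneB_mRNA']
print(sorted(with_introns))     # ['geneA_intron']
```

With the transcriptome-only index, only three of the read's five k-mers are mappable, and all three point to `geneB_mRNA`, so the intronic read is assigned to the wrong gene. Once the intron is in the index, every k-mer is mappable and the intersection leaves only `geneA_intron`. With realistic k-mer lengths such chance matches are much rarer than in this toy, but the mechanism is the same.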