Question

Quantseq 3' seq alignment with kallisto

0

2.1 years ago

MANUEL • 0

Hi All,

In my current work, I am running kallisto on QuantSeq 3' RNA-seq data (read length 55 bp) from melanoma samples. Because QuantSeq 3' sequencing generates one read per transcript, I expected the reads for my gene to be assigned to a single transcript, the canonical one (here, the longest). Instead, the canonical transcript shows no abundance in the output, while a few other transcripts of the gene do. Could anyone explain whether kallisto can be used in my case and, if so, why I am not getting any abundance for the canonical (longest) transcript?

Quantseq kallisto • 1.9k views

ADD COMMENT • link updated 2.1 years ago by swbarnes2 14k • written 2.1 years ago by MANUEL • 0

0

Why would you make such an assumption? Why would you want to?

ADD REPLY • link 2.1 years ago by swbarnes2 14k

0

As I mentioned, I assume that the longest isoform (here, the canonical transcript) contains the sequences of the alternative transcripts. So in the kallisto quantification I would expect to see abundance for the longest transcript, since it also covers the sequences of the alternative transcripts. I want only the canonical transcript because I want to quantify abundance at the gene level, not at the transcript level.

ADD REPLY • link 2.1 years ago by MANUEL • 0

0

I don't think that's a fair assumption at all.

ADD REPLY • link 2.1 years ago by swbarnes2 14k

0

I see. Can you explain why it is not fair to take the canonical transcripts?

ADD REPLY • link 2.1 years ago by MANUEL • 0

1

You should read how kallisto assigns reads. You should also realize that the longest transcript isoform doesn't necessarily include all other exons in a given gene.

ADD REPLY • link 2.1 years ago by benformatics 4.0k

score 0 · Answer 1 · 2022-10-19

2.1 years ago

benformatics 4.0k

How do you know the longest isoform is even expressed in your cells? IIRC, any transcript with a poly-A tail should be captured by QuantSeq; your reads are just limited to the 3' end.

So you could have multiple isoforms expressed, and you would capture them all. If you are using the full transcripts, you also run the risk of reads being pseudoaligned to internal exons when they happen to coincide with a short alternative isoform.

You should probably create a 3'-based index (or at least that's what I would do if there are no special options for 3'-biased reads).

See index section:

https://pachterlab.github.io/kallisto/manual
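One way a 3'-based index could be prepared is sketched below: trim every transcript in the FASTA to its last N bases, then build the index from the trimmed file. This is only an illustration under stated assumptions, not a QuantSeq-specific recipe. It assumes the transcript FASTA is written in transcript orientation (5' to 3'), and the 500-base window, the function names, and the file names in the comments are all hypothetical choices. Isoforms that share a 3' end will become identical after trimming, which matters less if the goal is gene-level counts.

```python
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def trim_to_three_prime(records: Iterable[Record], window: int = 500) -> Iterator[Record]:
    """Keep only the last `window` bases of each transcript.

    Assumes sequences are in transcript orientation, so the end of the
    string is the 3' end. Transcripts shorter than `window` are kept whole.
    The default of 500 is an arbitrary illustrative value.
    """
    if window <= 0:
        raise ValueError("window must be a positive number of bases")
    for header, seq in records:
        yield header, seq[-window:]


def format_fasta(records: Iterable[Record], width: int = 60) -> Iterator[str]:
    """Yield FASTA lines, wrapping sequences at `width` characters."""
    for header, seq in records:
        yield ">" + header
        for start in range(0, len(seq), width):
            yield seq[start:start + width]


# Hypothetical usage (file names are placeholders):
# with open("transcripts.fa") as src, open("transcripts_3prime.fa", "w") as dst:
#     for out_line in format_fasta(trim_to_three_prime(read_fasta(src), window=500)):
#         dst.write(out_line + "\n")
# The trimmed file could then be passed to `kallisto index` as usual.
```

The window should be long enough to cover where the reads actually land upstream of the poly-A site, so it is worth checking read positions on a few genes before settling on a value.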

0

Thank you for the reply.

However, my doubt is whether the longest isoform needs to be expressed in the cell at all, if it contains the sequences of the alternative transcripts. The internal exons should also contain enough poly-A sequence for them to be sequenced too.

Did you mean creating a 3'-based index from only the 3' UTR sequences? If so, I may miss reads that align to exons, since QuantSeq starts sequencing close to the 3' end and the reads could include exons upstream of the 3' UTR.

ADD REPLY • link 2.1 years ago by MANUEL • 0

score 0 · Answer 2 · 2022-10-21

2.1 years ago

swbarnes2 14k

With a 3' sequencing system, I don't think you should be correcting for transcript length anyway. Every transcript, no matter how long, has only one 3' end.
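For gene-level numbers without the length normalization that TPM applies, one option is to sum the `est_counts` column of kallisto's `abundance.tsv` across all transcripts of each gene. The sketch below assumes the standard `abundance.tsv` columns (`target_id`, `length`, `eff_length`, `est_counts`, `tpm`); the `tx2gene` mapping, the function name, and the IDs in the usage comment are hypothetical and would have to come from your own annotation.

```python
import csv
from collections import defaultdict
from typing import Dict, Iterable


def gene_level_counts(abundance_lines: Iterable[str], tx2gene: Dict[str, str]) -> Dict[str, float]:
    """Sum kallisto est_counts over all transcripts of each gene.

    `abundance_lines` is an iterable of tab-separated lines including the
    header row (an open abundance.tsv file works). Transcripts missing from
    `tx2gene` are skipped.
    """
    totals: Dict[str, float] = defaultdict(float)
    for row in csv.DictReader(abundance_lines, delimiter="\t"):
        gene = tx2gene.get(row["target_id"])
        if gene is None:
            continue
        totals[gene] += float(row["est_counts"])
    return dict(totals)


# Hypothetical usage (path and IDs are placeholders):
# tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}
# with open("kallisto_out/abundance.tsv") as handle:
#     counts = gene_level_counts(handle, tx2gene)
```

Summing over isoforms also sidesteps the question of which transcript a read was assigned to, since every isoform's reads end up in the same gene total.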
