Kallisto: to scaledTPM or not to scaledTPM
4 months ago
IM • 0

Dear Community members,

I found several informative posts on the difference between scaledTPM, lengthScaledTPM, and no scaling, but I am unsure how these options apply to kallisto, given that kallisto's output is in TPM. So far, I have been using scaledTPM for my analysis in edgeR. This is how I import my counts into edgeR:

library(tximport)
data_ <- tximport(files, type = "kallisto", tx2gene = tx2gene, txOut = FALSE, ignoreAfterBar = TRUE, countsFromAbundance = "scaledTPM")

Any recommendations on the correct method for differential analysis, with an explanation, would be greatly appreciated. Thank you.

tximport kallisto EdgeR RNAseq • 1.2k views
4 months ago
Gordon Smyth ★ 7.8k

Do you want to do a gene-level or a transcript-level analysis in edgeR? To read transcript-level kallisto output into edgeR, simply use:

library(edgeR)
y <- catchKallisto(paths)

where paths is a character vector specifying the directories containing the kallisto output. See

Baldoni et al (2024). Dividing out quantification uncertainty allows efficient assessment of differential transcript expression with edgeR. Nucleic Acids Research 52(3), e13. https://doi.org/10.1093/nar/gkad1167

or Sections 2.18 and 4.6 of the edgeR User's Guide: https://doi.org/doi:10.18129/B9.bioc.edgeR


Hello Gordon Smyth,

Thank you for your reply. I am doing a gene-level analysis in edgeR.


I think the complete instructions are in the tximport vignette: https://bioconductor.org/packages/devel/bioc/vignettes/tximport/inst/doc/tximport.html


Hi professor,

Does that mean catchKallisto cannot be used for gene-level analyses?


I only intended catchKallisto for transcript-level analyses, although it is easy enough to convert it to gene-level by aggregating the expected counts over genes, and we did that in fact to obtain gene-level overdispersions for the Baldoni et al (2024) paper.
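As a rough illustration of "aggregating the expected counts over genes", here is a minimal base R sketch. The count matrix and the transcript-to-gene map are invented toy data; in practice the counts would presumably come from the `counts` component returned by catchKallisto, and the map from your own annotation. This is not the code used in Baldoni et al (2024), only one straightforward way to do the aggregation.

```r
# Toy transcript-level expected counts (rows = transcripts, columns = samples).
# All values are invented for illustration.
counts <- matrix(c(10, 5, 0, 20,    # sample s1
                   12, 7, 3, 18),   # sample s2
                 nrow = 4,
                 dimnames = list(c("tx1", "tx2", "tx3", "tx4"), c("s1", "s2")))

# Hypothetical transcript-to-gene map (column names are an assumption).
tx2gene <- data.frame(TXNAME = c("tx1", "tx2", "tx3", "tx4"),
                      GENEID = c("geneA", "geneA", "geneB", "geneB"))

# Look up the gene for each row of the count matrix.
gene_id <- tx2gene$GENEID[match(rownames(counts), tx2gene$TXNAME)]

# Sum the expected counts of all transcripts belonging to the same gene.
gene_counts <- rowsum(counts, group = gene_id)
print(gene_counts)
```

The resulting matrix has one row per gene and could then be passed to DGEList() like any other gene-level count matrix.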

For gene-level analyses, I personally prefer not to rely on transcript annotation at all, because my own (unpublished) research shows that transcript quantification tools are highly sensitive to incomplete or inaccurate annotation. I have plans to publish something on this, but it's not at the top of my queue yet.


That makes sense. Thanks!


Thank you for your reply, Gordon. A follow-up question: since lengthScaledTPM is calculated by multiplying TPM by feature length and then scaling up to the library size, downstream TMM normalization is not required, right?
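The arithmetic described in the question can be sketched in base R as follows. The matrices are invented toy values standing in for the abundance, length and counts matrices that tximport works with, and the calculation reflects my understanding of how tximport derives lengthScaledTPM (TPM times the across-sample average length, rescaled to the original column sums); treat it as an illustration rather than the package's exact code.

```r
# Toy gene-level inputs (rows = genes, columns = samples); values are invented.
tpm <- matrix(c(300000, 700000,    # sample s1
                500000, 500000),   # sample s2
              nrow = 2, dimnames = list(c("geneA", "geneB"), c("s1", "s2")))
len <- matrix(c(1000, 2000,
                1200, 2000),
              nrow = 2, dimnames = dimnames(tpm))
est_counts <- matrix(c(150, 700,
                       600, 1000),
                     nrow = 2, dimnames = dimnames(tpm))

# Library size = total estimated counts per sample.
lib_size <- colSums(est_counts)

# Step 1: multiply TPM by the average length of each gene across samples.
length_scaled <- tpm * rowMeans(len)

# Step 2: rescale each column so that it sums to the original library size.
length_scaled <- sweep(length_scaled, 2, lib_size / colSums(length_scaled), "*")

print(colSums(length_scaled))  # matches lib_size by construction
```

By construction the column sums equal the original library sizes, so this scaling accounts for sequencing depth and feature length only. It does not by itself address composition differences between samples, which is what TMM is designed to handle.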


If you use tximport to read kallisto output into edgeR, then you must follow the instructions in the tximport vignette. The appropriate section of the vignette shows you explicitly how to combine edgeR TMM normalization factors with the tximport normalization.
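For orientation, the pattern in the edgeR section of the tximport vignette looks roughly like the sketch below. It is written from my reading of the vignette and uses simulated toy matrices in place of `txi$counts` and `txi$length`, so please check the vignette itself for the authoritative version. As I understand it, this offset approach applies when tximport is run with countsFromAbundance = "no"; with "scaledTPM" or "lengthScaledTPM" the vignette instead describes passing the counts directly to DGEList() and normalizing as usual.

```r
# Simulated stand-ins for txi$counts and txi$length (toy data, not real output).
set.seed(1)
n_genes <- 50
cts <- matrix(rpois(n_genes * 3, lambda = 100), nrow = n_genes,
              dimnames = list(paste0("gene", seq_len(n_genes)), paste0("s", 1:3)))
len <- matrix(runif(n_genes * 3, min = 500, max = 3000), nrow = n_genes,
              dimnames = dimnames(cts))

# Per-observation length factors, centred so that each gene's geometric
# mean across samples is 1 (this keeps the magnitude of the counts unchanged).
normMat <- len / exp(rowMeans(log(len)))
normCts <- cts / normMat

# The remaining steps need edgeR; they are skipped if it is not installed.
if (requireNamespace("edgeR", quietly = TRUE)) {
  # Effective library sizes from the length-adjusted counts (TMM factors
  # times column sums), to account for composition differences.
  eff.lib <- edgeR::calcNormFactors(normCts) * colSums(normCts)

  # Combine effective library sizes with the length factors and take logs
  # to obtain an offset matrix for a log-link GLM.
  offsets <- log(sweep(normMat, 2, eff.lib, "*"))

  # Build the DGEList from the original counts and attach the offsets.
  y <- edgeR::DGEList(cts)
  y <- edgeR::scaleOffset(y, offsets)
}
```

The point of the sketch is that the TMM factors are not dropped: they enter through the effective library sizes, which are then combined with the length factors in a single offset matrix.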
