Kallisto and downstream analysis with tximport and DESeq2
5.7 years ago
Mozart ▴ 330

Hello there, I am trying to analyse a dataset using kallisto and the abundances it generates. I am importing them with tximport and then want to use TPM counts. When I open the txi.kallisto.tsv files, I essentially have three different columns (i.e. 'abundance', 'counts' and 'length'). I am not sure whether the counts that tximport is meant to import are normalized or not (i.e. are they TPM counts or not)?

Thanks in advance.

RNA-Seq kallisto tximport deseq2 • 8.9k views

As an aside, you should not use normalized counts with DESeq2. It expects unnormalized, raw counts.
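To make the difference between the 'counts' and 'abundance' columns concrete, here is a minimal sketch of how a TPM value relates to an estimated count and an effective length. The function name and the toy numbers are made up for illustration; this is plain arithmetic, not code from kallisto or tximport.

```python
def tpm_from_counts(counts, eff_lengths):
    """Convert estimated counts to TPM using effective transcript lengths."""
    # Reads per base of effective length for each transcript
    rates = [c / l for c, l in zip(counts, eff_lengths)]
    total = sum(rates)
    # Rescale so that the values sum to one million within the sample
    return [r / total * 1e6 for r in rates]

# Hypothetical toy sample with three transcripts
counts = [100.0, 100.0, 300.0]
eff_lengths = [1000.0, 2000.0, 1500.0]
tpm = tpm_from_counts(counts, eff_lengths)
```

The first two transcripts have the same count but different TPM, because TPM is normalized for length and for sequencing depth. That is why the 'counts' column, not 'abundance', is what a count-based method such as DESeq2 should start from.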


Thanks! So, when I am using tximport, does the DESeqDataSetFromTximport function automatically correct for the length bias? I guess so; in fact, the tximport documentation says:

Note: there are two suggested ways of importing estimates for use with differential gene expression (DGE) methods. The first method, which we show below for edgeR and for DESeq2, is to use the gene-level estimated counts from the quantification tools, and additionally to use the transcript-level abundance estimates to calculate a gene-level offset that corrects for changes to the average transcript length across samples. The code examples below accomplish these steps for you, keeping track of appropriate matrices and calculating these offsets. For edgeR you need to assign a matrix to y$offset, but the function DESeqDataSetFromTximport takes care of creation of the offset for you. Let’s call this method “original counts and offset”.

but if someone could confirm this, that would be great.

5.7 years ago
ATpoint 86k

For use with DESeq2, just follow the tximport manual section for kallisto, but set txOut=FALSE to aggregate transcript abundances to the gene level. From what I understand, the countsFromAbundance="scaledTPM" argument is only needed to output a count matrix when you want to use it for something other than DESeq2, so it is not necessary in this case.
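For intuition, this is roughly what "scaledTPM" means as I read the tximport documentation: the abundances of one sample are scaled up so that they sum to that sample's library size. The sketch below is only the arithmetic for a single sample, with a hypothetical function name and toy numbers; it is not tximport's actual code.

```python
def scaled_tpm(abundance, counts):
    """Scale one sample's abundances (TPM) up to that sample's library size."""
    library_size = sum(counts)          # total estimated counts in the sample
    total_abundance = sum(abundance)    # 1e6 for TPM, but computed to be safe
    return [a / total_abundance * library_size for a in abundance]

# Hypothetical toy sample with three transcripts
abundance = [250000.0, 250000.0, 500000.0]
counts = [100.0, 300.0, 600.0]
scaled = scaled_tpm(abundance, counts)
```

The scaled values sum to the same total as the original counts, but each value now follows the abundance rather than the estimated count, so the length information is already folded in and no separate offset is needed.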

DESeqDataSetFromTximport function automatically correct for the length bias

Yes, that is the whole point of this method. The length bias of interest here is the one caused by different transcript/isoform usage between the conditions, which is corrected by passing an offset to DESeq2 for the linear model.
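A small sketch of why isoform usage matters: the average transcript length of a gene can be taken as the abundance-weighted mean of its isoform lengths, so it shifts when a sample switches isoforms. The function name and numbers are hypothetical, and this only illustrates the idea behind the offset, not what DESeq2 does internally.

```python
def average_transcript_length(abundances, lengths):
    """Abundance-weighted mean length of a gene's isoforms in one sample."""
    weighted = sum(a * l for a, l in zip(abundances, lengths))
    return weighted / sum(abundances)

# One hypothetical gene with a short and a long isoform
isoform_lengths = [1000.0, 3000.0]

# Condition A mostly expresses the short isoform, condition B the long one
len_a = average_transcript_length([90.0, 10.0], isoform_lengths)
len_b = average_transcript_length([10.0, 90.0], isoform_lengths)

# With the same number of molecules, B would yield more reads per gene
length_ratio = len_b / len_a
```

Here the gene is effectively 1200 bases long in condition A and 2800 in condition B, so equal expression would still give more than twice as many reads in B. A per-gene, per-sample offset built from these lengths is what removes that artefact.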
