I have transcript abundances from kallisto run with 100 bootstraps. My understanding is that the bootstrapping gives information about the variability in the abundance estimate. If I use tximport to import this abundance information for use in DESeq2, is the variance information from bootstrapping used by DESeq2 in any way, or does DESeq2 calculate the variance in a different way?
I see in the tximport manual that there is a way to import the inferential replicate values by setting txOut=TRUE, and to use varReduce to summarize the inferential replicates into one variance value per transcript. But is this information used by DESeq2 in any way during the differential expression analysis?
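To make concrete what "one variance value per transcript" means, here is a minimal sketch in plain Python. It is not tximport's code: the function name reduce_bootstraps, the toy transcript IDs, and the counts are all made up for illustration, and it assumes the summary is the ordinary sample variance taken across the bootstrap replicates of each transcript.

```python
from statistics import mean, variance


def reduce_bootstraps(boot_counts):
    """Collapse bootstrap estimates to one mean and one variance per transcript.

    boot_counts maps a transcript ID to a list of estimated counts,
    one value per bootstrap (inferential) replicate.
    Hypothetical helper for illustration only.
    """
    summary = {}
    for tx, values in boot_counts.items():
        summary[tx] = {
            "mean": mean(values),
            # Sample variance (n - 1 denominator) across the bootstraps.
            "variance": variance(values),
        }
    return summary


# Toy data (invented): four bootstraps for two transcripts in one sample.
toy = {
    "tx_stable": [100.0, 101.0, 99.0, 100.0],
    "tx_ambiguous": [40.0, 160.0, 90.0, 110.0],
}

summary = reduce_bootstraps(toy)
for tx, stats in summary.items():
    print(tx, stats["mean"], stats["variance"])
```

Both toy transcripts have the same mean estimate, but the second one varies far more between bootstraps, which is the kind of quantification uncertainty the inferential replicates are meant to capture.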
Also, does RSEM perform any variance calculation for the estimated counts?
Background: I am trying to compare kallisto -> sleuth with featureCounts -> DESeq2. kallisto followed by sleuth shows no significantly differentially expressed genes (at transcript or gene level), while featureCounts -> DESeq2 shows several genes that are differentially expressed. To find out whether this is an effect of having the variance data, I wanted to try running the kallisto transcript abundances through DESeq2.
Thank you for replying. I saw that statement in the manual, but I thought it meant that the information would not be used for gene-level analysis and would still apply if I were to look at differential expression at the transcript level. What is the purpose of the varReduce argument when importing the data?
I understand they are completely different methods of analysis. Based on other experimental data (qPCR, microarray), we know that there are differentially expressed genes (the mutant is of a transcription factor), and the DEGs from DESeq2 are consistent with what we would expect. It is also quite odd that the PCA from the kallisto data showed poor separation of samples (particularly for one replicate), while the PCA plot from featureCounts + DESeq2 showed substantial separation of samples along one axis. I was wondering whether the bootstrapping was bringing out any underlying problems between the replicates. kallisto -> sleuth would be more convenient to use simply because of the speed of the analysis, and I was trying to see if it is comparable to DESeq2.
If one method does not agree with your expectations, which are confirmed by other methods, then do not use it, right? There are alternatives such as salmon if you want a lightweight quantifier. Salmon offers several handy features, such as GC and sequence bias correction, and it can now use decoy sequences and selective alignment to improve accuracy.
Please see the updated answers.