GSVA kernels: Gaussian or Poisson?
4.4 years ago
psm ▴ 130

Hi all, this question has indirectly come up several times. What is the best kcdf setting to use for GSVA analysis on non-log or non-variance normalized TPM data?

For GSVA analysis using RNAseq data, the GSVA manual states:

"We calculate now GSVA enrichment scores for these gene sets using first the microarray data and then the RNA-seq integer count data. Note that the only requirement to do the latter is to set the argument kcdf="Poisson" which is "Gaussian" by default. Note, however, that if our RNA-seq derived expression levels would be continuous, such as log-CPMs, log-RPKMs or log-TPMs, the default value of the kcdf argument should remain unchanged."

I assume that non-variance normalized TPM data should be treated using the "Poisson" argument. However, following length normalization, most TPM values end up as non-integers. I realize that this is the result of a linear transformation, so the underlying structure of the data is unchanged, but the manual seems to imply that the Gaussian setting may be appropriate for non-integer data, which would include non-variance normalized TPM.

I clearly don't understand the nuances of this setting, so I am wondering what other people's thoughts, suggestions, or explanations are on this topic. For now, I'm just applying log1p to my TPM data and using the Gaussian argument, which also runs much faster.
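To make the point about non-integer TPM and the log1p workaround concrete, here is a minimal sketch in Python with NumPy. Everything in it is an assumption made for illustration: the counts are simulated, and `counts_to_tpm` and `skewness` are hypothetical helpers, not part of GSVA. It only shows that length normalization produces non-integer values and that log1p pulls in the long right tail; it does not run GSVA itself.

```python
import numpy as np

def counts_to_tpm(counts, lengths_kb):
    """Convert a genes x samples count matrix to TPM (hypothetical helper)."""
    rate = counts / lengths_kb[:, None]       # reads per kilobase of transcript
    return rate / rate.sum(axis=0) * 1e6      # rescale so each sample sums to 1e6

def skewness(x):
    """Sample skewness (third standardized moment) over all values."""
    x = np.asarray(x, dtype=float).ravel()
    centered = x - x.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5

# Simulated data only: Poisson counts around log-normally distributed gene means.
rng = np.random.default_rng(0)
n_genes, n_samples = 2000, 6
gene_means = rng.lognormal(mean=4.0, sigma=1.5, size=n_genes)
counts = rng.poisson(gene_means[:, None], size=(n_genes, n_samples))
lengths_kb = rng.uniform(0.5, 10.0, size=n_genes)

tpm = counts_to_tpm(counts, lengths_kb)
log_tpm = np.log1p(tpm)                        # natural log of (1 + TPM)

print("fraction of non-integer TPM values:", np.mean(tpm != np.round(tpm)))
print("skewness of TPM:", skewness(tpm))
print("skewness of log1p(TPM):", skewness(log_tpm))
```

Note that log1p uses the natural logarithm, whereas log-TPM values are often reported in base 2; the choice of base only rescales the values.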

4.4 years ago

It really does depend on the distribution of the input data, as you already appear to understand. The default is Gaussian because most datasets passed to GSVA have already been normalised and transformed so that they are approximately Gaussian. If I were using FPKM, RPKM, or just 'normalised' RNA-seq counts, on the other hand, I would use Poisson.

So, check a histogram and other summary metrics to see which distribution your data actually follows.
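As a rough illustration of that kind of check, the sketch below computes a histogram and a few summary metrics with NumPy. The function names, the `suggest_kcdf` rule, and its skewness threshold of 1.0 are all hypothetical choices made for this example; nothing here comes from GSVA, and the returned label is a prompt to look at the data rather than a definitive answer.

```python
import numpy as np

def summarize_distribution(values, bins=10):
    """Histogram plus basic summary metrics for a vector or matrix of values."""
    x = np.asarray(values, dtype=float).ravel()
    centered = x - x.mean()
    var = (centered ** 2).mean()
    skew = (centered ** 3).mean() / var ** 1.5 if var > 0 else 0.0
    hist, edges = np.histogram(x, bins=bins)
    return {
        "all_integer": bool(np.all(x == np.round(x))),
        "min": float(x.min()),
        "mean": float(x.mean()),
        "variance": float(var),
        "skewness": float(skew),
        "histogram": hist,
        "bin_edges": edges,
    }

def suggest_kcdf(summary, max_abs_skew=1.0):
    """Hypothetical heuristic, not part of GSVA; the threshold is arbitrary."""
    if summary["all_integer"] and summary["min"] >= 0:
        return "Poisson"      # non-negative integer counts
    if abs(summary["skewness"]) <= max_abs_skew:
        return "Gaussian"     # continuous and roughly symmetric
    return "unclear"          # continuous but heavily skewed: inspect further

# Simulated examples only.
rng = np.random.default_rng(1)
raw_counts = rng.poisson(5.0, size=5000)         # integer counts
log_like = rng.normal(5.0, 1.0, size=5000)       # log-scale-like values
skewed = rng.lognormal(3.0, 1.5, size=5000)      # untransformed, long right tail

for name, data in [("raw_counts", raw_counts), ("log_like", log_like), ("skewed", skewed)]:
    summary = summarize_distribution(data)
    print(name, round(summary["skewness"], 2), suggest_kcdf(summary))
```

The "unclear" branch is deliberate: untransformed TPM or FPKM values are continuous but strongly right-skewed, which is exactly the case the question is about, so the sketch flags it instead of guessing.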

Kevin

Thank you Kevin - appreciate the response.
