Fitting a distribution in R
5.4 years ago
ma23 ▴ 40

Hi everyone!

I have some data that describes gene expression in several people.

I want to find out whether the distribution of the data can be modeled with a Poisson or a negative binomial distribution.

For the Poisson I use the following commands:

n <- length(x)
lambda = mean(x) # I use the MLE for the Poisson parameter
f.hyp = dpois(x,lambda)*n
chiSquare.pois = sum((f.obs-f.hyp)^2/f.hyp)

Is this code correct?

How can I estimate the parameters of the negative binomial distribution and compare the two models (Poisson and negative binomial)?
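For reference, the snippet above never defines f.obs, and dpois(x, lambda) is evaluated once per observation rather than once per distinct count value. A chi-squared goodness-of-fit check usually compares observed and expected frequencies per bin. The following is only a minimal sketch under stated assumptions: x is simulated here as a stand-in for the real counts, and the cut-off k that pools the upper tail is a hypothetical choice, picked so that every expected count stays around 5 or more.

```r
set.seed(1)
x <- rpois(200, lambda = 3)   # simulated stand-in for the real count data (assumption)

n      <- length(x)
lambda <- mean(x)             # MLE of the Poisson rate

# Bins 0, 1, ..., k-1 and ">= k", so the bins cover the whole support
k     <- 7                    # hypothetical cut-off, adjust to your data
f.obs <- as.vector(table(factor(pmin(x, k), levels = 0:k)))

# Expected probabilities per bin under Poisson(lambda); the last bin is the upper tail
p.hyp <- c(dpois(0:(k - 1), lambda), ppois(k - 1, lambda, lower.tail = FALSE))
f.hyp <- n * p.hyp

chiSquare.pois <- sum((f.obs - f.hyp)^2 / f.hyp)

# Degrees of freedom: number of bins - 1 - one estimated parameter
df      <- length(f.obs) - 1 - 1
p.value <- pchisq(chiSquare.pois, df = df, lower.tail = FALSE)
```

A small p.value would suggest that the Poisson model fits poorly. Because lambda is estimated from the unbinned data, the chi-squared reference distribution is only approximate.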

R RNA-Seq gene expression Negative binomial • 2.1k views

I would probably start with the papers and source code of the established tools that model RNA-seq counts as negative binomial (NB), such as DESeq2 and edgeR, to get an impression of how and why they do it.


As pointed out, check previous work to understand why people chose a particular distribution for a given data type. Typically, when deciding which distribution best approximates the data, visual tools (e.g. density and QQ plots) and goodness-of-fit tests (e.g. the chi-squared test) are used. For choosing between (families of) distributions, have a look at the R package fitdistrplus and its descdist() function.
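To make the comparison concrete, here is a minimal sketch of fitting both distributions by maximum likelihood and comparing them by AIC. It uses fitdistr() from MASS rather than fitdistrplus, only because MASS ships with R; the data are simulated (an assumption), so replace x with the counts for one gene.

```r
library(MASS)   # fitdistr(); MASS is a recommended package bundled with R

set.seed(1)
x <- rnbinom(500, size = 2, mu = 10)   # simulated overdispersed counts (assumption)

# Maximum-likelihood fits
fit.pois <- fitdistr(x, "Poisson")             # estimates lambda
fit.nb   <- fitdistr(x, "negative binomial")   # estimates size (dispersion) and mu

fit.pois$estimate
fit.nb$estimate

# Quick overdispersion check: under a Poisson model, variance is close to the mean
c(mean = mean(x), variance = var(x))

# Lower AIC indicates the better trade-off between fit and number of parameters
AIC(fit.pois, fit.nb)
```

If the variance is much larger than the mean and the NB fit has a clearly lower AIC, the negative binomial is the better description. With fitdistrplus, fitdist() and gofstat() support a similar workflow.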
