Are logr ratios supposed to be normally distributed?
8.2 years ago
novice ★ 1.1k

In a CNV analysis workflow, do you expect the logr ratios along the genome to be from a normal (or approximately normal) distribution? Why?

statistics CNV • 1.6k views
8.2 years ago
Eric T. ★ 2.8k

The read depths are technically a mixture of Poisson or log-Poisson distributions (depending on whether PCR was done), I think, but they are usually modelled well enough with the negative binomial distribution. On the log scale the data are a bit more overdispersed than Poisson, and the second (dispersion) parameter in the density function of the negative binomial distribution can be used to account for that.
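
To make the overdispersion point concrete, here is a minimal simulation sketch, not a model of any real pipeline. It assumes the gamma-Poisson parametrization of the negative binomial (variance = mean + dispersion * mean^2); the mean depth of 50, the dispersion of 0.1, and all function names are illustrative choices of mine, not values from this thread.

```python
import math
import random
import statistics


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's method (fine for moderate lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_negative_binomial(rng, mean, dispersion):
    """Gamma-Poisson mixture with variance = mean + dispersion * mean**2."""
    shape = 1.0 / dispersion
    # Gamma rate with expectation `mean` (shape * scale = mean).
    rate = rng.gammavariate(shape, mean * dispersion)
    return sample_poisson(rng, rate)


def variance_to_mean(counts):
    """Index of dispersion: about 1 for Poisson, larger when overdispersed."""
    return statistics.variance(counts) / statistics.fmean(counts)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
mean_depth, dispersion, n_bins = 50.0, 0.1, 5000  # assumed, illustrative values

poisson_depths = [sample_poisson(rng, mean_depth) for _ in range(n_bins)]
nb_depths = [
    sample_negative_binomial(rng, mean_depth, dispersion) for _ in range(n_bins)
]

poisson_ratio = variance_to_mean(poisson_depths)
nb_ratio = variance_to_mean(nb_depths)
print(f"Poisson variance/mean:           {poisson_ratio:.2f}")
print(f"Negative binomial variance/mean: {nb_ratio:.2f}")
```

Under these assumptions the Poisson depths should have a variance-to-mean ratio near 1, while the negative binomial depths should sit near 1 + dispersion * mean, which is the extra spread the second parameter absorbs.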

It may be "normal enough" for your needs, with the understanding that there will be more "outliers" or extreme values than you'd expect to see under the normal distribution. Robust statistics (e.g. MAD or Tukey's biweight midvariance instead of standard deviation) can be used to insulate your results from these extreme values if you'd like to use statistical methods that assume normality.
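
As a rough illustration of how the robust estimators behave, the sketch below compares the standard deviation with a scaled MAD and the square root of the biweight midvariance on simulated log ratios. The data are made up (2000 normal values with spread 0.2 plus 40 extreme values), the tuning constant c = 9 is the conventional default rather than something stated here, and the function names are my own.

```python
import math
import random
import statistics


def mad(values):
    """Median absolute deviation from the median (unscaled)."""
    center = statistics.median(values)
    return statistics.median(abs(v - center) for v in values)


def biweight_midvariance(values, c=9.0):
    """Tukey's biweight midvariance; points beyond c * MAD get zero weight."""
    center = statistics.median(values)
    scale = mad(values)
    if scale == 0:
        return 0.0
    numerator = 0.0
    denominator = 0.0
    for v in values:
        u = (v - center) / (c * scale)
        if abs(u) < 1:
            numerator += (v - center) ** 2 * (1 - u * u) ** 4
            denominator += (1 - u * u) * (1 - 5 * u * u)
    return len(values) * numerator / denominator ** 2


rng = random.Random(1)  # fixed seed so the sketch is reproducible
# Assumed toy data: well-behaved bins plus a few extreme log ratios.
clean = [rng.gauss(0.0, 0.2) for _ in range(2000)]
extreme = [rng.choice((-1, 1)) * rng.uniform(2.0, 4.0) for _ in range(40)]
log_ratios = clean + extreme

sd = statistics.stdev(log_ratios)
mad_sd = 1.4826 * mad(log_ratios)  # 1.4826 makes MAD comparable to SD under normality
biweight_sd = math.sqrt(biweight_midvariance(log_ratios))

print(f"standard deviation:   {sd:.3f}")
print(f"scaled MAD:           {mad_sd:.3f}")
print(f"biweight midvariance: {biweight_sd:.3f} (square root)")
```

In this toy setup the standard deviation is inflated by the 40 extreme values, while both robust estimates should stay close to the 0.2 spread of the bulk of the data.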

8.2 years ago
ssv.bio ▴ 200

You can easily check by running a normality test such as Shapiro-Wilk.


I know. I also did a Q-Q plot. Both methods show that my data deviate considerably from normality. My question was whether that is what you would expect for logr ratios (i.e. should I be concerned). Sorry if I failed to make that clear in the original question.
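
For readers who want to see what such a deviation looks like numerically, here is a small sketch of the quantile pairs behind a Q-Q plot. It uses simulated heavy-tailed values (95% with spread 0.2, 5% with spread 0.8), which is an assumption for illustration and not the data discussed here; `qq_points` is a hypothetical helper name.

```python
import random
import statistics


def qq_points(values):
    """Pair each sorted observation with the matching standard-normal quantile.

    Observations are standardized with the median and scaled MAD so that
    extreme values do not distort the scale used for the comparison.
    """
    center = statistics.median(values)
    scale = 1.4826 * statistics.median(abs(v - center) for v in values)
    ordered = sorted(values)
    n = len(ordered)
    normal = statistics.NormalDist()
    return [
        (normal.inv_cdf((i + 0.5) / n), (x - center) / scale)
        for i, x in enumerate(ordered)
    ]


rng = random.Random(2)  # fixed seed so the sketch is reproducible
# Assumed toy data: mostly narrow noise with a minority of noisier bins.
log_ratios = [
    rng.gauss(0.0, 0.8) if rng.random() < 0.05 else rng.gauss(0.0, 0.2)
    for _ in range(3000)
]

points = qq_points(log_ratios)
n = len(points)
for label, index in [("lowest", 0), ("25%", n // 4), ("median", n // 2),
                     ("75%", 3 * n // 4), ("highest", n - 1)]:
    theoretical, observed = points[index]
    print(f"{label:>8}: theoretical {theoretical:+.2f}, observed {observed:+.2f}")
```

With data like these the middle of the distribution tracks the normal quantiles closely and only the tails pull away, which is the pattern of "normal in the bulk, too many extreme values" rather than a wholesale departure.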

8.2 years ago
moxu ▴ 510

Or, in R, draw a histogram of the underlying variable with "hist" to take a look.
