Question

Does the read count ratio follow a Gaussian distribution?

10.6 years ago

mangfu100 ▴ 810

Hi all.

I heard that most CNV detection tools are based on read depth and make a Gaussian assumption about the distribution of the read count ratio.

From this point of view, what do the x-axis and y-axis of the Gaussian distribution represent? I would guess that the y-axis is the read-depth count and the x-axis is the position within an exon. Is that right? If so, does each exon in exome sequencing follow a Gaussian distribution?

With that picture in mind, I cannot connect it to the passage below. Could you take a look and advise?

Most of the existing tools for CNV calling that are based on read depth, such as ExomeCNV and CNV-seq, make Gaussian assumptions about the distribution of read count ratio. In the absence of technical variability, the proportion of reads matching to a specific sample should follow a binomial distribution whose success rate is determined by genome-wide read count ratio between the test sample and reference set.

I understood the passage above to mean that a single sequenced sample follows a Gaussian distribution, while the sample-to-sample comparison follows a binomial distribution.

Is my understanding correct?

alignment next-gen-sequencing • 3.0k views

Say we have a binomial distribution B(n,p). If n is large and both np and n(1-p) are not small, it can be approximated by a Gaussian distribution N(np, np(1-p)), i.e. one with mean np and variance np(1-p). See the Wikipedia page on the binomial distribution.
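
To make the approximation concrete, here is a minimal sketch that compares the exact binomial probabilities with the Gaussian density N(np, np(1-p)) at every integer k. The function names and the two (n, p) pairs are illustrative assumptions, not values taken from any CNV tool.

```python
import math

def binom_pmf(k, n, p):
    """Exact binomial probability P(X = k), computed in log space for stability."""
    log_coeff = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_coeff + k * math.log(p) + (n - k) * math.log(1 - p))

def normal_pdf(x, mean, var):
    """Density of a Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def max_abs_gap(n, p):
    """Largest pointwise difference between B(n, p) and N(np, np(1-p))."""
    mean, var = n * p, n * p * (1 - p)
    return max(abs(binom_pmf(k, n, p) - normal_pdf(k, mean, var))
               for k in range(n + 1))

# Hypothetical settings: many reads with a balanced ratio vs. few reads with a skewed ratio.
gap_large = max_abs_gap(1000, 0.5)   # np and n(1-p) are both large
gap_small = max_abs_gap(10, 0.05)    # np = 0.5, so the condition is violated

print(f"n=1000, p=0.5 : max gap = {gap_large:.2e}")
print(f"n=10,   p=0.05: max gap = {gap_small:.2e}")
```

The gap should be tiny in the first case and clearly visible in the second, which is why the Gaussian assumption is less safe for regions with few reads.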

10.6 years ago by lh3 33k

Ram · Answer 1 · 2015-02-03

10.5 years ago

Chris Miller 22k

Choosing reads is actually a Poisson process, which, due to various technical and alignment biases, can be better represented with a negative binomial distribution (essentially an overdispersed Poisson). For a brief description of this and how it relates to read depth and CNV, see our write-up in this paper
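
As a rough illustration of what "overdispersed Poisson" means, the sketch below simulates counts from a plain Poisson and from a negative binomial built as a gamma-Poisson mixture, then compares their variance-to-mean ratios. The mean depth, the dispersion parameter `r`, the seed, and all function names are assumptions chosen for the example; they are not taken from the paper mentioned above.

```python
import math
import random
import statistics

def sample_poisson(rng, lam):
    """Draw one Poisson(lam) count with Knuth's multiplication method (moderate lam only)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def sample_neg_binomial(rng, mean, r):
    """Draw a negative binomial count as a Poisson whose rate is Gamma(shape=r, scale=mean/r)."""
    lam = rng.gammavariate(r, mean / r)
    return sample_poisson(rng, lam)

def dispersion_index(counts):
    """Variance-to-mean ratio: about 1 for Poisson, above 1 when overdispersed."""
    return statistics.variance(counts) / statistics.mean(counts)

rng = random.Random(0)            # fixed seed so the run is repeatable
mean_depth, r, n = 50.0, 5.0, 5000  # hypothetical values

poisson_counts = [sample_poisson(rng, mean_depth) for _ in range(n)]
nb_counts = [sample_neg_binomial(rng, mean_depth, r) for _ in range(n)]

poisson_index = dispersion_index(poisson_counts)
nb_index = dispersion_index(nb_counts)

print(f"Poisson variance/mean:           {poisson_index:.2f}")
print(f"Negative binomial variance/mean: {nb_index:.2f}")
```

In theory the negative binomial variance is mean + mean²/r, so with these assumed values the second ratio should come out near 1 + 50/5 = 11, while the Poisson ratio stays near 1.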