Can we do a statistical test of a region against the genome?
8.7 years ago
scchess ▴ 640

Let's say I measure the overall average coverage per base for my genome to be 30x. Now, I have a region spanning 100 bases that I want to compare with the genome. Let's say the average coverage per base in this region is 40x. I want to ask: "Is my region statistically different from the genome in terms of coverage per base?"

How should I approach this problem? Can I do a t-test?

dna genome • 1.8k views

Do you have multiple samples? Otherwise you are only comparing two numbers, and it is difficult to run any kind of statistical test on that.


I have the per-base genome coverage, like { 56, 67, 89 ... }, and the per-base coverage in my region, like { 67, 89, 90 ... }. The genome and my region can differ in size. My question is whether the average coverage in my region is different from the genome coverage.


So I do have the standard deviation and all the underlying data.


From the numbers you've provided, it seems the coverage values are average coverages over the region? If that is the case, you can indeed try a t-test on them. However, it is always safer to plot the distribution of your data first. If the values are not normally distributed, you might need to use something else, such as a non-parametric test.
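To make the suggestion concrete, here is a minimal sketch in Python using only the standard library. It computes Welch's t statistic (a t-test variant that does not assume equal variances or equal sample sizes) and, as a distribution-free alternative, a permutation p-value for the difference in means. The coverage values are simulated purely for illustration, and the function names `welch_t` and `permutation_p_value` are made up for this example. Note also that coverage at neighbouring bases is usually correlated, so the independence assumption behind both tests may not hold and the p-values should be read with caution.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    se2_a = statistics.variance(a) / len(a)  # squared standard error of mean(a)
    se2_b = statistics.variance(b) / len(b)  # squared standard error of mean(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


def permutation_p_value(a, b, n_perm=1000, seed=0):
    """Two-sided permutation p-value for the difference in means."""
    rng = random.Random(seed)
    n_a, n_b = len(a), len(b)
    observed = abs(sum(a) / n_a - sum(b) / n_b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        # Shuffle the pooled values and split them into two groups of the
        # original sizes, as if the region/genome labels were arbitrary.
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Simulated data (an assumption for illustration, not real coverage):
# 500 genome positions around 30x and a 100-base region around 40x.
rng = random.Random(42)
genome = [rng.gauss(30, 5) for _ in range(500)]
region = [rng.gauss(40, 5) for _ in range(100)]

t_stat, dof = welch_t(region, genome)
p_perm = permutation_p_value(region, genome)
print(f"Welch t = {t_stat:.2f}, df = {dof:.1f}, permutation p = {p_perm:.4f}")
```

If SciPy is available, `scipy.stats.ttest_ind(region, genome, equal_var=False)` runs the same Welch test and also returns a p-value, and `scipy.stats.mannwhitneyu(region, genome)` is a common rank-based alternative when the data are clearly not normal.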
