measure the distribution bias in genomic features
4.9 years ago
Hughie ▴ 30

Hi everyone,

I have recently been analyzing DNA methylation data and have run into a problem:

As we know, the distribution of DNA methylation varies across genomic features (core promoters, enhancers, CpG islands, etc.). I want to measure the distribution bias among these genomic features.
In other words, I want to quantify the deviation between the expected and observed number of DNA methylation sites in each feature.

I read some papers and found various methods used for this kind of analysis, for example the independent t-test, Chi-square test, Mann-Whitney U test, permutation test, etc., which left me confused about which one to choose.

I have tried the independent t-test and calculated ratio = log2(mean of observed / mean of expected) for plotting a heatmap (if the ratio is > 0, I say that DNA methylation occurs more often than expected in this region, and vice versa). However, someone told me that the Chi-square test may be better for measuring the difference between observed and expected, so I tried that too. The problem is that I only get one chi-square value per genomic feature, and these values vary enormously (from 300 to 40000000), which makes them difficult to visualize.
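A minimal sketch of the two quantities described above, to make the comparison concrete. It assumes count data: `observed` sites inside a feature, `total_sites` sites genome-wide, and an expected count proportional to `feature_fraction`, the fraction of the genome the feature covers. The function name, the parameters and the example numbers are hypothetical and are not taken from my actual data. The chi-square here is a two-cell goodness-of-fit statistic (inside vs. outside the feature), which is only one of several ways to set up the test.

```python
import math

def feature_enrichment(observed, total_sites, feature_fraction):
    """Return (log2 ratio, chi-square statistic) for one genomic feature.

    Assumption: the expected count is total_sites * feature_fraction,
    i.e. sites are spread uniformly over the genome under the null.
    """
    expected = total_sites * feature_fraction
    log2_ratio = math.log2(observed / expected)
    diff = observed - expected
    # Two cells: sites inside the feature and sites outside it.
    chi2 = diff ** 2 / expected + diff ** 2 / (total_sites - expected)
    return log2_ratio, chi2

# Hypothetical numbers: same 1.5-fold enrichment, ten times more sites.
small = feature_enrichment(1_500, 100_000, 0.01)
large = feature_enrichment(15_000, 1_000_000, 0.01)

print("small dataset: log2 ratio = %.3f, chi2 = %.1f" % small)
print("large dataset: log2 ratio = %.3f, chi2 = %.1f" % large)
```

In this toy setup the log2 ratio is the same for both datasets, while the chi-square statistic grows in proportion to the number of sites. That may be part of why the chi-square values span so many orders of magnitude across features of different size.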

So, I have several questions:

  1. Which method do you think is best for this kind of problem?
  2. If the Chi-square test is used, how should I handle the chi-square values for visualization (for example, normalize the chi-square of each region against that of a random region)?
  3. I noticed that the p-values are typically extremely small (often 10e-100, and sometimes reported as 0). I looked at some answers on how to handle very large datasets in statistical tests and found no clear conclusion. So, if you run a statistical test on a very large dataset (a sample size on the order of 10e6 is common in bioinformatics), how do you handle the very small p-values?
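To illustrate question 3, here is a small sketch of why a p-value can be reported as exactly 0 and how it could still be put on a log scale. It assumes a chi-square statistic with 1 degree of freedom, for which the upper-tail probability equals `erfc(sqrt(chi2 / 2))`. The helper name `log10_pvalue_chi2_df1` is hypothetical, and the large-value branch uses the standard leading terms of the asymptotic expansion of `erfc`, so it is an approximation and not an exact result.

```python
import math

def log10_pvalue_chi2_df1(chi2):
    """log10 of the upper-tail p-value of a chi-square statistic (1 df)."""
    p = math.erfc(math.sqrt(chi2 / 2.0))
    if p > 1e-300:
        # The p-value is still representable as a double.
        return math.log10(p)
    # Otherwise work in log space with the asymptotic form
    # erfc(z) ~ exp(-z^2) / (z * sqrt(pi)) * (1 - 1 / (2 z^2)), z^2 = chi2 / 2.
    ln_p = (-chi2 / 2.0
            - 0.5 * math.log(math.pi * chi2 / 2.0)
            + math.log1p(-1.0 / chi2))
    return ln_p / math.log(10)

# The two extremes mentioned in the post.
for chi2 in (300.0, 40_000_000.0):
    direct = math.erfc(math.sqrt(chi2 / 2.0))
    print(chi2, direct, log10_pvalue_chi2_df1(chi2))
```

For the larger statistic the direct computation underflows to 0.0 in double precision, while the log10 value stays finite and could be plotted, although at this scale it mostly reflects the sample size.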

Thanks for your time. I really appreciate any answers!

Statisticas • 783 views