Which strategy to analyse scores at multiple DNA positions across samples?
4.9 years ago

Hi,

I'm writing here today because I'm looking for a strategy to analyse my dataset.

Briefly, we have a WGBS dataset with methylation scores. In a subset of it, we're looking at the methylation scores of 20 positions, which are low (between 0 and 5%) across more than 50 samples. I'd like to know whether it is always the same samples that are the "most" methylated.

I have a dataset with three columns: position, sample, methylation_score

I don't really know how to carry on with the analysis. This is where I stopped:

- The scores are low (but probably have a functional role in what we are looking for).
- The distribution may not be homogeneous across samples and positions.
- I thought about a rank-based test (a rank correlation such as Spearman's), but I'm blocked by the fact that I have two qualitative variables: position and sample.
- I thought about PCA, but I only have one quantitative dimension.
- I tried Kruskal-Wallis, which gives me a significant p-value.
- Then I tried ranking all the scores and assigning each one a score based on the normalised rank of its methylation score, but I'm not really sure about this approach.
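To make the rank-based idea concrete, here is a minimal Python sketch, not a validated analysis. It assumes the table has exactly the three columns described (position, sample, methylation_score) with one score per position/sample pair and no missing values. The toy data, the per-sample shift used to generate it, and every variable name are hypothetical. Samples are ranked within each position, so positions with different baseline levels become comparable. The Friedman test and Kendall's W used to summarise the result are one possible choice that I have swapped in, and they may not be the most appropriate one for real WGBS data.

```python
import numpy as np
import pandas as pd
from scipy.stats import friedmanchisquare

# Hypothetical toy data in the long format described above.
rng = np.random.default_rng(0)
positions = [f"pos{i}" for i in range(1, 6)]
samples = [f"s{j}" for j in range(1, 9)]
sample_effect = np.linspace(0.0, 2.0, len(samples))  # assumed per-sample shift
rows = [
    (p, s, float(np.clip(1.5 + sample_effect[j] + rng.normal(0, 0.3), 0, 5)))
    for p in positions
    for j, s in enumerate(samples)
]
df = pd.DataFrame(rows, columns=["position", "sample", "methylation_score"])

# Wide matrix: one row per position, one column per sample.
wide = df.pivot(index="position", columns="sample", values="methylation_score")
n_pos, n_samp = wide.shape

# Rank the samples within each position (1 = least methylated).
ranks = wide.rank(axis=1, method="average")

# Mean normalised rank per sample: values near 1 mean "usually among the highest".
mean_norm_rank = (ranks.mean(axis=0) / n_samp).sort_values(ascending=False)

# Friedman test: do samples differ consistently once positions are treated as blocks?
# Each argument is one sample's scores across all positions.
stat, p_value = friedmanchisquare(*[wide[s].to_numpy() for s in wide.columns])

# Kendall's W (0 = no agreement between positions, 1 = identical ordering).
kendall_w = stat / (n_pos * (n_samp - 1))

print(mean_norm_rank)
print(f"Friedman chi2={stat:.2f}, p={p_value:.3g}, Kendall's W={kendall_w:.2f}")
```

A W close to 1 would suggest that the positions agree on which samples are the most methylated, and `mean_norm_rank` shows which samples those are. With real data, positions or samples with missing scores would need to be handled before pivoting, since the Friedman test expects a complete matrix.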

So, how would you set up a strategy to find out whether it's the same samples that tend to be amongst the "most methylated" at each position?

I hope I'm clear enough; otherwise, please tell me how I can refine what I'd like to achieve.

Best,

DNA WGBS methylation statistics • 643 views