Hello,
I would like to test for an association between binding sites (from a ChIP-seq experiment) and features of nearby genes (for example CpG regions), and then run a statistical test to check whether the association could have arisen by chance. I am using Galaxy.
I uploaded the binding-site data in BED format and retrieved the CpG regions from UCSC in the same format. As expected, the two datasets contain different numbers of lines.
Now I am stuck. I looked for correlation tools, but the ones offered in Galaxy work only on a single dataset. I run into the same difficulty with the plotting tools.
I also tried to generate a single dataset of the regions common to both features, but then, as expected, the two sets are linked by construction and a correlation computed on them is meaningless.
My questions:
1) Is this something that can be studied with Galaxy?
2) How would you proceed?
Thanks for any help.
Have you seen this question: "How do you calculate if two sets of genomic regions overlap significantly?" Is it similar to what you want?
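If it is, one common way to approach that kind of question is a permutation test: count how many binding sites overlap a CpG region, then repeatedly move the binding sites to random positions and see how often a random placement gives at least as many overlaps. Below is a minimal sketch in plain Python, outside Galaxy. The function names, the `(chrom, start, end)` tuple layout, and the `chrom_sizes` dictionary are illustrative assumptions, and the null model (uniform placement on the same chromosome, same length) is a simplification that ignores gaps, mappability, and gene density.

```python
import bisect
import random


def build_index(features):
    """Group features by chromosome, sort them, and merge overlapping ones."""
    by_chrom = {}
    for chrom, start, end in features:
        by_chrom.setdefault(chrom, []).append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = []
        for start, end in intervals:
            if merged and start <= merged[-1][1]:
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def overlaps(index, chrom, start, end):
    """True if the half-open interval [start, end) overlaps a feature."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    # Last merged feature that starts before `end`; merged features are
    # disjoint and sorted, so it is the only candidate that needs checking.
    i = bisect.bisect_left(starts, end) - 1
    return i >= 0 and ends[i] > start


def count_hits(index, peaks):
    """Number of peaks that overlap at least one feature."""
    return sum(overlaps(index, c, s, e) for c, s, e in peaks)


def permutation_test(peaks, features, chrom_sizes, n_perm=1000, seed=0):
    """Return (observed overlap count, empirical one-sided p-value).

    Assumed null model: each peak keeps its length and chromosome but is
    placed uniformly at random along that chromosome.
    """
    rng = random.Random(seed)
    index = build_index(features)
    observed = count_hits(index, peaks)
    at_least = 0
    for _ in range(n_perm):
        shuffled = []
        for chrom, start, end in peaks:
            length = end - start
            new_start = rng.randint(0, chrom_sizes[chrom] - length)
            shuffled.append((chrom, new_start, new_start + length))
        if count_hits(index, shuffled) >= observed:
            at_least += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (at_least + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    sizes = {"chr1": 1_000_000}
    cpg = [("chr1", i * 100_000, i * 100_000 + 1_000) for i in range(10)]
    sites = [("chr1", i * 100_000 + 400, i * 100_000 + 600) for i in range(10)]
    print(permutation_test(sites, cpg, sizes))
```

With real data you would read the two BED files into lists of tuples and take chromosome lengths from the matching genome build. The same idea can also be run with existing tools: bedtools provides `intersect` and `shuffle` for the counting and randomisation steps, and `fisher` for a related test. A low p-value here only says the overlap is unlikely under this particular random placement, so the choice of background matters.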