I have methylation data from the HumanMethylation450 (450k) array. I am trying to calculate the distance between each CpG site and the transcription start site (TSS) of nearby genes. I need every possible gene-CpG association, as well as the nearest gene for each CpG. Is there a fast way to do this?
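If your CpG sites are in BED format and you also have a list of TSSs in BED format, you can use bedtools to find the closest TSS for each CpG site, and vice versa. The `closest` subcommand does this on sorted input, and its `-d` option adds the distance as an extra column. For all gene-CpG associations rather than only the nearest one, the `window` subcommand reports every TSS within a fixed distance of each CpG. The bedtools documentation describes both the tool and the BED format.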
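If you would rather stay in a script, here is a minimal Python sketch of the same idea, not what bedtools does internally. It assumes you have already reduced your annotation to one TSS position per gene with a strand. The function names, the tuple layout, the example gene names and the 1500 bp default window are all my own placeholders, so adjust them to your data.

```python
from bisect import bisect_left
from collections import defaultdict


def build_index(tss_records):
    """Group TSSs by chromosome and sort them by position.

    tss_records: iterable of (chrom, pos, gene, strand) tuples
    (an assumed layout, not a standard format).
    """
    index = defaultdict(list)
    for chrom, pos, gene, strand in tss_records:
        index[chrom].append((pos, gene, strand))
    for chrom in index:
        index[chrom].sort()
    return index


def signed_distance(cpg_pos, tss_pos, strand):
    """Negative means upstream of the TSS, positive means downstream,
    both relative to the gene's strand."""
    d = cpg_pos - tss_pos
    return d if strand == "+" else -d


def nearest_tss(index, chrom, cpg_pos):
    """Return (gene, signed distance) for the closest TSS, or None."""
    entries = index.get(chrom, [])
    if not entries:
        return None
    # First TSS at or after the CpG; the nearest one is that or its left neighbour.
    i = bisect_left(entries, (cpg_pos,))
    candidates = entries[max(i - 1, 0):i + 1]
    pos, gene, strand = min(candidates, key=lambda e: abs(e[0] - cpg_pos))
    return gene, signed_distance(cpg_pos, pos, strand)


def tss_within(index, chrom, cpg_pos, window=1500):
    """Return (gene, signed distance) for every TSS within `window` bp."""
    entries = index.get(chrom, [])
    lo = bisect_left(entries, (cpg_pos - window,))
    hi = bisect_left(entries, (cpg_pos + window + 1,))
    return [(gene, signed_distance(cpg_pos, pos, strand))
            for pos, gene, strand in entries[lo:hi]]


# Toy data with made-up gene names, only to show the calls.
example_tss = [
    ("chr1", 1000, "GENE_A", "+"),
    ("chr1", 5000, "GENE_B", "-"),
    ("chr2", 300, "GENE_C", "+"),
]
example_index = build_index(example_tss)
print(nearest_tss(example_index, "chr1", 1200))
print(tss_within(example_index, "chr1", 1200, window=500))
```

Each lookup is a binary search on a sorted list, so the roughly 450k probes on the array should run quickly. For anything beyond single-point TSSs (for example overlapping gene bodies), bedtools is likely the safer choice.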
You could use Galaxy, which is a simple graphical interface for running bioinformatics tools. If your files contain chromosome, start and end coordinates, you can find the closest TSS to each CpG (using 'Fetch closest non-overlapping feature') and then calculate the distance (using the 'compute' tool).
I advise following along with the Galaxy101 tutorial to get yourself familiar with the layout.