Use R to Produce CpG Information
4 months ago
Dana • 0

Hello Everyone,

I hope all is well! I would like to create an R script to obtain the following information from 72 CpG sites:

  1. Chromosome
  2. Position
  3. Type of CpG site (e.g., island, shore, etc.)
  4. List of all SNPs +/- 500 base pairs for site
  5. Nearest gene(s)

Does anyone have any suggestions? Please let me know, as I would be most obliged!

Thank you,

Dana

R CpG • 684 views

How do I reference the specific CpG sites for which I need the information? I only see how to get information for an entire chromosome.

Please make your question reproducible. Add examples of the data you have right now, the format it is in, and how the output should look. "CpG sites" alone is not informative; we need to know how they are represented.

I have a list of 72 cg markers.

Reiterating a point that already exists in the question doesn't appear to fit what they are asking for. Here is a guide to making a minimal reproducible example, but the logic applies to asking non-programming questions too.

What data do you have? Are the methylation data in VCF format? Do you have the variant calling data, or do you need to get it from somewhere else? It is hard to make suggestions when all we know is that you have 72 CpG sites from an unknown species, with no information about supplementary data. Generally, though, if you have all the relevant VCFs, most of that information is relatively easy to get with a function like data.table::foverlaps or with the GenomicRanges package (GRanges objects).
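
As a rough sketch of the foverlaps idea, assuming the CpG coordinates and the SNP positions (e.g. the CHROM/POS/ID columns of a VCF) are already loaded as tables. All IDs, coordinates, and column names below are made up for illustration:

```r
library(data.table)

# Hypothetical CpG sites: IDs and coordinates are invented for this example
cpgs <- data.table(
  cpg_id  = c("cpg_A", "cpg_B"),
  chr     = c("chr1", "chr2"),
  cpg_pos = c(10000L, 50000L)
)

# Hypothetical SNPs, standing in for positions read from a VCF
snps <- data.table(
  snp_id  = c("snp_1", "snp_2", "snp_3", "snp_4"),
  chr     = c("chr1", "chr1", "chr2", "chr2"),
  snp_pos = c(9600L, 10700L, 50500L, 50000L)
)

# Turn each CpG into a closed +/- 500 bp window
cpgs[, `:=`(start = cpg_pos - 500L, end = cpg_pos + 500L)]

# Represent each SNP as a 1-bp interval
snps[, `:=`(snp_start = snp_pos, snp_end = snp_pos)]

# foverlaps() needs the second table keyed; the last two key columns are the interval
setkey(cpgs, chr, start, end)

# Keep only SNPs that fall inside a CpG window on the same chromosome
hits <- foverlaps(
  snps, cpgs,
  by.x    = c("chr", "snp_start", "snp_end"),
  type    = "within",
  nomatch = NULL
)

# One row per (CpG, SNP) pair
result <- hits[, .(cpg_id, chr, cpg_pos, snp_id, snp_pos)]
print(result)
```

The same pattern should work for the nearest-gene part if you swap the SNP table for a table of gene intervals, though "nearest" needs a distance calculation rather than a plain overlap.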

I just have human CpG site numbers. I wanted to use data from NCBI or UCSC.

Also, GRanges requires chromosome information to work.

Please use ADD REPLY and not the answer field. Posting replies in the answer field makes the thread messy.

Site numbers that include chromosome information? What format are the 72 sites in? And there is a lot of data on NCBI. What kind of data are you looking to use? Do you want to download raw reads, process them, and call SNPs? Please see the comment I left above about minimal reproducible examples.

The site numbers are the CpG IDs. They are from the Illumina 850K (EPIC) array.
