All CpG sites in human genome RRBS
6.1 years ago
Pin.Bioinf ▴ 340

Hello,

I was asked to find which CpGs are unique to the EPIC 850K methylation array, i.e. covered by the array but not observed by RRBS (reduced representation bisulfite sequencing). I have the EPIC 850K manifest, but is there a public site where I can download the coordinates of all CpGs detected by RRBS?

Do you think there will be CpGs that EPIC 850K covers that RRBS does not?

Thanks a lot for your help.

BSSeq RRBS Genome Human • 2.3k views
6.1 years ago

There won't be an exact list of CpGs covered by RRBS because it depends on the exact enzymes used and how tightly you perform size selection. I would propose that you perform the following procedure:

  1. Use Biopython to determine all possible fragments generated by the restriction enzymes you'll be using (there are some convenient functions for performing restriction digests on sequences in that package).
  2. Determine a rough range of sequenceable fragment sizes, which will likely be something like 75-500 bases.
  3. Choose a read length (N), because the results of all of this will be length-dependent.
  4. For each of the fragments you selected in step 2, write the regions corresponding to the first/last N bases to a file in BED format.
  5. Load the BED file from step 4 into an interval tree (there might be something in Biopython for this; in the worst case you can use deeptoolsintervals from deepTools).
  6. Use Biopython to iterate over the CpGs and query them for overlaps with the interval tree from step 5.
  7. Write output files appropriately.
  8. Compare them to what the EPIC 850K covers.
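
The steps above can be roughly sketched as follows. This is only an illustration under several assumptions: a single enzyme, MspI (C^CGG); forward-strand, 0-based coordinates on one sequence; and plain `re` and `bisect` from the standard library standing in for Biopython's restriction functions and for an interval tree such as deeptoolsintervals. All function names are hypothetical, and the BED-writing steps are skipped by keeping the intervals in memory.

```python
import re
from bisect import bisect_right


def digest(seq, site="CCGG", cut_offset=1):
    """Half-open (start, end) fragments flanked by a cut site on both sides.

    Assumes MspI (C^CGG): the cut falls one base into each recognition site.
    """
    cuts = [m.start() + cut_offset for m in re.finditer(site, seq.upper())]
    return list(zip(cuts[:-1], cuts[1:]))


def read_intervals(fragments, min_len=75, max_len=500, read_len=50):
    """Merged intervals for the first/last read_len bases of size-selected fragments."""
    ends = []
    for start, end in fragments:
        if min_len <= end - start <= max_len:
            ends.append((start, min(start + read_len, end)))
            ends.append((max(end - read_len, start), end))
    merged = []
    for s, e in sorted(ends):
        if merged and s <= merged[-1][1]:
            # Overlapping or touching: extend the previous interval.
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    return merged


def covered_cpgs(seq, merged):
    """0-based positions of the C of each CpG that lies inside a merged interval."""
    starts = [s for s, _ in merged]
    hits = []
    for m in re.finditer("CG", seq.upper()):
        i = bisect_right(starts, m.start()) - 1  # last interval starting at or before the CpG
        if i >= 0 and m.start() < merged[i][1]:
            hits.append(m.start())
    return hits


def array_only(array_positions, rrbs_positions):
    """Array CpGs (same coordinate system) absent from the predicted RRBS set."""
    return sorted(set(array_positions) - set(rrbs_positions))
```

In practice this would be run per chromosome on the reference genome, and the array positions would come from the manifest. Check whether the manifest coordinates are 0- or 1-based before taking the difference.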

Note that the EPIC 850K may give a ballpark estimate of all of this in their sales materials. I wouldn't be surprised if the EPIC 850K covers some CpGs that RRBS doesn't.

I agree that RRBS coverage will vary (but I think the number of shared sites at 10X coverage is a useful QC metric).

So, even if you find the conclusions are a little different for your own experiment, perhaps it is worth taking a look at this Carmona et al. 2017 paper?

5.6 years ago
Illinu ▴ 110

Hi Pin.Bioinf, if you are still interested, you can email services@diagenode.com and they can give you a list of all CpGs detected in human samples with the Diagenode Premium RRBS Kit. Best, Sol
