Question

Figuring out HiC restriction enzyme

1

6.7 years ago

gbdias ▴ 160

Hi,

Does anyone have a strategy for figuring out which restriction enzyme was used in a Hi-C experiment based on the Illumina reads alone?

Thanks

Hi-C HiC scaffolding • 4.1k views

ADD COMMENT • link updated 3.2 years ago by nkmalini97 • 0 • written 6.7 years ago by gbdias ▴ 160

score 5 · Accepted Answer · 2018-04-09

5

6.6 years ago

gbdias ▴ 160

I figured it out by checking the overrepresented k-mers in the FastQC report for my data.
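
For anyone who wants to look at read starts directly, here is a minimal sketch of the same idea. It is not FastQC's own k-mer algorithm, only a plain tally of the first k bases of each read. The function name `count_start_kmers` is hypothetical, and the code assumes standard four-line FASTQ records with unwrapped sequence lines.

```python
from collections import Counter


def count_start_kmers(fastq_lines, k=7, top=20):
    """Tally the k-mer found at the very start of each read.

    Assumes four-line FASTQ records (header, sequence, '+', qualities).
    """
    counts = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # second line of each record holds the sequence
            seq = line.strip().upper()
            if len(seq) >= k:
                counts[seq[:k]] += 1
    # Most frequent k-mers first; ties keep first-seen order
    return counts.most_common(top)


# Toy input; with real data, pass an open file handle instead,
# e.g. gzip.open("reads_R1.fastq.gz", "rt") (file name is illustrative).
toy = [
    "@r1", "GATCAAAC", "+", "IIIIIIII",
    "@r2", "GATCAAAG", "+", "IIIIIIII",
    "@r3", "TTTTTTTT", "+", "IIIIIIII",
]
print(count_start_kmers(toy, k=4))
```

If one prefix dominates the tally, that prefix is the candidate to compare against known restriction sites.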

ADD COMMENT • link 6.6 years ago by gbdias ▴ 160

0

Can you go into further detail on how you figured it out, gbdias? I am in the same situation you were in. There are 20 overrepresented 7-mers located near the start of my reads (for both forward and reverse reads). How do I use these data to deduce the restriction enzyme's cut site?

ADD REPLY • link 6.3 years ago by joneill4x ▴ 160

2

Check whether the first 2-3 bases are shared across the k-mers, then complete the palindrome from them by prepending their reverse complement. For example, if the k-mers tend to start with TC, then GATC is the cut site and DpnII was used.
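
The palindrome step can be sketched in a few lines of Python. The helper names and the small `KNOWN_SITES` table are assumptions for illustration only; the table lists just the two sites mentioned in this thread (MboI is added because it recognizes the same GATC site as DpnII) and is not a complete enzyme database.

```python
# Translation table for complementing unambiguous DNA bases
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Illustrative lookup only; extend with the enzymes relevant to your protocol
KNOWN_SITES = {
    "GATC": "DpnII / MboI",
    "AAGCTT": "HindIII",
}


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def palindrome_from_prefix(prefix):
    """Prepend the reverse complement so the result is its own reverse complement."""
    prefix = prefix.upper()
    return reverse_complement(prefix) + prefix


def guess_enzyme(prefix):
    """Build the candidate site from a shared k-mer prefix and look it up."""
    site = palindrome_from_prefix(prefix)
    return site, KNOWN_SITES.get(site, "unknown")


print(guess_enzyme("TC"))   # ('GATC', 'DpnII / MboI')
print(guess_enzyme("CTT"))  # ('AAGCTT', 'HindIII')
```

A prefix that maps to "unknown" may still be a real site; it only means the site is missing from this small table.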

ADD REPLY • link 6.3 years ago by Devon Ryan 104k

0

@gbdias I have tried this approach and have the FastQC reports. The k-mers I get already start with GATC or another enzyme's cut site (for example AGCTT for HindIII, A^AGCTT), so I do not need to build a palindrome from the first 2-3 common bases. Can I infer the enzyme from k-mers that mostly match a known cut site? Is there a reference or standard that confirms the overrepresented sequences contain the enzyme's restriction site? I would also like to know why the overrepresented sequences in particular are the ones to look at.

ADD REPLY • link 3.2 years ago by nkmalini97 • 0