6.6 years ago
gbdias
Hi,
Does anyone have a strategy for figuring out which restriction enzyme was utilized in a HiC experiment based on the Illumina reads alone?
Thanks
Can you go into further detail on how you figured it out, gbdias? I am in the same situation as you were. There are 20 over-represented 7-mers located near the start of my reads (both forward and reverse). How do I use this data to deduce the restriction enzyme cut site?
Check whether the first 2-3 bases are common across the k-mers, then make a palindrome from that prefix. For example, if the k-mers tend to start with TC, then GATC is the cut site and DpnII was used.
@gbdias I have tried this approach. I have the FastQC reports. I get k-mers that start with GATC or another enzyme's cut site (for example, HindIII: AGCTT, from A^AGCTT) without having to build a palindrome from the first 2-3 common bases. Can I infer the enzyme from the k-mers, given that most of them match the expected cut sites? Is there a reference or standard that confirms the overrepresented sequences contain the enzyme's restriction-site k-mers? I would also like to know why we look at the overrepresented sequences in particular.
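The prefix-to-palindrome heuristic suggested in the answer above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the k-mer list is made up, the function names are hypothetical, and the lookup table contains just the one site named in this thread (GATC for DpnII). It prepends the reverse complement of the most common prefix to that prefix, which is one way to read "make a palindrome of that".

```python
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def most_common_prefix(kmers, n=2):
    """Return the most frequent n-base prefix among the k-mers."""
    counts = Counter(k[:n].upper() for k in kmers if len(k) >= n)
    return counts.most_common(1)[0][0]


def palindrome_from_prefix(prefix):
    """Build a candidate palindromic site: revcomp(prefix) + prefix."""
    return reverse_complement(prefix) + prefix.upper()


# Assumption: a tiny lookup holding only the site mentioned in the thread.
KNOWN_SITES = {"GATC": "DpnII"}

# Made-up k-mers for illustration; substitute the ones from your FastQC report.
kmers = ["TCAGGTA", "TCTTGCA", "TCGGATA", "ACGTTGA"]

prefix = most_common_prefix(kmers, n=2)
site = palindrome_from_prefix(prefix)
print(prefix, site, KNOWN_SITES.get(site, "unknown"))
```

If your k-mers already begin with a full site such as GATC, as described in the follow-up, the palindrome step is unnecessary and you can compare the k-mer prefixes against known sites directly.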