Hello! Any tool for downstream analysis of Hi-C sequencing data requires the restriction enzyme that was used to generate the Hi-C library. Is there any way to infer it from the sequencing data itself? Can I get this information from the FASTQ files?
Hi,
DpnII was used in my case, and I had 'GATCGATC' ligation sites in my reads.
A good way to check this is to map your reads against a reference genome and see whether the mapped reads are split at GATC-GATC (or at the junction of another enzyme site). The overhangs of the enzyme cut sites are filled in and then ligated, which duplicates the overhang at the junction, so you can determine the enzyme site this way. Or you can ask the supplier of the data, of course ;-)
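If you want a quick look at the FASTQ files before mapping anything, a rough sketch along these lines may help. It counts the fraction of reads that contain the filled-in ligation junction of a few common Hi-C enzymes. The function names and the candidate list are my own assumptions, not part of any Hi-C tool; the list is not exhaustive, and it assumes standard four-line FASTQ records. Shorter junctions such as GATCGATC also turn up by chance more often than the 10-mers, so treat the result as a hint rather than proof.

```python
import gzip
from collections import Counter
from itertools import islice

# Candidate enzymes and their filled-in ligation junctions (illustrative,
# not exhaustive). The junction is the recognition site with the 5' overhang
# duplicated, e.g. ^GATC -> GATC + GATC.
JUNCTIONS = {
    "DpnII/MboI (GATC)": "GATCGATC",
    "HindIII (AAGCTT)": "AAGCTAGCTT",
    "NcoI (CCATGG)": "CCATGCATGG",
}


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def read_sequences(lines):
    """Yield the sequence line (2nd of every 4) from an iterable of FASTQ lines."""
    return (line.strip().upper() for line in islice(lines, 1, None, 4))


def junction_fractions(seqs, junctions=JUNCTIONS, max_reads=1_000_000):
    """Return, per candidate enzyme, the fraction of reads containing its junction."""
    hits = Counter()
    total = 0
    for seq in islice(seqs, max_reads):  # a subsample is usually enough
        total += 1
        for name, motif in junctions.items():
            if motif in seq:
                hits[name] += 1
    return {name: (hits[name] / total if total else 0.0) for name in junctions}


# Example usage ("sample_R1.fastq.gz" is a hypothetical file name):
# with open_fastq("sample_R1.fastq.gz") as fh:
#     for name, frac in junction_fractions(read_sequences(fh)).items():
#         print(f"{name}\t{frac:.4f}")
```

In a real Hi-C library the junction of the enzyme that was actually used should stand out clearly from the others. If none of the candidates stands out, the library may have been made with a different enzyme or a multi-enzyme kit, and asking the data provider is the safer route.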
@mvk Thank you so much for clarifying. Is there any way I can get it from the sequences in the FASTQ files themselves? Knowing the enzyme is generally essential from the very start of a Hi-C analysis. I also have to give the ligation site as a parameter to some tools. I create a BED file of the digested reference genome based on the restriction enzyme used, then pass it along with the data to a tool that does the mapping and reports the valid pairs. So I would like to know whether I can get any idea about the enzyme from the FASTQ files alone. Also, which mapping tool can be used if I have to follow the approach you mentioned?
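For readers following along, the digestion step described above can be sketched roughly as below. This is only an illustration under stated assumptions, not the method of any particular pipeline: `digest` and `fragments_to_bed` are hypothetical helper names, the sequence is passed in as a plain string, and `cut_offset` is the cut position within the recognition site (0 for DpnII, ^GATC; 1 for HindIII, A^AGCTT). Established Hi-C pipelines usually ship their own digestion script, which is the better choice for real data.

```python
import re


def digest(seq, site, cut_offset):
    """Return restriction fragments of seq as 0-based, half-open (start, end) pairs.

    site is the recognition sequence; cut_offset is the cut position inside it.
    Only the forward strand is scanned, which is sufficient for palindromic sites.
    """
    seq = seq.upper()
    # A lookahead pattern finds every occurrence, including overlapping ones.
    pattern = f"(?={re.escape(site.upper())})"
    cuts = [m.start() + cut_offset for m in re.finditer(pattern, seq)]
    bounds = [0] + cuts + [len(seq)]
    # Drop empty fragments (e.g. a site at the very start of the sequence).
    return [(s, e) for s, e in zip(bounds, bounds[1:]) if e > s]


def fragments_to_bed(chrom, seq, site="GATC", cut_offset=0):
    """Format the fragments of one chromosome as BED lines (chrom, start, end, name)."""
    frags = digest(seq, site, cut_offset)
    return [f"{chrom}\t{s}\t{e}\tfrag{i}" for i, (s, e) in enumerate(frags, 1)]


# Example usage on a toy sequence:
# for line in fragments_to_bed("chr1", "AAGATCTTTGATCCC"):
#     print(line)
```

The BED coordinates are 0-based and half-open, as the BED format expects, so consecutive fragments share a boundary at each cut position.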
Figuring out HiC restriction enzyme