How to read the Hi-C results from the Bing Ren lab paper
10.0 years ago
jesselee516 ▴ 100

Hi all, I am reading the paper 'A high-resolution map of the three-dimensional chromatin interactome in human cells'. I downloaded their Hi-C data for the IMR90 cell line from GEO (GSM1154024); it describes interactions between genomic regions. I downloaded the .txt version. The lines in this file look like this:

HWI-ST216_0375:2:1206:9502:51416#0 chr1 148 - chr19 63789745 -
HWI-ST216_0375:2:1215:19504:87026#0 chr1 6071 - chr13 50507738 -
...

Reading the result above, I am confused. I understand that two regions are interacting, for example

(chr1 148 -) <-> (chr19 63789745 -). But an interaction should be between two regions, not two positions, and I do not know what 148 and 63789745 mean in this dataset. The format I am familiar with looks like (148-500) <-> (63789745-63789999), one region mapped to another. Could anyone help me understand how to read the results from this paper? Thanks.

gene next-gen-sequencing genome • 20k views
9.9 years ago
Gjain 5.8k

Hi Jesse,

To understand this paper, you first need to understand the 3C-based technologies and how the experiment is performed. Please have a look at this review, which will help you understand these technologies. Focus on the 3C part first and then move on to the Hi-C part.

A decade of 3C technologies: insights into nuclear organization

You should also read the original Hi-C paper to understand what a looping interaction means: Comprehensive mapping of long-range interactions reveals folding principles of the human genome

Overview of Hi-C.

(A) Cells are cross-linked with formaldehyde, resulting in covalent links between spatially adjacent chromatin segments (DNA fragments: dark blue, red; Proteins, which can mediate such interactions, are shown in light blue and cyan). Chromatin is digested with a restriction enzyme (here, HindIII; restriction site: dashed line, see inset) and the resulting sticky ends are filled in with nucleotides, one of which is biotinylated (purple dot). Ligation is performed under extremely dilute conditions to create chimeric molecules; the HindIII site is lost and a NheI site is created (inset). DNA is purified and sheared. Biotinylated junctions are isolated with streptavidin beads and identified by paired-end sequencing.

(B) Hi-C produces a genome-wide contact matrix. The submatrix shown here corresponds to intrachromosomal interactions on chromosome 14. Each pixel represents all interactions between a 1Mb locus and another 1Mb locus; intensity corresponds to the total number of reads (0-50). Tick marks appear every 10Mb.

(C, D) We compared the original experiment to a biological repeat using the same restriction enzyme (C, range: 0-50 reads) and to results with a different restriction enzyme (D, range: 0-100 reads, NcoI).

Topological domains come later; they are basically regions of the genome within which looping interactions between elements tend to occur.

I hope this helps a bit.

Hi Gjain, Thanks a lot. It did help me a lot.

I am happy to help.

9.9 years ago

My guess is that those are the coordinates of the corresponding HindIII sites that are interacting. Also note that the raw reads need to be processed: they are usually binned into 500 kb regions, and the resulting interaction matrix is normalized for biases and smoothed. See the Tanay group's web page for the pipeline and details.
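
To make the binning step concrete, here is a minimal Python sketch that counts read pairs per pair of 500 kb bins. It assumes each line has seven whitespace-separated fields (read name, then chromosome, position and strand for each end), which is only inferred from the two example lines in the question. The function names `parse_pair` and `bin_contacts` are made up for illustration, and the sketch does no bias normalization or smoothing, so it is not the pipeline from the paper or from the Tanay group.

```python
from collections import Counter

BIN_SIZE = 500_000  # 500 kb bins, as mentioned in the answer above


def parse_pair(line):
    """Split one line into (chrom1, pos1, strand1, chrom2, pos2, strand2).

    Assumed layout: read name, then chromosome / position / strand
    for each of the two ends.
    """
    _name, c1, p1, s1, c2, p2, s2 = line.split()
    return c1, int(p1), s1, c2, int(p2), s2


def bin_contacts(lines, bin_size=BIN_SIZE):
    """Count read pairs per unordered pair of (chromosome, bin index)."""
    counts = Counter()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        c1, p1, _, c2, p2, _ = parse_pair(line)
        a = (c1, p1 // bin_size)
        b = (c2, p2 // bin_size)
        # Sort the two ends so that A<->B and B<->A share one key.
        counts[tuple(sorted((a, b)))] += 1
    return counts


example = [
    "HWI-ST216_0375:2:1206:9502:51416#0 chr1 148 - chr19 63789745 -",
    "HWI-ST216_0375:2:1215:19504:87026#0 chr1 6071 - chr13 50507738 -",
]
contacts = bin_contacts(example)
for (end_a, end_b), n in sorted(contacts.items()):
    print(end_a, end_b, n)
```

With real data you would pass the open file handle instead of `example`. Each key then corresponds to one cell of the raw contact matrix, i.e. the region-to-region format the question asks about.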
