How to use Hi-C data?
4.7 years ago

Hi

I'm very new to Hi-C and I'm not quite sure how to approach these data, as they look a bit stranger than what I'm used to from e.g. ChIA-PET data. In ChIA-PET data you get the regions that are interacting, but all the Hi-C data I have seen so far are presented like this (after the raw sequencing data have been processed by e.g. HiC-Pro):

HWI-D00119:283:HVWGBCXX:2:2113:7308:28159 chr1 10032 - chr13 47215551 -
HWI-D00119:284:HVWGBCXX:2:2205:16386:19423 chr1 10035 - chr7 13791889 -

As I understand these data, position 10032 on chr1 interacts with position 47215551 on chr13, but shouldn't each end be a region, with an interaction start and an interaction end, instead of a single position?

E.g. chr1 10032 10050 chr13 47215551 47215600

Is the position from the Hi-C where the biotin-labeled nucleotide is found on each restriction fragment?

Is there any way to convert these data into BED files displaying these interactions? What I would very much like to do with these data is to see whether the CpGs on my list, at one end of an interaction, are interacting with a gene at the other end.

Thanks in advance!


Hi everyone

I read that Framtid1994 did a ChIA-PET analysis, and I have a question about how to recognize the linker sequences in the FASTQ file. I read some of the ENCODE procedures, but I could not find this information there. Can someone help me understand this? I am trying to use the Mango pipeline on some ENCODE ChIA-PET data for CTCF and RAD21, and like all ChIA-PET analyses it needs the linker sequence. Sorry to ask this in this post, but I think you can help me understand this.

4.6 years ago

It's too late to answer, but this may be useful to others.

These are just the positions of the reads (validPairs) of a ligated fragment. They cannot be considered real "interactions" unless you call statistically significant interactions. Usually the genome is binned (5 kb, 10 kb, etc.), the reads overlapping each bin are counted, and those counts then undergo statistical testing. Hi-C is not at base-pair resolution, so an interaction is always a window whose size depends on the approximate resolution of the data.
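To make the binning step concrete, here is a minimal sketch in Python. It assumes the seven whitespace-separated columns shown in the question (read name, chr, position, strand, chr, position, strand). The function names `bin_valid_pairs` and `to_bedpe` and the 10 kb bin size are my own choices for illustration, not part of HiC-Pro. It only produces raw counts per bin pair in a BEDPE-like layout; it does no normalization or significance testing, so the rows are still not "interactions" in the statistical sense.

```python
from collections import Counter


def bin_valid_pairs(lines, bin_size=10_000):
    """Count read pairs per pair of fixed-size genomic bins.

    Each line is assumed to hold: name chr1 pos1 strand1 chr2 pos2 strand2
    (extra trailing columns are ignored).
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or malformed lines
        chrom1, pos1 = fields[1], int(fields[2])
        chrom2, pos2 = fields[4], int(fields[5])
        # Round each position down to the start of its bin.
        end_a = (chrom1, (pos1 // bin_size) * bin_size)
        end_b = (chrom2, (pos2 // bin_size) * bin_size)
        # Sort the two ends so A-B and B-A land in the same bin pair.
        counts[tuple(sorted((end_a, end_b)))] += 1
    return counts


def to_bedpe(counts, bin_size=10_000):
    """Format bin-pair counts as tab-separated BEDPE-like rows."""
    rows = []
    for ((c1, s1), (c2, s2)), n in sorted(counts.items()):
        rows.append(
            f"{c1}\t{s1}\t{s1 + bin_size}\t{c2}\t{s2}\t{s2 + bin_size}\t{n}"
        )
    return rows


example = [
    "HWI-D00119:283:HVWGBCXX:2:2113:7308:28159 chr1 10032 - chr13 47215551 -",
    "HWI-D00119:284:HVWGBCXX:2:2205:16386:19423 chr1 10035 - chr7 13791889 -",
]

for row in to_bedpe(bin_valid_pairs(example)):
    print(row)
```

Once the bin pairs have been filtered for significance with a dedicated tool, a table like this can be intersected with CpG and gene coordinates, one end at a time, to answer the question in the original post.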

HiC-Pro also has hicpro2juicebox.sh for Juicebox compatibility.
