4.4 years ago
star
I downloaded a normalized Hi-C contact matrix at 10 kb resolution; a subset of the big table is shown below. The rows and columns should correspond to genomic coordinates in 10,000 bp bins (e.g. chr18 is 78,022,248 bp long, and the chr18 contact matrix is 7803 x 7803).
df_subset
V1 V2 V3 V4 V5 V6
1 0.000000 0.000 0.00000 93.25807 0.000000 823.45363
2 0.000000 0.000 0.00000 1063.53307 0.000000 0.00000
3 0.000000 0.000 0.00000 0.00000 0.000000 415.63635
4 93.258072 1063.533 0.00000 0.00000 0.000000 68.68992
5 0.000000 0.000 0.00000 0.00000 0.000000 0.00000
6 823.453631 0.000 415.63635 68.68992 0.000000 0.00000
I am a bit confused, as this is the first time I am working with Hi-C data. I would like to find the interactions between locations, and I have some questions:
- How should I assign the genomic locations (at 10 kb resolution)? Can I use seq(start, end, by=10000) and then set the result as the colnames and rownames?
- How can I tell which interactions are significant?
- Are the coordinates in the contact matrix sorted?
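To make the first question concrete, here is a minimal sketch of the seq() idea. It assumes (none of this is confirmed by the data source) that bins start at position 0, that the last bin is truncated at the chromosome end, and that rows and columns are stored in ascending genomic order. The toy matrix m and the helper fmt are made up for illustration; only the chr18 length and the bin size come from the post.

```r
# Assumptions: 0-based bin starts, ascending bin order, last bin truncated.
chr_len  <- 78022248   # chr18 length quoted above
bin_size <- 10000      # 10 kb resolution

bin_start <- seq(0, chr_len - 1, by = bin_size)
bin_end   <- pmin(bin_start + bin_size, chr_len)
n_bins    <- length(bin_start)   # should match nrow() and ncol() of the matrix

# Hypothetical helper: avoid labels such as "1e+05" for large coordinates.
fmt <- function(x) format(x, scientific = FALSE, trim = TRUE)

# Toy 6 x 6 matrix standing in for the first six bins of the real table.
m <- matrix(0, nrow = 6, ncol = 6)
m[1, 4] <- m[4, 1] <- 93.25807
m[1, 6] <- m[6, 1] <- 823.45363

labels <- paste0("chr18:", fmt(bin_start[1:6]), "-", fmt(bin_end[1:6]))
dimnames(m) <- list(labels, labels)

# An intrachromosomal contact matrix is expected to be symmetric.
is_sym <- isSymmetric(unname(m))
```

With the real table, as.matrix() on the data frame followed by the same dimnames() assignment over all n_bins labels would do the same thing, provided n_bins equals the matrix dimension. Whether the bins really are sorted is something only the file's documentation can confirm.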
Many thanks in advance!
Hi Star, I have just run into the same "significant contact calling" problem as you. Did you ever find an answer? Many thanks!