.mcool to .hic
5.9 years ago
dimitrischat ▴ 210

Hi. I am trying to work out a pipeline for Hi-C data analysis: https://hms-dbmi.github.io/hic-data-analysis-bootcamp/#1

However, I would prefer to use Juicebox for visualization. I am at the point where I have an .mcool file (cooler) and I need a .hic file so I can load it in Juicebox. Does anyone know how to convert it?

I tried using the HiGlass web-based app, but it needs .json files and I have no idea how to proceed.

RNA-Seq • 15k views

I am one of the main developers of HiGlass, and I agree that our setup could and should be easier. Where specifically do you get stuck? Also, feel free to join our slack channel (http://bit.ly/higlass-slack) if you need quick help.


Maybe you can try pairLiftOver: https://github.com/XiaoTaoWang/pairLiftOver

3.8 years ago

I've put together a slightly hacky and long-winded way to go from .cool to .hic.

Firstly, use the hicConvertFormat tool from the HiCExplorer package (which I installed using conda) to convert your .cool file into a ginteractions file:

hicConvertFormat -m /path/to/file.cool -o /path/to/file.ginteractions --inputFormat cool --outputFormat ginteractions

This will create the file file.ginteractions.tsv. Next, we will do some format preparation on this file to make it compatible with juicer pre. Documentation on the input formats accepted can be found here: https://github.com/aidenlab/juicer/wiki/Pre . I opted to make the file into short format with score, which has columns like:

<str1> <chr1> <pos1> <frag1> <str2> <chr2> <pos2> <frag2> <score>

The ginteractions file does not contain fragment or strand information, so I put dummy variables for those (since they are not used for the conversion to .hic anyway) and made sure that the dummy variables for frag1 and frag2 were different, using the following awk command:

awk -F "\t" '{print 0, $1, $2, 0, 0, $4, $5, 1, $7}' file.ginteractions.tsv > file.ginteractions.tsv.short
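To make the column mapping concrete, here is a small sketch that runs the same awk command on a made-up two-line ginteractions file. The file name, temporary directory and counts are hypothetical; only the awk command itself comes from the answer above.

```shell
# Hypothetical tab-separated sample: chr1 start1 end1 chr2 start2 end2 score
tmpdir=$(mktemp -d)
printf 'chr1\t0\t10000\tchr1\t10000\t20000\t5\n'  > "$tmpdir/example.ginteractions.tsv"
printf 'chr1\t0\t10000\tchr2\t30000\t40000\t2\n' >> "$tmpdir/example.ginteractions.tsv"

# Same awk command as above: dummy strand (0), dummy fragments (0 and 1),
# start coordinates of each bin as positions, score kept in the last column
short=$(awk -F "\t" '{print 0, $1, $2, 0, 0, $4, $5, 1, $7}' "$tmpdir/example.ginteractions.tsv")
printf '%s\n' "$short"
# prints:
# 0 chr1 0 0 0 chr1 10000 1 5
# 0 chr1 0 0 0 chr2 30000 1 2
```

The end coordinates of each bin (columns 3 and 6 of the input) are dropped, since the short format only carries one position per side.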

Sometimes this file will need to be sorted first, because juicer pre expects all records for a given chromosome pair to sit together in one block. You can sort it with:

sort -k2,2d -k6,6d file.ginteractions.tsv.short > file.ginteractions.tsv.short.sorted
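If you want to check whether sorting is needed before running juicer pre, a rough sketch like the following may help. The function name check_blocks and the sample records are hypothetical, and the check only looks for a chromosome pair that reappears after a different pair; it does not reproduce any other validation juicer pre performs.

```shell
# Hypothetical helper: succeeds only if every chromosome pair (columns 2 and 6)
# forms a single contiguous block in a short-format file
check_blocks() {
  awk '
    BEGIN { bad = 0 }
    {
      pair = $2 "_" $6
      if (pair != prev) {
        if (pair in seen) {
          print "pair " pair " appears in more than one block (line " NR ")"
          bad = 1
          exit
        }
        seen[pair] = 1
        prev = pair
      }
    }
    END { exit bad }
  ' "$1"
}

# Made-up records in which the chr1/chr2 pair is split into two blocks
tmpdir=$(mktemp -d)
printf '%s\n' \
  '0 chr1 0 0 0 chr2 30000 1 2' \
  '0 chr1 0 0 0 chr1 10000 1 5' \
  '0 chr1 10000 0 0 chr2 50000 1 1' > "$tmpdir/example.short"

check_blocks "$tmpdir/example.short" || echo "needs sorting"

sort -k2,2d -k6,6d "$tmpdir/example.short" > "$tmpdir/example.short.sorted"
check_blocks "$tmpdir/example.short.sorted" && echo "blocks are contiguous"
```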

I downloaded juicer tools from here: https://github.com/aidenlab/juicer/wiki/Download and set the following alias. You may need to increase the memory allocated to the JVM (the -Xmx value) for very large files:

alias juicer='java -Xms512m -Xmx2048m -jar path/to/juicer_tools_1.22.01.jar'

Converting the short format with score file is then done with the following:

juicer pre -r 10000,20000,50000,100000,250000,500000,1000000 /path/to/file.ginteractions.tsv.short.sorted /path/to/file.ginteractions.tsv.short.sorted.hic /path/to/chrom.sizes

Where the chrom.sizes file contains two columns: <chrom> <chrom size>. The -r flag here specifies the resolutions you would like your .hic file to include.
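If you do not already have a chrom.sizes file, one common way to get one is to take the first two columns of a FASTA index (.fai). The sketch below uses a made-up index with hypothetical chromosome names and lengths; in practice the index would come from your own reference (for example via samtools faidx), and the chromosome names should match the ones in your interactions file.

```shell
# Hypothetical .fai index: name, length, offset, linebases, linewidth
tmpdir=$(mktemp -d)
printf 'chrA\t1000\t6\t60\t61\n'   > "$tmpdir/genome.fa.fai"
printf 'chrB\t500\t1030\t60\t61\n' >> "$tmpdir/genome.fa.fai"

# chrom.sizes only needs the name and length columns
cut -f1,2 "$tmpdir/genome.fa.fai" > "$tmpdir/chrom.sizes"
cat "$tmpdir/chrom.sizes"
```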


Hi Charlotte, great to find you there! 🌈 You saved my day 🦄 Your solution worked well on the .cool files produced by the nf-core/hic pipeline.


Hi, it reports the error: "Error: the chromosome combination 1_2 appears in multiple blocks"


Hi, yes, sorry: sometimes the file needs sorting first, as seen here: https://groups.google.com/g/3d-genomics/c/2w1OGHo5XdM . I will adjust my answer accordingly.

5.9 years ago

You can load your mcool files in your own HiGlass server that you can launch on your local machine using higlass-manage as explained in their documentation:

http://higlass.io/docs

https://github.com/higlass/higlass-manage

I agree that it is not easy to set up, but it is worth the trouble.

As for .mcool to .hic: only the reverse direction (.hic to .cool or .mcool) is supported, by hic2cool. I don't think there is an official way of converting .cool or .mcool to .hic right now, which is very unfortunate. So, sadly, you have to start from scratch with Juicer if you want to use their visualisation tools. One option might be to build the .hic file from your mapped reads with their Pre tool:

https://github.com/aidenlab/juicer/wiki/Pre#file-format

In any case, people currently seem to be switching to HiGlass to visualize Hi-C data. The setup is more complicated than Juicebox, but the outcome seems better in certain ways.

As an alternative, you can use HiCBrowser to visualise cool files directly. There is an example here:

http://chorogenome.ie-freiburg.mpg.de/

For further visualisation and data integration to make figures ready to publish you can use pyGenomeTracks (although the views are static):

https://github.com/deeptools/pyGenomeTracks

For long-range interactions, pyGenomeTracks and HiCBrowser are not the best choice; HiGlass and Juicer are better for that. If you don't mind static views, you can also use hicPlotMatrix from HiCExplorer, which is similar to cooler show:

https://hicexplorer.readthedocs.io/en/latest/content/tools/hicPlotMatrix.html#hicplotmatrix

15 months ago
Ben B ▴ 50

I know this is an older post by now but I recently had to convert .mcool back to .hic and found it to be kind of a pain, so I wanted to provide my script in case it can be helpful to anyone else here. This extends Charlotte's answer to .mcools, which need to be handled a bit differently from cools.

Instead of HiCExplorer, I'm using Cooler to extract the interactions from the .mcool at the highest resolution available, then converting those interactions to .hic with JuicerTools.

#!/bin/bash

# Set the path to the input .mcool file
input_mcool=$1

# Set the path to the output .hic file
output_hic=${input_mcool%.*}.hic

# Set the path to the chrom.sizes file
chrom_sizes=~/ref_annots/hg19.chrom.sizes

# Set the path to the juicer_tools jar file
juicer_tools_jar=juicer_tools_1.22.01.jar


# Get the resolutions stored in the .mcool file
resolutions=$(h5ls -r $input_mcool | grep -Eo 'resolutions/[0-9]+' | cut -d '/' -f 2 | sort -n | uniq)
echo $resolutions
highest_res=$(echo $resolutions | tr ' ' '\n' | head -n 1)
echo "highest resolution: $highest_res"

# Use Cooler to write the .mcool matrix as interactions in bedpe format
output_bedpe="${input_mcool%.mcool}.${highest_res}.bedpe"
# Print the exact command that is about to run
echo "cooler dump --join $input_mcool::/resolutions/$highest_res"
cooler dump --join "$input_mcool::/resolutions/$highest_res" > "$output_bedpe"

# Convert the bedpe file to short format with score using awk
awk -F "\t" '{print 0, $1, $2, 0, 0, $4, $5, 1, $7}' ${output_bedpe} > ${output_bedpe}.short

# Sort the short format with score file
sort -k2,2d -k6,6d ${output_bedpe}.short > ${output_bedpe}.short.sorted

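# Convert the short format with score file to .hic using juicer pre
# Note: the input is already binned at $highest_res, so any -r value finer than that adds no real detail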
java -Xms512m -Xmx2048m -jar $juicer_tools_jar pre -r 1000,2000,5000,10000,20000,50000,100000,250000,500000,1000000 ${output_bedpe}.short.sorted $output_hic $chrom_sizes
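
Since the -r list in the script is hard-coded, here is a rough sketch of one way to derive it from the base resolution instead, keeping only resolutions at or coarser than the bin size that was dumped. The variable names base_res, wanted and res_list and the sample resolutions are hypothetical; the sort/uniq/head step mirrors the one in the script.

```shell
# Assumed result of the h5ls/grep/cut step: one resolution per line, unsorted
resolutions='10000
5000
100000
5000'

# Numeric sort matters here: a plain lexical sort would place 10000 before 5000
base_res=$(printf '%s\n' "$resolutions" | sort -n | uniq | head -n 1)

# Candidate resolutions, taken from the -r list in the script
wanted="1000 2000 5000 10000 20000 50000 100000 250000 500000 1000000"

# Keep only the candidates that are not finer than the base resolution
res_list=$(
  for r in $wanted; do
    if [ "$r" -ge "$base_res" ]; then
      echo "$r"
    fi
  done | paste -sd, -
)

printf 'base resolution: %s\n' "$base_res"
printf 'resolutions for juicer pre: %s\n' "$res_list"
# prints:
# base resolution: 5000
# resolutions for juicer pre: 5000,10000,20000,50000,100000,250000,500000,1000000
```

The resulting string could then be passed as the value of -r in the juicer pre call.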
2.4 years ago
zsq.phy ▴ 20

I wrote a simple script, cool2hic.py, for converting cooler files to .hic files.

https://github.com/zsq-berry/3D-genome-tools
