I'm trying to use the new Hi-C analysis package HiFive http://bxlab-hifive.readthedocs.org/en/latest/fragment_handling.html.
Unfortunately I can't get past the first step - creating the fend (fragment end) object.
The documentation says you can input either a BED file or a HiCPipe-compatible file containing the chromosome, start, and end of every restriction fragment in the genome.
I used Bowtie2 and then HiCup_Digester http://www.bioinformatics.babraham.ac.uk/projects/hicup/quickstart/ to create a file containing information about these fragments. This file was used successfully later in the HiCup pipeline, so I don't think there's anything wrong with it.
I then removed the extra columns and the header so that the file could be read as a BED file. The first few lines are shown below; the file is tab-delimited:
chr1 1 16007
chr1 16008 24571
chr1 24572 27981
chr1 27982 30429
chr1 30430 32153
chr1 32154 32774
chr1 32775 37752
chr1 37753 38369
chr1 38370 38791
chr1 38792 39255
I then tried to create a fend object from this file:
import hifive
fend = hifive.Fend('test_fend', mode='w')
fend.load_fends('2015_09_02_GRCh38_HindIII_fend.bed', genome_name='GRCh38', re_name='HindIII', format='bed')
fend.save()
But I get this error:
File ".../hifive-1.0-py2.7-linux-x86_64.egg/hifive/fend.py", line 259, in _load_from_bed
fends['start'][pos:(pos + data_len)] = data[chrom]['start'][:]
ValueError: could not broadcast input array from shape (65299) into shape (130598)
I then tried adding column names 'chr' 'start' 'stop' just in case headers were needed, but I got exactly the same error.
Any insight would be much appreciated! It looks like a great package and I'd really like to use it.
Hi,
I took the digested genome file from HiCUP and extracted the chr, start, and end fields. I then added the following columns: name, score, strand, gc, and mappability. The name, score, and strand columns contained only "." in every row, and the gc and mappability columns contained only "0,0". I then ran
hifive fends -B testfends.bed test.fends
and it worked.
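In case it helps, here is a rough Python sketch of the column-padding step described above. The function names and the constant are made up for illustration and are not part of HiFive or HiCUP. It assumes the input already holds only chr, start, and end on each line, and it writes no header line, since I have not checked whether one is needed.

```python
# Placeholder values for the five extra columns described above:
# name, score, strand, gc, mappability.
# (Illustrative constant; not part of HiFive or HiCUP.)
PLACEHOLDERS = [".", ".", ".", "0,0", "0,0"]


def pad_fragment_line(line):
    """Return one tab-delimited line: chr, start, end plus placeholders."""
    fields = line.split()  # accepts tabs or spaces between fields
    if len(fields) < 3:
        raise ValueError("expected chr, start, end but got: %r" % line)
    return "\t".join(fields[:3] + PLACEHOLDERS)


def pad_fragment_file(in_path, out_path):
    """Pad every non-blank line of in_path and write the result to out_path.

    Returns the number of fragment lines written.
    """
    count = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.strip():  # skip blank lines
                dst.write(pad_fragment_line(line) + "\n")
                count += 1
    return count
```

Calling pad_fragment_file('2015_09_02_GRCh38_HindIII_fend.bed', 'testfends.bed') should then give a file that can be passed to the hifive fends command above.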