I asked UCSC a similar question some years ago. Their answer suggests looking at the header information, or finding the provenance of the data that went into creating the original bigWig:
The original data that is used to generate a bigWig can come from different formats. There is bedGraph, which is zero-relative, and wiggle, which is 1-relative. In summary, if a bedGraph is used, the results from bigWigToWig will be the bedGraph zero-relative coordinates. What will be included in the output is a commented note, for example, "#bedGraph section chr1:10451-568419" at the head of the wgEncodeSydhTfbsK562Pol3StdSig file mentioned. Thus, the data is not re-indexed, unless you specify bigWigToBedGraph, then data will always return as 0-based bedGraph.
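To make the two conventions concrete, here is a small Python sketch of converting between them. It assumes the usual definitions: bedGraph intervals are 0-based and half-open, while wiggle positions are 1-based, with an optional span giving the number of bases covered. The function names are hypothetical and only for illustration.

```python
def bedgraph_to_one_based(start, end):
    """Convert a 0-based, half-open bedGraph interval to 1-based, inclusive."""
    return start + 1, end


def wiggle_to_bedgraph(position, span=1):
    """Convert a 1-based wiggle position (covering `span` bases) to a
    0-based, half-open bedGraph interval."""
    start = position - 1
    return start, start + span


# The first ten bases of a chromosome in both conventions.
print(bedgraph_to_one_based(0, 10))       # (1, 10)
print(wiggle_to_bedgraph(1, span=10))     # (0, 10)
```

Note that only the start coordinate shifts; the end of a half-open interval equals the last 1-based position it covers.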
Most ENCODE data, such as the information you were looking at, originated from a BAM that was processed through steps like bamToBed (file.bam -> file.bedGraph) followed by bedGraphToBigWig (-> file.bw). Thus, there is no problem with this file; it should be what you see when looking at most BAM-originated bigWig files from the ENCODE project.
As to your last question, it is best not to assume that all bigWigs are indexed the same way: some will be from bedGraphs and some from wigs, depending on their originating files. Most likely, though, all ENCODE data will exit bigWigToWig as bedGraphs, since they were probably encoded as bedGraphs from BAMs.
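Since the section type shows up in the text that bigWigToWig writes, one way to check a given file is to count the section markers in that output. The sketch below assumes the markers are the "#bedGraph section" comment quoted above plus the standard "variableStep" and "fixedStep" declaration lines of the wiggle format; the function name and the example data lines are made up for illustration.

```python
from collections import Counter


def count_section_types(wig_lines):
    """Count bedGraph, variableStep and fixedStep sections in bigWigToWig text output."""
    counts = Counter()
    for line in wig_lines:
        if line.startswith("#bedGraph section"):
            counts["bedGraph"] += 1      # data lines that follow are 0-based
        elif line.startswith("variableStep"):
            counts["variableStep"] += 1  # data lines that follow are 1-based
        elif line.startswith("fixedStep"):
            counts["fixedStep"] += 1     # start= is 1-based
    return counts


# Hypothetical output mixing two section types.
example = [
    "#bedGraph section chr1:10451-568419",
    "chr1\t10451\t10476\t1",
    "fixedStep chrom=chr2 start=101 step=1 span=1",
    "0.5",
]
print(dict(count_section_types(example)))  # {'bedGraph': 1, 'fixedStep': 1}
```

If more than one key comes back, the file mixes conventions and each section has to be interpreted on its own.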
Here is further background information. There are two bigWig encoders, bedGraphToBigWig and wigToBigWig, that can take bedGraph or the two wiggle types, variableStep and fixedStep. Then there are two ways back: bigWigToBedGraph and bigWigToWig. If you wish to explore with these formats, please see these pages, the last being the location for obtaining precompiled binaries:
Thank you for this information, I have this exact same problem. Do you know of a tool to access the header of a bigWig file?
Using Devon Ryan's Python library may help ( https://github.com/deeptools/pyBigWig ). Once installed:
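A minimal sketch of what this can look like. `pyBigWig.open`, `header()`, `chroms()` and `close()` are part of the library; the helper names `format_header` and `read_bigwig_header` are hypothetical, and pyBigWig is imported inside the function so the formatting helper can be tried without the library installed.

```python
def format_header(header, chroms):
    """Render pyBigWig's header and chromosome dictionaries as text lines."""
    lines = [f"{key}\t{value}" for key, value in sorted(header.items())]
    lines += [f"chrom\t{name}\t{length}" for name, length in sorted(chroms.items())]
    return lines


def read_bigwig_header(path):
    """Open a bigWig file with pyBigWig and return its header as text lines."""
    import pyBigWig  # third-party package, e.g. `pip install pyBigWig`

    bw = pyBigWig.open(path)
    try:
        # header() gives summary fields such as version, minVal and maxVal;
        # chroms() maps chromosome names to their lengths.
        return format_header(bw.header(), bw.chroms())
    finally:
        bw.close()
```

Calling `print("\n".join(read_bigwig_header("my.bigWig")))` would then print one field per line, where "my.bigWig" is a placeholder file name.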
Thank you! Does this require loading the whole file (in the step `bw = pyBigWig.open("my.bigWig")`)? Sorry for the question, I have no experience in Python. I was looking for something like `samtools view` to pipe to `head`, but for bigWig, so that I can avoid loading the file.
Not sure about the answer to your first question, but the second seems straightforward. Create a text file called `readBigWigHeader.py` and add code that prints the header. Make the script executable (`chmod +x ./readBigWigHeader.py`), then run it to get the header sent to the standard output stream.
Ok, thank you for the comprehensive help.
It doesn't read the whole file in, it just reads in the parts needed like samtools. Please note that there's nothing in the header that indicates whether the underlying data is 1 or 0-based. This can actually change per-chunk within a bigWig file so there's really nothing to look at to know. As a general rule of thumb, it's best to assume that bigWig files are 0-based, since 1-based bigWig files are a terrible idea that should never have been allowed.
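To illustrate why the whole file does not need to be read, here is a stdlib-only sketch that parses just the first 64 bytes of a bigWig file. It assumes the fixed-width header layout used by UCSC's bigWig/bigBed format (magic number 0x888FFC26 for bigWig, in either byte order); the field names follow that layout as best understood here, and the function names are hypothetical. This is not how pyBigWig is implemented internally, only a sketch of the idea.

```python
import struct

BIGWIG_MAGIC = 0x888FFC26
HEADER_FORMAT = "IHHQQQHHQQIQ"  # 64 bytes when packed without padding
HEADER_FIELDS = (
    "magic", "version", "zoomLevels", "chromTreeOffset",
    "fullDataOffset", "fullIndexOffset", "fieldCount", "definedFieldCount",
    "autoSqlOffset", "totalSummaryOffset", "uncompressBufSize", "extensionOffset",
)


def parse_fixed_header(raw):
    """Parse the 64-byte fixed header, trying both byte orders."""
    if len(raw) < 64:
        raise ValueError("need at least 64 bytes")
    for byte_order in ("<", ">"):
        values = struct.unpack(byte_order + HEADER_FORMAT, raw[:64])
        if values[0] == BIGWIG_MAGIC:
            return dict(zip(HEADER_FIELDS, values))
    raise ValueError("not a bigWig file (magic number mismatch)")


def read_fixed_header(path):
    """Read only the first 64 bytes of the file at `path` and parse them."""
    with open(path, "rb") as handle:
        return parse_fixed_header(handle.read(64))
```

None of these fields records a coordinate base, which matches the point above: the header alone cannot tell you whether the underlying data was 0-based or 1-based.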