Hi to all,
Can there be overlaps in a bigWig or bigBed file?
Thanks in advance, Burcak
Yes.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2922891/
A bigBed file can contain overlapping intervals.
You can use Nezar Abdennur's pybbi library to check this for yourself, for regions of interest:
$ source activate some_environment
(some_environment) $ pip install pybbi
...
(some_environment) $ python
...
Then look for overlaps at a position (or a range):
>>> import bbi
>>> bbi.fetch("measures.optimal.noDupes.LRcheck.bed9.bigBed", "chr3", 25431691, 25431692)
array([8.])
This tells you that there are eight overlapping elements in the specified bigBed file at the single-base position chr3:25431691-25431692. (Because it is single-base, those eight elements must necessarily overlap.)
If you want to know what they are:
>>> list(bbi.fetch_intervals("measures.optimal.noDupes.LRcheck.bed9.bigBed", "chr3", 25431691, 25431692))
[('chr3', 25431673, 25431733, '0.636|TGTGACAACTTCTTGGCCTTTTTCCCTGTCTTTTCTCCCTACTCAACACATCAAAAGGAA', '0', '+', '25431673', '25431733', '254,218,121'), ('chr3', 25431675, 25431735, '0.705|TGACAACTTCTTGGCCTTTTTCCCTGTCTTTTCTCCCTACTCAACACATCAAAAGGAAAA', '0', '+', '25431675', '25431735', '253,175,74'), ('chr3', 25431682, 25431742, '0.864|TTCTTGGCCTTTTTCCCTGTCTTTTCTCCCTACTCAACACATCAAAAGGAAAAAGAAAAA', '0', '+', '25431682', '25431742', '233,39,31'), ('chr3', 25431683, 25431742, '0.864|TCTTGGCCTTTTTCCCTGTCTTTTCTCCCTACTCAACACATCAAAAGGAAAAAGAAAAA', '0', '+', '25431683', '25431742', '233,39,31'), ('chr3', 25431683, 25431743, '0.864|TCTTGGCCTTTTTCCCTGTCTTTTCTCCCTACTCAACACATCAAAAGGAAAAAGAAAAAA', '0', '+', '25431683', '25431743', '233,39,31'), ('chr3', 25431685, 25431744, '0.886|TTGGCCTTTTTCCCTGTCTTTTCTCCCTACTCAACACATCAAAAGGAAAAAGAAAAAAA', '0', '+', '25431685', '25431744', '222,22,29'), ('chr3', 25431686, 25431744, '0.905|TGGCCTTTTTCCCTGTCTTTTCTCCCTACTCAACACATCAAAAGGAAAAAGAAAAAAA', '0', '+', '25431686', '25431744', '210,14,32'), ('chr3', 25431691, 25431744, '0.974|TTTTTCCCTGTCTTTTCTCCCTACTCAACACATCAAAAGGAAAAAGAAAAAAA', '0', '+', '25431691', '25431744', '152,0,38')]
All you have to do is iterate over single-base positions across your genome of interest (or some sensible subset) and note where bbi.fetch returns a value greater than one.
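As a rough sketch of that scan: the helper below takes a sequence of per-base depths and reports the half-open runs where more than one element covers a base. The name overlap_runs and its parameters are made up for illustration. It also assumes that a call such as bbi.fetch(path, chrom, start, end) over a wider range returns one depth value per base, as the single-base call above does; check that against the pybbi documentation before relying on it.

```python
def overlap_runs(coverage, region_start, threshold=1.0):
    """Return half-open (start, end) runs where depth exceeds threshold.

    coverage: sequence of per-base depths (assumed: one value per base).
    region_start: genomic coordinate of coverage[0].
    """
    runs = []
    run_start = None
    for offset, depth in enumerate(coverage):
        if depth > threshold:
            # Open a new run if we are not already inside one.
            if run_start is None:
                run_start = region_start + offset
        elif run_start is not None:
            # Depth dropped back to <= threshold: close the current run.
            runs.append((run_start, region_start + offset))
            run_start = None
    if run_start is not None:
        # The run extends to the end of the fetched region.
        runs.append((run_start, region_start + len(coverage)))
    return runs


# Synthetic depths standing in for a fetched region that starts at 100.
example_runs = overlap_runs([0, 1, 2, 3, 1, 0, 2], region_start=100)
```

Each reported run is a stretch where at least two elements overlap; a depth of exactly one means a single element and is skipped.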
If you fetch over a range wider than a single base, then you may need additional operations to check for overlaps among the returned tuples.
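One way to do that check, sketched in plain Python: the function below takes tuples shaped like the bbi.fetch_intervals output shown earlier (chromosome, start, end, then any extra fields) and returns every pair that overlaps, treating coordinates as half-open. The name find_overlaps is hypothetical and not part of pybbi.

```python
def find_overlaps(intervals):
    """Return (earlier, later) pairs of overlapping intervals.

    Each interval is a tuple whose first three fields are
    (chrom, start, end) with half-open coordinates.
    """
    ordered = sorted(intervals, key=lambda iv: (iv[0], iv[1], iv[2]))
    overlaps = []
    active = []  # intervals that may still overlap upcoming ones
    for iv in ordered:
        chrom, start = iv[0], iv[1]
        # Keep only intervals on the same chromosome that end after this start.
        active = [a for a in active if a[0] == chrom and a[2] > start]
        overlaps.extend((a, iv) for a in active)
        active.append(iv)
    return overlaps


# Made-up intervals for illustration; extra fields are carried along untouched.
example = [
    ("chr3", 10, 20, "a"),
    ("chr3", 15, 25, "b"),
    ("chr3", 25, 30, "c"),
    ("chr1", 12, 18, "d"),
]
pairs = find_overlaps(example)
```

Intervals that merely touch (one ends where the next starts) are not reported, which matches half-open BED-style coordinates.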
BigBed files contain genomic intervals, so a single file can contain overlapping entries. BigWig files hold a continuous signal, so you might need to fetch intervals from multiple bigWigs to see if they intersect.
Thanks for the paper reference. For bigBed files, it says that they can contain overlapping intervals. Is there similar information for bigWig files? I could not find it in that paper.
Thanks again.
You can use pybbi to check both bigWig and bigBed files for overlapping intervals. See my answer for more details, or take a look at the documentation. You could use this library to fetch intervals from multiple bigWig files and intersect their tuples.
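A minimal sketch of that intersection step, assuming you already have two lists of (chrom, start, end, value) tuples fetched for the same chromosome from two bigWig files. The function name intersect_intervals is made up for illustration. The two-pointer walk relies on each list being free of internal overlaps, which is my assumption for intervals from a single bigWig.

```python
def intersect_intervals(a, b):
    """Intersect two interval lists from the same chromosome.

    Each item is (chrom, start, end, value) with half-open coordinates.
    Assumes neither list overlaps itself.
    Returns (start, end, item_from_a, item_from_b) for each shared stretch.
    """
    a = sorted(a, key=lambda iv: (iv[1], iv[2]))
    b = sorted(b, key=lambda iv: (iv[1], iv[2]))
    shared = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][1], b[j][1])
        end = min(a[i][2], b[j][2])
        if start < end:
            shared.append((start, end, a[i], b[j]))
        # Advance whichever interval finishes first.
        if a[i][2] <= b[j][2]:
            i += 1
        else:
            j += 1
    return shared


# Made-up values standing in for intervals fetched from two bigWig files.
first = [("chr1", 0, 10, 1.0), ("chr1", 20, 30, 2.0)]
second = [("chr1", 5, 25, 3.0)]
shared = intersect_intervals(first, second)
```

Each result keeps both source tuples, so the two signal values over a shared stretch can be compared directly.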