Using a couple of Kent tools and BEDOPS, convert the bigWig file to sorted BED:
$ bigWigToBedGraph input.bw input.bedgraph
$ awk '{ print $1"\t"$2"\t"$3"\tid-"NR"\t"$4; }' input.bedgraph | sort-bed - > input.bed
Get the chromosomal bounds for your genome build of interest, e.g. hg19, and convert them into sorted BED:
$ fetchChromSizes hg19 > hg19.bounds.unsorted.txt
$ awk '{ print $1"\t0\t"$2; }' hg19.bounds.unsorted.txt | sort-bed - > hg19.bounds.bed
Once you prepare your inputs as BED files, you can use a one-liner to measure signal in sliding windows or bins.
For example, use bedops --chop and --stagger to split the bounds into, say, 1000-base increments staggered every 100 bases: basically a sliding window 1000 bases wide that is positioned every 100 bases. Pipe this to bedmap to map the windows against the signal from the converted bigWig file, taking the mean signal over each window:
$ bedops --chop 1000 --stagger 100 hg19.bounds.bed | bedmap --faster --echo --mean --delim "\t" --skip-unmapped - input.bed > answer.bed
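To get a feel for the windows that feed into bedmap, here is a toy sketch in plain awk that mimics the chop-and-stagger idea on a made-up 1500-base chromosome. The function name toy_windows and the chromosome chrT are hypothetical, and clipping the last windows at the chromosome end is my assumption; bedops itself may treat the edges differently, so use this only as an illustration, not as a replacement for bedops.

```shell
# toy_windows CHOP STAGGER: read three-column BED bounds on stdin and print
# CHOP-wide windows whose start positions are STAGGER bases apart.
# (Hypothetical helper for illustration; not part of BEDOPS.)
toy_windows() {
    awk -v chop="$1" -v stagger="$2" 'BEGIN { OFS = "\t" }
    {
        for (start = $2; start < $3; start += stagger) {
            end = start + chop
            if (end > $3) end = $3   # assumption: clip windows at the bound
            print $1, start, end
        }
    }'
}

# Made-up 1500-base chromosome "chrT": show the first three windows
printf 'chrT\t0\t1500\n' | toy_windows 1000 100 | head -n 3
```

The first three windows start at 0, 100 and 200, so neighbouring windows share 900 bases. That overlap is what makes the resulting means behave like a smoothed track.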
You can put in whatever values you want for --chop and --stagger to decide how fine- or coarse-grained you want the smoothing to be. For instance, a --stagger value of 0 (or leaving out this option) would change the analysis from a sliding window to measuring signal over disjoint bins.
You can use measurements other than --mean. See bedmap --help or take a look at the documentation for a description of all the signal- or score-based operands.
Hey. Thanks to all for this post. iamjli: I was inspired by your advice to try this with deepTools, but I ended up using bigwigCompare because its output can be bigWig, whereas multiBigwigSummary outputs an npz or a tab file. I just used the "mean" operation with the same bigWig file as both my -b1 and -b2, then I used the --binSize option to do the smoothing. I was using ChIP-chip data with info every 50 bp, so I used a bin size of 150 bp to do the smoothing.
If you use bigwigCompare, that would not create a sliding window; instead, it will output the mean for each bin, according to --binSize. I don't think this is quite the same as smoothing...
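To make the difference concrete, here is a small awk sketch with made-up numbers: six probe values spaced 50 bp apart, averaged first in disjoint 150 bp bins and then in a 150 bp window that moves one probe at a time. The values and the function names (toy_signal, binned, sliding) are hypothetical, and this only illustrates the arithmetic, not what bigwigCompare does internally.

```shell
# Made-up signal: one value per 50 bp probe (hypothetical numbers).
toy_signal() { printf '%s\n' 0 0 9 3 0 0; }

# Disjoint bins of 3 probes (150 bp): one mean per bin.
binned() {
    toy_signal | awk '{ s += $1 } NR % 3 == 0 { print s / 3; s = 0 }'
}

# Sliding window of 3 probes (150 bp), advanced one probe (50 bp) at a time.
sliding() {
    toy_signal | awk '{ v[NR] = $1 }
        NR >= 3 { print (v[NR] + v[NR-1] + v[NR-2]) / 3 }'
}

binned    # one value per bin
sliding   # one value per window position
```

With this input the binned means are 3 and 1, while the sliding means are 3, 4, 4 and 1. Binning drops the resolution to one value per 150 bp, whereas the sliding window keeps one value per probe and lets neighbouring values share most of their input.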