In the following paper it is written:
Average ChIP-Seq tag counts were calculated in windows of 50 bp for a region of 5 kb up- and down-stream of the orientated transcription start sites (TSS). Tag counts were normalized globally, as a fold increase over the genome average tag count in a window of 50 bp for the following modifications:...
I am trying to reproduce it. So, they took a window of 50 bp, counted the reads falling into this region, and then divided by 50. But they still plot the coverage per base, not per window, right? So they mapped the average ChIP-Seq counts calculated in each 50 bp region back to each base, didn't they? I am also not quite sure how they normalized the counts.
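One possible reading of "fold increase over the genome average tag count in a window of 50 bp" is sketched below. This is only an assumption about what the authors did: it takes the genome average to be the number of tags a 50 bp window would hold if all mapped tags were spread uniformly over the genome. The function names and the example numbers are hypothetical and not taken from the paper.

```python
def genome_average_per_window(total_tags, genome_size, window=50):
    """Expected tags per window if tags were spread uniformly (assumption)."""
    return total_tags * window / genome_size


def fold_over_genome_average(window_count, total_tags, genome_size, window=50):
    """Tag count of one window as a fold change over the genome-wide average."""
    return window_count / genome_average_per_window(total_tags, genome_size, window)


# Made-up numbers: 10 million tags on a 100 Mb genome gives 5 tags per
# 50 bp window on average, so a window holding 20 tags is 4-fold enriched.
average = genome_average_per_window(10_000_000, 100_000_000)
fold = fold_over_genome_average(20, 10_000_000, 100_000_000)
```

Whether the denominator should use the full genome size, the mappable genome size, or the mean over all observed windows is not stated in the quoted passage, so that choice would need to be checked against the paper's methods.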
If i is the number of bins/windows and the size of each bin/window is 50 bp, then the 5 kb upstream of the TSS will contain only 100 windows. The first window will start 5 kb away from the TSS and end 4950 bp away from the TSS. The next window will start 4950 bp away from the TSS and end 4900 bp away, etc. If they calculated coverage for 100 windows, how could they plot the coverage for each base pair (-5 kb away)?

i is all windows 4950 bases away from a TSS (to choose an arbitrary distance). What's plotted at that position is the average of the ratio (or maybe the ratio of the average value; they probably specify that somewhere).
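To make the per-position averaging concrete, here is a minimal sketch, not the paper's actual pipeline. It assumes tags are reduced to single sorted positions on one chromosome and that each TSS comes with a strand; `window_count`, `tss_profile`, and the `step` parameter are hypothetical names. With `step=1` the windows overlap and there is one value per base; with `step=window` you get the 100 non-overlapping bins per side described in the question.

```python
from bisect import bisect_left


def window_count(sorted_tags, start, end):
    """Number of tag positions p with start <= p < end."""
    return bisect_left(sorted_tags, end) - bisect_left(sorted_tags, start)


def tss_profile(sorted_tags, tss_list, flank=5000, window=50, step=1):
    """Average window tag count at each offset from strand-oriented TSSs.

    tss_list holds (position, strand) pairs, strand being '+' or '-'.
    Each offset is the centre of a window of the given width.
    """
    half = window // 2
    offsets = list(range(-flank, flank + 1, step))
    profile = []
    for off in offsets:
        total = 0
        for pos, strand in tss_list:
            # Flip the offset on the minus strand so upstream stays negative.
            centre = pos + off if strand == '+' else pos - off
            total += window_count(sorted_tags, centre - half, centre + half)
        # Average over all TSSs at this distance.
        profile.append(total / len(tss_list))
    return offsets, profile


# Toy data: four tags, one plus-strand TSS at position 100.
offsets, profile = tss_profile([100, 105, 110, 200], [(100, '+')],
                               flank=10, window=10, step=5)
```

Dividing each value in `profile` by the genome-wide average count per window would give the average of the fold change at each offset.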
I guess my problem was that I assumed they used non-overlapping windows. Thanks