I usually think of a "pileup" file as a file made from a BAM/SAM alignment that lists the coverage at each coordinate. This is what samtools pileup and samtools mpileup make.
For instance, here is the output of samtools pileup:
#chr coord base count
1 9998 n 1
1 9999 n 1
1 10000 n 4
1 10001 t 5
1 10002 a 7
1 10003 a 7
macs2 callpeak produces what it calls a "pileup," for example *_treat_pileup.bdg, that looks like:
1 15098 15104 4.74683
1 15104 15142 5.69620
1 15142 15178 4.74683
1 15178 15188 3.79747
1 15188 15192 4.74683
1 15192 15224 3.79747
1 15224 15245 2.84810
1 15245 15251 1.89873
1 15251 15277 0.94937
1 15277 15303 1.89873
1 15303 15314 2.84810
1 15314 15329 3.79747
1 15329 15335 4.74683
1 15335 15392 3.79747
1 15392 15424 4.74683
1 15424 15450 3.79747
1 15450 15461 2.84810
1 15461 15476 1.89873
This has coordinate ranges of varying widths and fractional numbers.
However, the README, which is basically the only documentation, does not define the output format.
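Since the README does not define the format, here is a minimal sketch of how the intervals could be expanded into one row per base for comparison with a per-coordinate pileup. It assumes the .bdg file follows the usual bedGraph convention (0-based start, exclusive end), which is an assumption on my part and not something MACS2 states; the function name expand_bedgraph is made up for illustration.

```python
from typing import Iterable, Iterator, Tuple


def expand_bedgraph(lines: Iterable[str]) -> Iterator[Tuple[str, int, float]]:
    """Yield (chrom, 1-based position, value) for every base in each interval."""
    for line in lines:
        line = line.strip()
        # Skip blank lines and the header lines a bedGraph file may carry.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end, value = line.split()[:4]
        # ASSUMPTION: bedGraph convention, start is 0-based and end is exclusive,
        # so the covered 1-based positions are start+1 .. end.
        for pos in range(int(start) + 1, int(end) + 1):
            yield chrom, pos, float(value)


# First two intervals from the excerpt above.
sample = [
    "1 15098 15104 4.74683",
    "1 15104 15142 5.69620",
]
per_base = list(expand_bedgraph(sample))
for chrom, pos, value in per_base[:8]:
    print(chrom, pos, value)
```

Under that assumption, the two sample intervals cover 6 and 38 bases, so the expansion yields 44 rows in total.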
But those are not counts under peaks (which I call coverage, such as is calculated with bedtools coverage). The coordinate ranges are just short intervals (0-100 bp). And the 4th column isn't coverage; it's a decimal number that measures _something_, but I don't know what, and it's not documented.
See this post: https://groups.google.com/forum/#!searchin/macs-announcement/callpeak$20pileup%7Csort:relevance/macs-announcement/F4ZQMqhD-N4/gw2-V6l0CQAJ
So the callpeak output should be equivalent to a "pileup", barring the normalization factor. It is probably also normalized per million mapped reads. I am wondering whether you used downsampling (--down-sample)?
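One way to check the "normalization factor" idea against the excerpt above is to divide every value in the 4th column by the smallest non-zero value and see whether the ratios land on whole numbers. This is only a sketch: it assumes the smallest value in the excerpt corresponds to a depth of one, which may not hold for a whole file, and it says nothing about where the factor comes from.

```python
# 4th-column values copied from the *_treat_pileup.bdg excerpt above.
values = [
    4.74683, 5.69620, 4.74683, 3.79747, 4.74683, 3.79747,
    2.84810, 1.89873, 0.94937, 1.89873, 2.84810, 3.79747,
    4.74683, 3.79747, 4.74683, 3.79747, 2.84810, 1.89873,
]

# ASSUMPTION: the smallest non-zero value is the per-read scaling unit.
unit = min(v for v in values if v > 0)

# Ratio of each value to the unit, and its nearest integer.
multiples = [v / unit for v in values]
rounded = [round(m) for m in multiples]

# Largest distance between a ratio and its nearest integer.
max_dev = max(abs(m - r) for m, r in zip(multiples, rounded))

print("unit:", unit)
print("integer multiples:", rounded)
print("max deviation from an integer:", max_dev)
```

For this excerpt the ratios come out within rounding error of the integers 1 through 6, which is consistent with an integer pileup multiplied by a constant factor of about 0.949.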