I am trying to call peaks for a broad histone mark using the MACS2 broad-peak caller (MACS2 bdgbroadcall), which takes the bedGraph file produced by MACS2 as input.
The problem is the output file of MACS2 bdgbroadcall: it is apparently a BED file with 15 columns and no header, so apart from the chromosome and the peak coordinates I don't know what I am looking at.
The 15 columns match the gappedPeak format. This format is used to provide called regions of signal enrichment based on pooled, normalized (interpreted) data, where the regions may be spliced or incorporate gaps in the genomic sequence. It is a BED12+3 format with the following fields:
chrom - Name of the chromosome (or contig, scaffold, etc.).
chromStart - The starting position of the feature in the chromosome or scaffold. The first base in a chromosome is numbered 0.
chromEnd - The ending position of the feature in the chromosome or scaffold. The chromEnd base is not included in the display of the feature. For example, the first 100 bases of a chromosome are defined as chromStart=0, chromEnd=100, and span the bases numbered 0-99.
name - Name given to a region (preferably unique). Use '.' if no name is assigned.
score - Indicates how dark the peak will be displayed in the browser (0-1000). If all scores were '0' when the data were submitted to the DCC, the DCC assigned scores 1-1000 based on signal value. Ideally the average signalValue per base spread is between 100-1000.
strand - +/- to denote strand or orientation (whenever applicable). Use '.' if no orientation is assigned.
thickStart - The starting position at which the feature is drawn thickly. Not used in gappedPeak type, set to 0.
thickEnd - The ending position at which the feature is drawn thickly. Not used in gappedPeak type, set to 0.
itemRgb - An RGB value of the form R,G,B (e.g. 255,0,0). Not used in gappedPeak type, set to 0.
blockCount - The number of blocks (exons) in the BED line.
blockSizes - A comma-separated list of the block sizes. The number of items in this list should correspond to blockCount.
blockStarts - A comma-separated list of block starts. The first value must be 0 and all of the blockStart positions should be calculated relative to chromStart. The number of items in this list should correspond to blockCount.
signalValue - Measurement of overall (usually, average) enrichment for the region.
pValue - Measurement of statistical significance (-log10). Use -1 if no pValue is assigned.
qValue - Measurement of statistical significance using false discovery rate (-log10). Use -1 if no qValue is assigned.
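To make the column order above concrete, here is a minimal Python sketch that splits one 15-column line into named fields and converts the relative block starts into absolute coordinates. The example line is made up for illustration, and the names GappedPeak, parse_gapped_peak_line and absolute_blocks are hypothetical helpers, not part of MACS2. It assumes the file is tab-separated with no header or track line.

```python
from typing import List, NamedTuple, Tuple


class GappedPeak(NamedTuple):
    chrom: str
    chrom_start: int      # 0-based, inclusive
    chrom_end: int        # exclusive
    name: str
    score: int
    strand: str
    block_count: int
    block_sizes: List[int]
    block_starts: List[int]   # relative to chrom_start
    signal_value: float
    p_value: float            # -log10, or -1 if not assigned
    q_value: float            # -log10, or -1 if not assigned


def parse_gapped_peak_line(line: str) -> GappedPeak:
    """Parse one tab-separated BED12+3 (gappedPeak) line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 15:
        raise ValueError(f"expected 15 columns, got {len(fields)}")
    # Columns 7-9 (thickStart, thickEnd, itemRgb) are skipped here because
    # the field list describes them as unused in gappedPeak.
    block_sizes = [int(x) for x in fields[10].split(",") if x]
    block_starts = [int(x) for x in fields[11].split(",") if x]
    block_count = int(fields[9])
    if not (len(block_sizes) == len(block_starts) == block_count):
        raise ValueError("blockSizes/blockStarts do not match blockCount")
    return GappedPeak(
        chrom=fields[0],
        chrom_start=int(fields[1]),
        chrom_end=int(fields[2]),
        name=fields[3],
        score=int(fields[4]),
        strand=fields[5],
        block_count=block_count,
        block_sizes=block_sizes,
        block_starts=block_starts,
        signal_value=float(fields[12]),
        p_value=float(fields[13]),
        q_value=float(fields[14]),
    )


def absolute_blocks(peak: GappedPeak) -> List[Tuple[int, int]]:
    """Return (start, end) genomic intervals, half-open like chromStart/chromEnd."""
    return [
        (peak.chrom_start + start, peak.chrom_start + start + size)
        for start, size in zip(peak.block_starts, peak.block_sizes)
    ]


# Made-up example line, for illustration only.
example_line = "chr1\t1000\t2000\tpeak_1\t250\t.\t0\t0\t0\t2\t100,200\t0,800\t5.5\t12.3\t10.1"
peak = parse_gapped_peak_line(example_line)
print(peak.chrom, peak.chrom_end - peak.chrom_start, absolute_blocks(peak))
```

Because chromEnd is exclusive, the region length is simply chromEnd minus chromStart, and the same holds for each block.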
written 8.9 years ago by dally • updated 5.0 years ago by Ram
The MACS README explains the output columns. Scroll down to the "Output files" section.
NAME_peaks.narrowPeak is a BED6+4 format file which contains the peak locations together with the peak summit, p-value and q-value. You can load it into the UCSC Genome Browser. The definitions of some specific columns are:
5th: integer score for display
7th: fold-change
8th: -log10pvalue
9th: -log10qvalue
10th: relative summit position to peak start
NAME_peaks.broadPeak is in BED6+3 format, which is similar to the narrowPeak file except that it lacks the 10th column annotating peak summits.
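As a rough illustration of the column list above, the following Python sketch reads one narrowPeak (10 columns) or broadPeak (9 columns) line into a dictionary. The column names, the function parse_peak_line and the example lines are assumptions made for this sketch, not names defined by MACS. It also shows how a -log10 value maps back to a plain p-value.

```python
# Hypothetical column names following the order described in the README excerpt.
PEAK_COLUMNS = [
    "chrom", "start", "end", "name", "score", "strand",
    "fold_change", "neg_log10_pvalue", "neg_log10_qvalue",
    "summit_offset",  # narrowPeak only (10th column)
]


def parse_peak_line(line):
    """Parse one tab-separated narrowPeak (10 cols) or broadPeak (9 cols) line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) not in (9, 10):
        raise ValueError(f"expected 9 or 10 columns, got {len(fields)}")
    # zip stops at the shorter input, so broadPeak lines get no summit_offset key.
    record = dict(zip(PEAK_COLUMNS, fields))
    for key in ("start", "end", "score"):
        record[key] = int(record[key])
    for key in ("fold_change", "neg_log10_pvalue", "neg_log10_qvalue"):
        record[key] = float(record[key])
    if "summit_offset" in record:
        record["summit_offset"] = int(record["summit_offset"])
        # The summit is stored relative to the peak start.
        record["summit"] = record["start"] + record["summit_offset"]
    return record


# Made-up example lines, for illustration only.
narrow = parse_peak_line("chr1\t100\t600\tpeak_1\t120\t.\t4.2\t8.0\t6.0\t250")
broad = parse_peak_line("chr1\t100\t600\tpeak_1\t120\t.\t4.2\t8.0\t6.0")

# Convert the -log10 p-value back to a plain p-value.
pvalue = 10 ** (-narrow["neg_log10_pvalue"])
print(narrow["summit"], "summit" in broad, pvalue)
```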
Thanks, I will try it with the broad parameter.