Hello, I'm currently following this paper Myc Regulates Chromatin Decompaction and Nuclear Architecture during B Cell Activation.
It is a ChIP-Seq to generate genome-wide maps of 34 chromatin modifications (17 acetylation marks, and 17 methylation marks) and the histone variant H2AZ. B cells were either naïve (G0) or activated for 24h in the presence of LPS and IL-4.
I'm trying to isolate peaks specific to naïve or activated B cells in each chromatin modifications.
In the publication I found these informations about the data :
BAM files:
50bps reads were sequenced on HiSeq2000/2500 Illumina platform. After quality control, ChIP-Seq reads were aligned agaisnt mm9 mouse reference genome by using alignment software Bowtie (with options <=> report reads that align uniquely to the best stratum and allowing up to 2 mismatches).
BIGWIG files :
Density tracks were generated using custom software based on the samtools library to count the number of reads in 100 bp windows normalized to window size and library size to obtain densities in units of reads per kb region per million mapped reads (rpkm) across the genome.
So, I assume they reduced the noise background using input files to create density tracks which are bigwig files
Bigwig are available under GEO : GSE82144
I will focus on H3K9me3
chromatin modification for this example :
- aB_wt_H3K9me3 : GSM2184251
- rB_wt_H3K9me3 : GSM2184287
Recovering data :
axel -q ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM2184nnn/GSM2184251/suppl/GSM2184251_aB_wt_H3K9me3.bw
axel -q ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM2184nnn/GSM2184287/suppl/GSM2184287_rB_wt_H3K9me3.bw
Visualizing my bigwig under IGV, set data range from 0 to 10 for both files
Seems like, i could grab some peaks by eyes.
As I only got bigiwig files, I throught that MACS2 would be my best option, using bedgraph files
Transform bigwig to bedgraph : http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/
./bigWigToBedGraph GSM2184251_aB_wt_H3K9me3.bw GSM2184251_aB_wt_H3K9me3.bedgraph
./bigWigToBedGraph GSM2184287_rB_wt_H3K9me3.bw GSM2184287_rB_wt_H3K9me3.bedgraph
Install MACS2 :
conda create -n yourenvname python=2.7 anaconda
source activate python2
conda install -c bioconda macs2
Run MACS2 :
macs2 bdgpeakcall -i GSM2184251_aB_wt_H3K9me3.bedgraph -o GSM2184251_aB_wt_H3K9me3_narrowPeaks
INFO @ Fri, 21 Dec 2018 12:09:25: Read and build bedGraph...
INFO @ Fri, 21 Dec 2018 12:09:36: Call peaks from bedGraph...
INFO @ Fri, 21 Dec 2018 12:09:36: Write peaks...
INFO @ Fri, 21 Dec 2018 12:09:36: Done
macs2 bdgpeakcall -i GSM2184287_rB_wt_H3K9me3.bedgraph -o GSM2184287_rB_wt_H3K9me3_narrowPeaks
INFO @ Fri, 21 Dec 2018 12:10:54: Read and build bedGraph...
INFO @ Fri, 21 Dec 2018 12:11:06: Call peaks from bedGraph...
INFO @ Fri, 21 Dec 2018 12:11:07: Write peaks...
INFO @ Fri, 21 Dec 2018 12:11:07: Done
If I open these two files :
more GSM2184251_aB_wt_H3K9me3_narrowPeaks
track type=narrowPeak name="GSM2184251_aB_wt_H3K9me3_narrowPeaks" description="GSM2184251_aB_wt_H3K9me3_narrowPeaks" nextItemButton=on
chr1 3027700 3027900 GSM2184251_aB_wt_H3K9me3_narrowPeaks_narrowPeak1 53 . 0.00000 0.00000 0.00000 100
chr1 3434900 3435200 GSM2184251_aB_wt_H3K9me3_narrowPeaks_narrowPeak2 56 . 0.00000 0.00000 0.00000 150
chr1 3660600 3661100 GSM2184251_aB_wt_H3K9me3_narrowPeaks_narrowPeak3 123 . 0.00000 0.00000 0.00000 50
chr1 3661300 3661500 GSM2184251_aB_wt_H3K9me3_narrowPeaks_narrowPeak4 80 . 0.00000 0.00000 0.00000 150
more GSM2184287_rB_wt_H3K9me3_narrowPeaks
track type=narrowPeak name="GSM2184287_rB_wt_H3K9me3_narrowPeaks" description="GSM2184287_rB_wt_H3K9me3_narrowPeaks" nextItemButton=on
chr1 3010900 3011100 GSM2184287_rB_wt_H3K9me3_narrowPeaks_narrowPeak1 54 . 0.00000 0.00000 0.00000 100
chr1 3084200 3084400 GSM2184287_rB_wt_H3K9me3_narrowPeaks_narrowPeak2 76 . 0.00000 0.00000 0.00000 150
chr1 3175300 3175500 GSM2184287_rB_wt_H3K9me3_narrowPeaks_narrowPeak3 73 . 0.00000 0.00000 0.00000 50
chr1 3530700 3531000 GSM2184287_rB_wt_H3K9me3_narrowPeaks_narrowPeak4 50 . 0.00000 0.00000 0.00000 150
According to the documentation
NAME_peaks.narrowPeak is BED6+4 format file which contains the peak locations together with peak summit, pvalue and qvalue. You can load it to UCSC genome browser. Definition of some specific columns are:
- 5th: integer score for display calculated as int(-10*log10qvalue). Please note that currently this value might be out of the [0-1000] range defined in UCSC Encode narrowPeak format
- 7th: fold-change
- 8th: -log10pvalue
- 9th: -log10qvalue
- 10th: relative summit position to peak start
Why are my fold-change value, pvalue and qvalue set to 0.0 on every record ?
After that I want to filter the specific peaks to G0 and activated B cells
The macs2 bdgdiff should works but specifications required a MACS control lambda bedGraph
file which is only available running macs2 callpeak option --bdg. This last does not accept bedgraph file so I am stuck...
macs2 callpeak accept bed files, so I tried :
macs2 callpeak -t GSM2184251_aB_wt_H3K9me3.bedgraph -c GSM2184287_rB_wt_H3K9me3.bedgraph -f BED --bdg --outdir callpeak
INFO @ Fri, 21 Dec 2018 12:32:49:
# Command line: callpeak -t GSM2184251_aB_wt_H3K9me3.bedgraph -c GSM2184287_rB_wt_H3K9me3.bedgraph -f BED --bdg --outdir callpeak
# ARGUMENTS LIST:
# name = NA
# format = BED
# ChIP-seq file = ['GSM2184251_aB_wt_H3K9me3.bedgraph']
# control file = ['GSM2184287_rB_wt_H3K9me3.bedgraph']
# effective genome size = 2.70e+09
# band width = 300
# model fold = [5, 50]
# qvalue cutoff = 5.00e-02
# The maximum gap between significant sites is assigned as the read length/tag size.
# The minimum length of peaks is assigned as the predicted fragment length "d".
# Larger dataset will be scaled towards smaller dataset.
# Range for calculating regional lambda is: 1000 bps and 10000 bps
# Broad region calling is off
# Paired-End mode is off
INFO @ Fri, 21 Dec 2018 12:32:49: #1 read tag files...
INFO @ Fri, 21 Dec 2018 12:32:49: #1 read treatment tags...
INFO @ Fri, 21 Dec 2018 12:32:50: 1000000
INFO @ Fri, 21 Dec 2018 12:32:52: 2000000
INFO @ Fri, 21 Dec 2018 12:32:53: 3000000
INFO @ Fri, 21 Dec 2018 12:32:55: 4000000
INFO @ Fri, 21 Dec 2018 12:32:56: 5000000
INFO @ Fri, 21 Dec 2018 12:32:58: 6000000
INFO @ Fri, 21 Dec 2018 12:32:59: 7000000
INFO @ Fri, 21 Dec 2018 12:33:01: 8000000
INFO @ Fri, 21 Dec 2018 12:33:02: #1.2 read input tags...
INFO @ Fri, 21 Dec 2018 12:33:04: 1000000
INFO @ Fri, 21 Dec 2018 12:33:05: 2000000
INFO @ Fri, 21 Dec 2018 12:33:07: 3000000
INFO @ Fri, 21 Dec 2018 12:33:08: 4000000
INFO @ Fri, 21 Dec 2018 12:33:10: 5000000
INFO @ Fri, 21 Dec 2018 12:33:11: 6000000
INFO @ Fri, 21 Dec 2018 12:33:13: 7000000
INFO @ Fri, 21 Dec 2018 12:33:14: 8000000
INFO @ Fri, 21 Dec 2018 12:33:15: #1 tag size is determined as 120 bps
INFO @ Fri, 21 Dec 2018 12:33:15: #1 tag size = 120.0
INFO @ Fri, 21 Dec 2018 12:33:15: #1 total tags in treatment: 8936534
INFO @ Fri, 21 Dec 2018 12:33:15: #1 user defined the maximum tags...
INFO @ Fri, 21 Dec 2018 12:33:15: #1 filter out redundant tags at the same location and the same strand by allowing at most 1 tag(s)
INFO @ Fri, 21 Dec 2018 12:33:16: #1 tags after filtering in treatment: 8936534
INFO @ Fri, 21 Dec 2018 12:33:16: #1 Redundant rate of treatment: 0.00
INFO @ Fri, 21 Dec 2018 12:33:16: #1 total tags in control: 8836601
INFO @ Fri, 21 Dec 2018 12:33:16: #1 user defined the maximum tags...
INFO @ Fri, 21 Dec 2018 12:33:16: #1 filter out redundant tags at the same location and the same strand by allowing at most 1 tag(s)
INFO @ Fri, 21 Dec 2018 12:33:16: #1 tags after filtering in control: 8836601
INFO @ Fri, 21 Dec 2018 12:33:16: #1 Redundant rate of control: 0.00
INFO @ Fri, 21 Dec 2018 12:33:16: #1 finished!
INFO @ Fri, 21 Dec 2018 12:33:16: #2 Build Peak Model...
INFO @ Fri, 21 Dec 2018 12:33:16: #2 looking for paired plus/minus strand peaks...
INFO @ Fri, 21 Dec 2018 12:33:19: #2 number of paired peaks: 0
WARNING @ Fri, 21 Dec 2018 12:33:19: Too few paired peaks (0) so I can not build the model! Broader your MFOLD range parameter may erase this error. If it still can't build the model, we suggest to use --nomodel and --extsize 147 or other fixed number instead.
WARNING @ Fri, 21 Dec 2018 12:33:19: Process for pairing-model is terminated!
Too few paired peaks (0)
I also tried --nomodel and --extsize 147
and I got the same result
I'm running in circles...
Thanks for your help
Hi! It's been a while but I'm stuck at the exact same point; did you find a solution after this?