Question

peak calling in CUT&RUN(Chipseq) Analysis

0

Entering edit mode

2.4 years ago

minoo ▴ 10

I have a couple of CUT&RUN samples with control and treatment conditions. This is my first time analyzing this type of data, so I need a little bit of help. I need to compare the peaks between control and treatment. To this end:

I need to compare the number of peaks in control and treatment samples, which I have no idea how to compute.

Then I think I need to apply peak calling using 'seacr', but I don't know how should be the input and how the output will look like, especially in case that I need to compare between samples. According to the help page they should be something like below:

bedtools bamtobed -bedpe -i $sample.bam > $sample.bed
awk '$1==$4 && $6-$2 < 1000 {print $0}' $sample.bed > $sample.clean.bed
cut -f 1,2,6 $sample.clean.bed | sort -k1,1 -k2,2n -k3,3n > $sample.fragments.bed
bedtools genomecov -bg -i $sample.fragments.bed -g my.genome > $sample.fragments.bedgraph

<output prefix>.stringent.bed OR <output prefix>.relaxed.bed (BED file of enriched regions)

bash SEACR_1.3.sh target.bedgraph IgG.bedgraph norm stringent output

I have no idea what should be exactly the input to compare control and treatment samples and how to use the output to draw heatmap plots. So I tried to skip this step and do as follows but I am going on the wrong direction and I need desperate help to fix this:

I used bamCompare function from deeptools to compare the control and treatment bam files and convert the result to a single bigwig file:

bamCompare -b1 treatment_sorted.bam -b2 control_sorted.bam -o control.vs.treatment.bw --scaleFactorsMethod None --normalizeUsing RPKM --blackListFileName ENCFF356LFX.bed.gz

Then I used the bamCompare output, to draw heatmap using control and treatment bed files using computeMatrix and PlotHeatmap:

computeMatrix scale-regions -S control.vs.treatment.bw -R control_sorted.bed treatment_sorted.bed -b 1000 --blackListFileName ENCFF356LFX.bed.gz --outFileName MAt.mat.gz

plotHeatmap -m MAt.mat.gz -o plotHeatmap.MAt.mat.gz.png --colorList blue,darkred

However I have no idea how to perform peak calling and where to use it in my situation. Any idea would be appreciated.

cutandrun chipseq SEACR cutandtag deeptools • 1.0k views

ADD COMMENT • link updated 2.4 years ago by jv ★ 1.8k • written 2.4 years ago by minoo ▴ 10

0

Entering edit mode

from SEACR github README.md

"it requires bedgraphs from paired-end sequencing as input, which can be generated from read pair BED files (i.e. BED coordinates reflecting the 5' and 3' termini of each read pair) using bedtools genomecov with the "-bg" flag, or alternatively from name-sorted paired-end BAM files as described in "Preparing input bedgraph files" below."

Your first code chunk example is detailing how to do this.

But before you can do this step you need to align your reads to your reference genome to generate sample.bam.

So your workflow would be something like: fastq QC --> align reads --> convert BAM to bedgraph --> call peaks with SEACR

ADD REPLY • link 2.4 years ago by jv ★ 1.8k