Hi, All:
I'm now trying to use MACS2 to call peaks for ChIP-seq data, but most of the information available for this software is based on MACS 1.4 or lower.
So I ran a small test. It runs very fast, and I have enough time and storage to experiment. My question is: are there any criteria for choosing proper parameters for MACS2?
ARGUMENTS LIST:
name = Xu_MUT_rep1
format = SAM
ChIP-seq file = ['SRR1042593.sam']
control file = ['SRR1042594.sam']
effective genome size = 2.70e+09
band width = 300
model fold = [5, 50]
qvalue cutoff = 5.00e-02
Larger dataset will be scaled towards smaller dataset.
Range for calculating regional lambda is: 1000 bps and 10000 bps
Broad region calling is off
Paired-End mode is off
With the commands below I got fewer than 100 peaks, and I don't know what's wrong with the parameters.
nohup time ~/.local/bin/macs2 callpeak -c SRR1042594.sorted.bam -t SRR1042593.sorted.bam -f BAM -B -g hs -n Xu_MUT_rep1 2>Xu_MUT_rep1.masc2.log &
nohup time ~/.local/bin/macs2 callpeak -c SRR1042596.sorted.bam -t SRR1042595.sorted.bam -f BAM -B -g hs -n Xu_MUT_rep2 2>Xu_MUT_rep2.masc2.log &
nohup time ~/.local/bin/macs2 callpeak -c SRR1042598.sorted.bam -t SRR1042597.sorted.bam -f BAM -B -g hs -n Xu_WT_rep1 2>Xu_WT_rep1.masc2.log &
nohup time ~/.local/bin/macs2 callpeak -c SRR1042600.sorted.bam -t SRR1042599.sorted.bam -f BAM -B -g hs -n Xu_WT_rep2 2>Xu_WT_rep2.masc2.log &
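The four commands above differ only in the accession numbers and the sample name, so they could be generated from a small sample sheet. This is only a sketch: the function name print_callpeak and the inline sample sheet are made up for illustration, and the flags are copied unchanged from the commands above. It prints the commands rather than running them.

```shell
# print_callpeak TREATMENT CONTROL NAME
# Hypothetical helper: prints one macs2 callpeak command, does not run it.
print_callpeak() {
  printf 'macs2 callpeak -t %s.sorted.bam -c %s.sorted.bam -f BAM -B -g hs -n %s\n' \
    "$1" "$2" "$3"
}

# Sample sheet columns: treatment accession, control accession, sample name.
while read -r treatment control name; do
  print_callpeak "$treatment" "$control" "$name"
done <<'EOF'
SRR1042593 SRR1042594 Xu_MUT_rep1
SRR1042595 SRR1042596 Xu_MUT_rep2
SRR1042597 SRR1042598 Xu_WT_rep1
SRR1042599 SRR1042600 Xu_WT_rep2
EOF
```

Piping the printed lines to sh would run the four jobs one after another.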
Should I apply some filtering to the alignment files?
## cat >run_bowtie2.sh
ls *.fastq | while read id ;
do
echo $id
~/biosoft/bowtie/bowtie2-2.2.9/bowtie2 -p 8 -x ~/biosoft/bowtie/hg19_index/hg19 -U $id -S ${id%%.*}.sam 2>${id%%.*}.align.log;
samtools view -bhS -q 30 ${id%%.*}.sam > ${id%%.*}.bam
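samtools sort ${id%%.*}.bam ${id%%.*}.sorted ## prefix for the output (legacy samtools syntax); writes ${id%%.*}.sorted.bam, the name used by the index step and by macs2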
samtools index ${id%%.*}.sorted.bam
done
I don't know whether I should filter out alignments with a mapping quality below 30; I just did it (the -q 30 in the script above).
And I don't know whether I should remove PCR duplicate reads or not!
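One way to judge whether duplicates matter is to estimate how many reads share the same position before deciding. The sketch below is a rough illustration, not a replacement for a dedicated duplicate-marking tool: dup_rate is a hypothetical function name, it assumes single-end reads in a plain-text SAM file, and it treats reads with the same chromosome, leftmost position (POS) and strand as duplicates, which is only an approximation when read lengths differ.

```shell
# dup_rate SAMFILE
# Prints: <mapped reads> <reads at an already-seen position> <fraction>
dup_rate() {
  awk -F'\t' '
    /^@/ { next }                      # skip SAM header lines
    int($2 / 4) % 2 == 1 { next }      # skip unmapped reads (flag bit 0x4)
    {
      # flag bit 0x10 set means the read maps to the reverse strand
      strand = (int($2 / 16) % 2) ? "-" : "+"
      key = $3 SUBSEP $4 SUBSEP strand # chromosome, leftmost position, strand
      total++
      if (seen[key]++) dup++           # count every read after the first at a position
    }
    END {
      if (total) printf "%d %d %.3f\n", total, dup, dup / total
      else print "0 0 0.000"
    }
  ' "$1"
}
```

For example, dup_rate SRR1042593.sam would report the counts for the first ChIP sample. As far as I understand, macs2 callpeak by default already keeps at most one read per position and strand (its --keep-dup option), so a high fraction here mostly indicates low library complexity.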
Then I changed the p-value cutoff, like this:
After that I get more than 10,000 peaks. I know that cutoff is too loose, but I just don't know how to deal with it.
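Rather than re-running MACS2 for every cutoff, one option is to call peaks once with a loose threshold and then count how many survive at stricter q-values. The sketch below assumes the standard narrowPeak layout, where column 9 holds -log10(q-value); count_peaks is a hypothetical helper name, not part of MACS2.

```shell
# count_peaks NARROWPEAK_FILE QVALUE
# Prints the number of peaks whose q-value is at or below QVALUE.
count_peaks() {
  awk -F'\t' -v q="$2" '
    BEGIN { thr = -log(q) / log(10) }  # convert q to the -log10 scale of column 9
    $9 >= thr { n++ }                  # larger -log10(q) means a smaller q-value
    END { print n + 0 }                # n + 0 prints 0 when no peak passes
  ' "$1"
}
```

Running count_peaks Xu_MUT_rep1_peaks.narrowPeak 0.05 and then again with 0.01 shows how quickly the peak count drops as the cutoff tightens, which may help in picking a threshold between the two extremes above.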