I followed the pipeline of ChIP-seq analysis for the paper in Nature Protocol.
In this pipeline, it create density file by extend reads, calculate read density at each position and normalize the library size to 1 million reads.
Here is the source code.
#!/bin/bash
for sample in 1_Input 3_IP;
do
EXTEND=150
#Number of reads
librarySize=$(samtools idxstats ${sample}.bam | awk '{total+=$3}END{print total}')
#Create density file:extend reads, calculate read density at each position and normalize the library size to 1 million reads
bamToBed -i ${sample}.bam | awk -v CHROM="mm10.chrom.sizes" -v EXTEND=$EXTEND -v OFS='\t' 'BEGIN {while(getline>CHROM){chromSize[$1]=$2}}{chrom=$1;start=$2;end=$3;strand=$6;if(strand=="+"){end=start+EXTEND;if(end>chromSize[chrom]){end=chromSize[chrom]}};if(strand=="-"){start=end-EXTEND;if(start>1){start=1}};print chrom,start,end}'|sort -k1,1 -k2,2n | genomeCoverageBed -i stdin -g mm10.chrom.sizes -d | awk -v OFS='\t' -v SIZE=$librarySize '{print $1,$2,$2+1,$3*1000000/SIZE}' | gzip > ${sample}.density.gz
done;
Then I got an error:
It looks as though you have less than 3 columns at line: 625219. Are you sure your files are tab-delimited?
Any ideas on how to fix it?
I also tried
genomeCoverageBed -ibam stdin -g mm10.chrom.sizes -d
It still does not work.
Thanks….