#

Question

Error for read density visualization

0

Entering edit mode

9.2 years ago

jolin0701-dy ▴ 100

I followed the pipeline of ChIP-seq analysis for the paper in Nature Protocol.

In this pipeline, it create density file by extend reads, calculate read density at each position and normalize the library size to 1 million reads.

Here is the source code.

#!/bin/bash                                                                                                                                                              

for sample in 1_Input 3_IP;

do

EXTEND=150

#Number of reads                                                                                                                                                         
librarySize=$(samtools idxstats ${sample}.bam | awk '{total+=$3}END{print total}')

#Create density file:extend reads, calculate read density at each position and normalize the library size to 1 million reads                                             
bamToBed -i ${sample}.bam | awk -v CHROM="mm10.chrom.sizes" -v EXTEND=$EXTEND -v OFS='\t' 'BEGIN {while(getline>CHROM){chromSize[$1]=$2}}{chrom=$1;start=$2;end=$3;strand=$6;if(strand=="+"){end=start+EXTEND;if(end>chromSize[chrom]){end=chromSize[chrom]}};if(strand=="-"){start=end-EXTEND;if(start>1){start=1}};print chrom,start,end}'|sort -k1,1 -k2,2n | genomeCoverageBed -i stdin -g mm10.chrom.sizes -d | awk -v OFS='\t' -v SIZE=$librarySize '{print $1,$2,$2+1,$3*1000000/SIZE}' | gzip > ${sample}.density.gz

done;

Then I got an error:

It looks as though you have less than 3 columns at line: 625219.  Are you sure your files are tab-delimited?

Any ideas on how to fix it?

I also tried

genomeCoverageBed -ibam stdin -g mm10.chrom.sizes -d

It still does not work.

Thanks….

ChIP-Seq • 2.4k views

ADD COMMENT • link updated 7.8 years ago by imtheman69 • 0 • written 9.2 years ago by jolin0701-dy ▴ 100

score 1 · Answer 1 · 2016-05-02

Did you look at line 625219 of your file? You can try something like:

cat -b <file> | head -625219 | tail

However, I would suggest you not reinvent the wheel. We have spend quite some time polishing tools that do exactly that. deepTools can extend reads and normalize using different methods. See the bamCoverage and bamCompare tools.

score 0 · Answer 2 · 2017-10-02

0

Entering edit mode

7.8 years ago

imtheman69 • 0

Hi Jolin, Did you mange to sort this out? I would like to use this pipeline, and I get the same problem. deepTools is great, but I am doing this as a coding exercise.

Cheers imtheman

ADD COMMENT • link 7.8 years ago by imtheman69 • 0

score 0 · Answer 3 · 2017-10-05

bamToBed -i ${sample}.bam | awk -v CHROM="mm10.chrom.sizes" -v EXTEND=$EXTEND -v OFS='\t' 'BEGIN {while(getline>CHROM){chromSize[$1]=$2}}{chrom=$1;start=$2;end=$3;strand=$6;if(strand=="+"){end=start+EXTEND;if(end>chromSize[chrom]){end=chromSize[chrom]}};if(strand=="-"){start=end-EXTEND;if(start>1){start=1}};print chrom,start,end}'|sort -k1,1 -k2,2n | genomeCoverageBed -i stdin -g mm10.chrom.sizes -d | awk -v OFS='\t' -v SIZE=$librarySize '{print $1,$2,$2+1,$3*1000000/SIZE}' | gzip > ${sample}.density.gz

#

Change from getline greater than to getline less than

#

Change from if(start> to if(start<

It worked fine after that.