Error for read density visualization
3
0
Entering edit mode
8.6 years ago
jolin0701-dy ▴ 100

I followed the pipeline of ChIP-seq analysis for the paper in Nature Protocol.

In this pipeline, it create density file by extend reads, calculate read density at each position and normalize the library size to 1 million reads.

Here is the source code.

#!/bin/bash                                                                                                                                                              

for sample in 1_Input 3_IP;

do

EXTEND=150

#Number of reads                                                                                                                                                         
librarySize=$(samtools idxstats ${sample}.bam | awk '{total+=$3}END{print total}')

#Create density file:extend reads, calculate read density at each position and normalize the library size to 1 million reads                                             
bamToBed -i ${sample}.bam | awk -v CHROM="mm10.chrom.sizes" -v EXTEND=$EXTEND -v OFS='\t' 'BEGIN {while(getline>CHROM){chromSize[$1]=$2}}{chrom=$1;start=$2;end=$3;strand=$6;if(strand=="+"){end=start+EXTEND;if(end>chromSize[chrom]){end=chromSize[chrom]}};if(strand=="-"){start=end-EXTEND;if(start>1){start=1}};print chrom,start,end}'|sort -k1,1 -k2,2n | genomeCoverageBed -i stdin -g mm10.chrom.sizes -d | awk -v OFS='\t' -v SIZE=$librarySize '{print $1,$2,$2+1,$3*1000000/SIZE}' | gzip > ${sample}.density.gz

done;

Then I got an error:

It looks as though you have less than 3 columns at line: 625219.  Are you sure your files are tab-delimited?

Any ideas on how to fix it?

I also tried

genomeCoverageBed -ibam stdin -g mm10.chrom.sizes -d

It still does not work.

Thanks….

ChIP-Seq • 2.2k views
ADD COMMENT
1
Entering edit mode
8.6 years ago
Fidel ★ 2.0k

Did you look at line 625219 of your file? You can try something like:

cat -b <file> | head -625219 | tail

However, I would suggest you not reinvent the wheel. We have spend quite some time polishing tools that do exactly that. deepTools can extend reads and normalize using different methods. See the bamCoverage and bamCompare tools.

ADD COMMENT
0
Entering edit mode
7.2 years ago
imtheman69 • 0

Hi Jolin, Did you mange to sort this out? I would like to use this pipeline, and I get the same problem. deepTools is great, but I am doing this as a coding exercise.

Cheers imtheman

ADD COMMENT
0
Entering edit mode
7.2 years ago
imtheman69 • 0

bamToBed -i ${sample}.bam | awk -v CHROM="mm10.chrom.sizes" -v EXTEND=$EXTEND -v OFS='\t' 'BEGIN {while(getline>CHROM){chromSize[$1]=$2}}{chrom=$1;start=$2;end=$3;strand=$6;if(strand=="+"){end=start+EXTEND;if(end>chromSize[chrom]){end=chromSize[chrom]}};if(strand=="-"){start=end-EXTEND;if(start>1){start=1}};print chrom,start,end}'|sort -k1,1 -k2,2n | genomeCoverageBed -i stdin -g mm10.chrom.sizes -d | awk -v OFS='\t' -v SIZE=$librarySize '{print $1,$2,$2+1,$3*1000000/SIZE}' | gzip > ${sample}.density.gz

#

Change from getline greater than to getline less than

#

Change from if(start> to if(start<

It worked fine after that.

ADD COMMENT

Login before adding your answer.

Traffic: 2382 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6