Question

How to keep column 6 (normalized tag count) of a HOMER peaks.txt file after converting it with pos2bed.pl?

1

7.2 years ago

Ming Lu ▴ 30

I use HOMER to call peaks, which gives me a peaks.txt file. Then I use pos2bed.pl to convert peaks.txt to peaks.bed. However, column 6, which holds the normalized tag count (equivalent to RPKM, reflecting peak density), is lost in the conversion.

ChIP-Seq • 3.3k views

ADD COMMENT • link updated 7.2 years ago by Prakash ★ 2.2k • written 7.2 years ago by Ming Lu ▴ 30

score 4 · Answer 1 · 2017-10-20

4

7.2 years ago

Prakash ★ 2.2k

A simple combination of grep, cut, and awk can do the job:

grep -v "#" peak.txt |cut -f 1,2,3,4,6 | awk '{print $2"\t"$3"\t"$4"\t"$1"\t"$5}' >peak.bed

ADD COMMENT • link 7.2 years ago by Prakash ★ 2.2k

0

Thank you. I used the code below, and column 6 is now kept after bedtools intersect:

cut -f 1,2,3,4,6 peaks.txt | awk '{print $2"\t"$3"\t"$4"\t"$1"\t"$5}' >peak1.bed
pos2bed.pl peaks.txt > peak2.bed
awk 'NR==FNR {h[$4] = $5; next} {print $1"\t"$2"\t"$3"\t"$4"\t"$5"\t"$6"\t"h[$4]}' peak1.bed peak2.bed >peaks.bed
chr11   117467921   117468098   chr11-2 1   +   86.8
chr17   39636555    39636732    chr17-2 1   +   85.6
chr2    231281278   231281455   chr2-2  1   +   83.3
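
The third command above is a two-file lookup: while awk reads the first file (NR==FNR), it stores the tag count keyed by peak ID, and for each line of the second file it appends the stored value. A minimal sketch of that idiom follows; the file names a.bed and b.bed and their rows are made up for illustration and are not real HOMER or pos2bed.pl output.

```shell
dir=$(mktemp -d)

# Stand-in for peak1.bed: chr, start, end, peak ID, normalized tag count.
printf 'chr1\t100\t200\tchr1-1\t50.0\nchr2\t300\t400\tchr2-1\t40.0\n' > "$dir/a.bed"

# Stand-in for peak2.bed: six BED columns, deliberately in a different row order.
printf 'chr2\t300\t400\tchr2-1\t1\t+\nchr1\t100\t200\tchr1-1\t1\t+\n' > "$dir/b.bed"

joined=$(awk 'BEGIN { OFS = "\t" }
    NR == FNR { count[$4] = $5; next }   # first file: remember count by peak ID
    { print $0, count[$4] }              # second file: append the stored count
' "$dir/a.bed" "$dir/b.bed")

printf '%s\n' "$joined"
rm -rf "$dir"
```

Because the lookup is keyed on the peak ID in column 4, the two files do not need to be in the same row order.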

But I still have one question: does "#" stand for any pattern I can supply? I did not use it.

ADD REPLY • link 7.2 years ago by Ming Lu ▴ 30

1

But I still have one question: does "#" stand for any pattern I can supply? I did not use it.

Yes, within the double quotes you can use any pattern. In this case, the comment lines in the peak file (the ones containing "#") are not needed, so grep -v "#" is used to filter them out.
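
One caveat worth checking: an unanchored "#" removes every line that contains the character anywhere, while "^#" removes only lines that begin with it. A minimal sketch of the difference, using made-up rows (the peak ID "peak#2" is invented purely to show the effect):

```shell
# Made-up, HOMER-like input: two comment lines and two peaks.
# The second peak ID contains '#' on purpose.
tmp=$(mktemp)
printf '# HOMER Peaks\n#PeakID\tchr\tstart\tend\tstrand\tNormalized Tag Count\nchr1-1\tchr1\t100\t200\t+\t50.0\npeak#2\tchr2\t300\t400\t+\t40.0\n' > "$tmp"

# Unanchored pattern: count the lines left after dropping any line containing '#'.
loose=$(grep -c -v "#" "$tmp")

# Anchored pattern: count the lines left after dropping only lines starting with '#'.
strict=$(grep -c -v "^#" "$tmp")

echo "loose=$loose strict=$strict"   # prints: loose=1 strict=2
rm -f "$tmp"
```

If peak IDs never contain "#", both patterns give the same result.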

ADD REPLY • link 7.2 years ago by Prakash ★ 2.2k

0

Why do we have to remove the lines with #? They did not affect the intersect operation or its result. Even with HOMER's pos2bed.pl (.txt > .bed), the new .bed file keeps the lines with #.

ADD REPLY • link 7.1 years ago by Ming Lu ▴ 30

1

You can shorten the code further:

cut -f 1-6 peaks.txt | awk '{print $2,$3,$4,$1,$5}' OFS="\t"

cut accepts a field range, and awk can apply a single output delimiter (OFS) to all columns. IMO, that much code is not necessary. Please try the following:

OP:

cut -f 1,2,3,4,6 peaks.txt | awk '{print $2"\t"$3"\t"$4"\t"$1"\t"$5}' >peak1.bed

New code if you have lines with #:

grep -v "#" peak.txt |cut -f 2-4,1,6 > peak1.bed

New code if you do not have lines with #:

cut -f 2-4,1,6 peak.txt > peak1.bed

ADD REPLY • link 7.2 years ago by cpad0112 21k

1

grep -v "#" peak.txt |cut -f 2-4,1,6 > peak1.bed

Actually, with this code the column order will not change, because cut always emits fields in their original order no matter how they are listed. So yes, the shorter code you mentioned below will serve the purpose.

cut -f 1-6 peaks.txt | awk '{print $2,$3,$4,$1,$5}' OFS="\t"
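
The point about cut is easy to confirm on a toy row. This is only a sketch: the field values f1 to f6 are placeholders, not HOMER columns.

```shell
# One made-up, tab-separated row with six fields (f1..f6).
row=$(printf 'f1\tf2\tf3\tf4\tf5\tf6')

# cut ignores the order of the field list and emits fields in input order.
cut_out=$(printf '%s\n' "$row" | cut -f 2-4,1,6)

# awk prints fields in whatever order the print statement names them.
awk_out=$(printf '%s\n' "$row" | awk 'BEGIN { OFS = "\t" } { print $2, $3, $4, $1, $6 }')

printf 'cut: %s\nawk: %s\n' "$cut_out" "$awk_out"
```

The cut line comes out as f1 f2 f3 f4 f6, while the awk line comes out as f2 f3 f4 f1 f6, which is why the reordering has to happen in awk.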

ADD REPLY • link 7.2 years ago by Prakash ★ 2.2k

1

Oops... I didn't notice that the 5th column had been left out.

$ cut -f 1-6 peaks.txt | awk '{print $2,$3,$4,$1,$5}' OFS="\t"

should be

$ cut -f 1-4,6 peaks.txt | awk '{print $2,$3,$4,$1,$5}' OFS="\t"

ADD REPLY • link 7.2 years ago by cpad0112 21k