How to keep column 6 (normalized tag count) of the peaks.txt file produced by HOMER after pos2bed.pl conversion?
7.2 years ago
Ming Lu ▴ 30

I use HOMER to call peaks, which gives me a peaks.txt file. I then use pos2bed.pl to convert peaks.txt to peaks.bed. However, column 6, which holds the normalized tag count (equivalent to RPKM, reflecting peak density), is lost in the conversion.

ChIP-Seq • 3.3k views
7.2 years ago
Prakash ★ 2.2k

A simple combination of grep, cut, and awk can do the job.

grep -v "#" peak.txt |cut -f 1,2,3,4,6 | awk '{print $2"\t"$3"\t"$4"\t"$1"\t"$5}' >peak.bed
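As a rough illustration of what this pipeline does, here is a sketch on made-up data. The file name sample_peaks.txt and its contents are hypothetical; the column layout (PeakID, chr, start, end, strand, normalized tag count) is assumed from the discussion in this thread.

```shell
# Hypothetical peak file: one '#' header line and one peak.
# Assumed columns: PeakID, chr, start, end, strand, normalized tag count.
printf '#PeakID\tchr\tstart\tend\tstrand\tNormalized Tag Count\n' > sample_peaks.txt
printf 'peak1\tchr1\t100\t200\t+\t42.0\n' >> sample_peaks.txt

# Drop lines containing '#', keep columns 1-4 and 6,
# then reorder to chr, start, end, PeakID, normalized tag count.
bed5=$(grep -v "#" sample_peaks.txt | cut -f 1,2,3,4,6 | awk '{print $2"\t"$3"\t"$4"\t"$1"\t"$5}')

printf '%s\n' "$bed5"
```

After cut, the normalized tag count sits in position 5, which is why awk prints $5 rather than $6.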


Thank you. I used the code below, and column 6 is now kept after bedtools intersect:

cut -f 1,2,3,4,6 peaks.txt | awk '{print $2"\t"$3"\t"$4"\t"$1"\t"$5}' >peak1.bed
pos2bed.pl peaks.txt > peak2.bed
awk 'NR==FNR {h[$4] = $5; next} {print $1"\t"$2"\t"$3"\t"$4"\t"$5"\t"$6"\t"h[$4]}' peak1.bed peak2.bed >peaks.bed
chr11   117467921   117468098   chr11-2 1   +   86.8
chr17   39636555    39636732    chr17-2 1   +   85.6
chr2    231281278   231281455   chr2-2  1   +   83.3
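The awk command above works as a two-file lookup: while the first file is being read (NR==FNR), it stores column 5 under the peak ID from column 4, and for the second file it appends the stored value to each line. A minimal sketch on made-up data, where the file names a.bed and b.bed and all values are hypothetical:

```shell
# Hypothetical inputs: a.bed carries the value to keep in column 5,
# b.bed is the six-column BED file that should receive it.
printf 'chr1\t100\t200\tpeakA\t12.5\n' > a.bed
printf 'chr1\t100\t200\tpeakA\t1\t+\n' > b.bed

# NR==FNR is true only while the first file is being read:
# store column 5 under the peak ID, then skip to the next line.
# For the second file, print its six columns plus the stored value.
joined=$(awk 'BEGIN {OFS="\t"} NR==FNR {h[$4] = $5; next} {print $1,$2,$3,$4,$5,$6,h[$4]}' a.bed b.bed)

printf '%s\n' "$joined"
```

A peak ID that is missing from the first file would simply get an empty seventh column, so it may be worth checking for that in real data.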

One remaining question: does "#" stand for any pattern I can supply? I did not use it.


"One remaining question: does "#" stand for any pattern I can supply? I did not use it."

Yes, you can put any pattern inside the double quotes. In this case, the comment lines in the peak file (the ones containing "#") are not needed, so grep -v "#" is used to filter them out.
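One detail that may be worth knowing: an unanchored pattern matches "#" anywhere in a line, while "^#" matches only lines that begin with it. The sketch below uses made-up data; the file name demo_peaks.txt and the peak ID containing "#" are hypothetical and only serve to show the difference.

```shell
# Hypothetical sample: a header line, a normal peak,
# and a peak whose ID happens to contain '#'.
printf '#PeakID\tchr\tstart\tend\n' > demo_peaks.txt
printf 'chr1-1\tchr1\t100\t200\n' >> demo_peaks.txt
printf 'peak#2\tchr2\t300\t400\n' >> demo_peaks.txt

# -v inverts the match, -c counts the lines that remain.
# Unanchored: every line containing '#' is dropped.
unanchored=$(grep -vc "#" demo_peaks.txt)
# Anchored: only lines starting with '#' are dropped.
anchored=$(grep -vc "^#" demo_peaks.txt)

printf 'kept with "#": %s, kept with "^#": %s\n' "$unanchored" "$anchored"
```

If peak IDs never contain "#", both forms give the same result.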


Why do we have to remove the lines with #? They did not affect the intersect operation or its result. Even with HOMER's pos2bed.pl (.txt > .bed), the new .bed file keeps the lines with #.


You can shorten the code further:

cut -f 1-6 peaks.txt | awk '{print $2,$3,$4,$1,$5}' OFS="\t"

cut accepts field ranges, and awk can apply one output delimiter (OFS) to all columns. IMO, that much code is not necessary. Please try the following:

OP:

cut -f 1,2,3,4,6 peaks.txt | awk '{print $2"\t"$3"\t"$4"\t"$1"\t"$5}' >peak1.bed

New code if you have lines with #:

grep -v "#" peak.txt |cut -f 2-4,1,6 > peak1.bed

New code if you do not have lines with #:

cut -f 2-4,1,6 peak.txt > peak1.bed

grep -v "#" peak.txt |cut -f 2-4,1,6 > peak1.bed

Actually, with this code the column order will not change, because cut always prints fields in their original order regardless of the order given to -f. So yes, the shorter code you mentioned, shown below, will serve the purpose.
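A quick way to see this behaviour is to run cut and awk on the same made-up line (the field names here are placeholders, not real peak data):

```shell
# Hypothetical one-line, tab-separated sample with six fields.
line=$(printf 'id\tchr\tstart\tend\tstrand\tscore')

# cut ignores the order given to -f and prints fields in input order.
cut_out=$(printf '%s\n' "$line" | cut -f 2-4,1,6)

# awk prints fields in whatever order they are listed.
awk_out=$(printf '%s\n' "$line" | awk -F'\t' -v OFS='\t' '{print $2,$3,$4,$1,$6}')

printf 'cut: %s\nawk: %s\n' "$cut_out" "$awk_out"
```

So cut is enough for selecting columns, but reordering them needs awk or a similar tool.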

cut -f 1-6 peaks.txt | awk '{print $2,$3,$4,$1,$5}' OFS="\t"

Oops... I didn't notice that the 5th column was left out.

$ cut -f 1-6 peaks.txt | awk '{print $2,$3,$4,$1,$5}' OFS="\t"

should be

$ cut -f 1-4,6 peaks.txt | awk '{print $2,$3,$4,$1,$5}' OFS="\t"