2.3 years ago
Dan
I annotated the peaks using annotatePeaks.pl macs2/${FILE}"_peaks.narrowPeak" mm10.
The annotation file is:
PeakID Chr Start End Strand Peak Score Focus Ratio/Region Size Annotation Detailed Annotation Distance to TSS Nearest PromoterID Entrez ID Nearest Unigene Nearest Refseq Nearest Ensembl Gene Name Gene Alias Gene Description Gene Type
Sample_A_peak_38484 chr6 47743726 47745106 + 5689 NA promoter-TSS (NR_002841).2 promoter-TSS (NR_002841).2 -185 NR_002841 19799 NR_002841 Rn4.5s - 4.5S RNA rRNA
How can I select the promoter-TSS, promoter, or TSS peaks from the .narrowPeak file based on the Annotation column of the annotation file?
Thanks a lot.
The '$8 == "promoter-TSS"' test applies to the annotation file, but I want to filter the narrowPeak file. How can I do that? Thanks
If your narrowPeak file has peak IDs in the NAME field you can:
awk '$8 == "promoter-TSS"' x.annot | cut -f1 > peak_names.txt
grep -wFf peak_names.txt x.narrowPeak > x.promoterTSS.narrowPeak
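The same ID-based filter can also be done in one awk pass, comparing the peak ID against the NAME column (column 4) only, so an ID cannot match text elsewhere on the line. This is a minimal sketch rather than a tested pipeline: the toy.* files and their contents are made-up stand-ins for x.annot and x.narrowPeak, and it assumes whitespace splitting puts "promoter-TSS" in field 8 of the annotation file, as in the example row above.

```shell
# Toy inputs (hypothetical, tab-separated) standing in for x.annot and x.narrowPeak.
printf 'PeakID\tChr\tStart\tEnd\tStrand\tPeak Score\tFocus Ratio\tAnnotation\n' > toy.annot
printf 'peak_1\tchr6\t100\t200\t+\t50\tNA\tpromoter-TSS (NR_1)\n' >> toy.annot
printf 'peak_2\tchr6\t300\t400\t+\t40\tNA\tintron (NR_2, intron 1 of 3)\n' >> toy.annot
printf 'peak_10\tchr7\t500\t600\t+\t30\tNA\tpromoter-TSS (NR_3)\n' >> toy.annot

printf 'chr6\t100\t200\tpeak_1\t50\t.\t5\t9\t7\t40\n' > toy.narrowPeak
printf 'chr6\t300\t400\tpeak_2\t40\t.\t4\t8\t6\t30\n' >> toy.narrowPeak
printf 'chr7\t500\t600\tpeak_10\t30\t.\t3\t7\t5\t20\n' >> toy.narrowPeak

# First file (NR == FNR): remember the IDs of promoter-TSS peaks.
# Second file: print rows whose NAME field (column 4) is one of those IDs.
awk 'NR == FNR { if ($8 == "promoter-TSS") keep[$1] = 1; next }
     ($4 in keep)' toy.annot toy.narrowPeak > toy.promoterTSS.narrowPeak
```

With your own data, replace toy.annot and toy.narrowPeak with the real file names and drop the printf lines.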
If your narrowPeak file has only positions you will need to create loc strings:
awk '$8 == "promoter-TSS"' x.annot | awk '{print $2":"$3"-"$4}' > peak_locs.txt
awk '{print $0"\t"$1":"$2"-"$3}' x.narrowPeak | grep -wFf peak_locs.txt > x.promoterTSS.narrowPeak
This will have an extra column with the location string, but you can cut it out if you need.
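If you would rather not add and then remove the extra column, the position-based match can be sketched as a single awk join keyed on the location string. The toy_loc.* files are hypothetical stand-ins for the real files, and off is an illustrative variable, not part of any tool. It is there because location strings only match when both files use the same coordinate convention: check a few peaks by hand first, and if the Start values in the annotation file are shifted by one relative to the narrowPeak starts, set off=1.

```shell
# Toy inputs (hypothetical, tab-separated); the narrowPeak NAME column is just ".".
printf 'PeakID\tChr\tStart\tEnd\tStrand\tPeak Score\tFocus Ratio\tAnnotation\n' > toy_loc.annot
printf 'peak_a\tchr6\t100\t200\t+\t50\tNA\tpromoter-TSS (NR_1)\n' >> toy_loc.annot
printf 'peak_b\tchr6\t300\t400\t+\t40\tNA\tintron (NR_2, intron 1 of 3)\n' >> toy_loc.annot
printf 'peak_c\tchr7\t500\t600\t+\t30\tNA\tpromoter-TSS (NR_3)\n' >> toy_loc.annot

printf 'chr6\t100\t200\t.\t50\t.\t5\t9\t7\t40\n' > toy_loc.narrowPeak
printf 'chr6\t300\t400\t.\t40\t.\t4\t8\t6\t30\n' >> toy_loc.narrowPeak
printf 'chr7\t500\t600\t.\t30\t.\t3\t7\t5\t20\n' >> toy_loc.narrowPeak

# First file: store chr:start-end for promoter-TSS peaks.
# Second file: rebuild the same key from columns 1-3 (start shifted by off) and
# print the unmodified narrowPeak row when the key was stored.
awk -v off=0 'NR == FNR { if ($8 == "promoter-TSS") keep[$2 ":" $3 "-" $4] = 1; next }
     (($1 ":" ($2 + off) "-" $3) in keep)' toy_loc.annot toy_loc.narrowPeak > toy_loc.promoterTSS.narrowPeak
```

The output keeps the original ten narrowPeak columns, so no cut step is needed afterwards.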