I have ~40k coordinates in a BED file and I want to identify the summit coordinate for each of them. I also have a file of ~8 million mapped reads, also in BED format. I found the post "Identifying The 'Summit' Coordinate Within Many Coordinates In A Wig File?", but one of its solutions uses Python and it was asked 2 years ago. Does anybody know a tool (is it possible with bedtools or BEDOPS?) that can solve this problem quickly?
It turns out MACS could be the answer here. The newest version, MACS2, has a subcommand called refinepeak. It works independently of the peak caller.
This is the description: "(Experimental) Take raw reads alignment, refine peak summits and give scores measuring balance of forward-backward tags"
It refines the summits by looking at tags on the forward and reverse strands, similar to the way SPP works. So it is not a simple algorithm that locates the maximum value and spits that out. This method of comparing the forward and reverse strands works best for transcription factors.
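For comparison, here is a minimal sketch of the "simple algorithm" mentioned above: take the position of maximum read coverage inside each region. This is not what MACS2 does. The function name `naive_summit` and the tuple layout are made up for illustration, and coordinates are assumed to be 0-based, half-open, as in BED.

```python
def naive_summit(region, reads):
    """Return (position, coverage) of the highest read coverage in a region.

    region: (chrom, start, end); reads: iterable of (chrom, start, end).
    Coordinates are assumed 0-based, half-open (BED convention).
    Ties are broken by taking the leftmost position.
    """
    chrom, rstart, rend = region
    # Difference array: +1 where a read starts, -1 where it ends.
    diff = [0] * (rend - rstart + 1)
    for c, s, e in reads:
        if c != chrom:
            continue
        s, e = max(s, rstart), min(e, rend)  # clip the read to the region
        if s >= e:
            continue  # no overlap with the region
        diff[s - rstart] += 1
        diff[e - rstart] -= 1

    best_pos, best_cov, cov = rstart, 0, 0
    for i in range(rend - rstart):
        cov += diff[i]  # running sum gives the coverage at rstart + i
        if cov > best_cov:
            best_pos, best_cov = rstart + i, cov
    return best_pos, best_cov


region = ("chr1", 100, 200)
reads = [("chr1", 90, 130), ("chr1", 120, 160),
         ("chr1", 125, 140), ("chr2", 120, 130)]
print(naive_summit(region, reads))
```

As written, this scans every read for every region, so for ~40k regions and ~8 million reads you would want to group and sort the reads by chromosome first, or compute coverage with an existing tool and only take the maximum here.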
Written 11.4 years ago by KCC; updated 4.9 years ago by Ram.
Thanks for the reply. I think it refines summits specifically for ChIP-seq data, but my file has nothing to do with ChIP-seq.
I agree it's designed for ChIP-seq, in that it assumes there will be patterns in how reads are assigned to the forward and reverse strands. I think it should also work for other sequencing data with punctate (sharp) peaks, where the tags have equal probability of occurring on either strand.
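One rough way to check that last assumption on your own data is to count, per region, how many overlapping reads fall on each strand. The sketch below is only an illustration: `forward_fraction` is a made-up helper, not the score MACS2 reports, and it assumes the strand comes from column 6 of the reads BED file.

```python
def forward_fraction(region, reads):
    """Fraction of reads overlapping a region that lie on the '+' strand.

    region: (chrom, start, end); reads: iterable of (chrom, start, end, strand).
    Returns None when no stranded read overlaps the region.
    """
    chrom, rstart, rend = region
    plus = minus = 0
    for c, s, e, strand in reads:
        # Half-open interval overlap test (BED convention assumed).
        if c != chrom or s >= rend or e <= rstart:
            continue
        if strand == "+":
            plus += 1
        elif strand == "-":
            minus += 1
    total = plus + minus
    return plus / total if total else None


region = ("chr1", 100, 200)
reads = [("chr1", 90, 130, "+"), ("chr1", 120, 160, "+"),
         ("chr1", 150, 190, "+"), ("chr1", 180, 220, "-"),
         ("chr1", 300, 340, "-")]
print(forward_fraction(region, reads))
```

Values near 0.5 across most regions would suggest the reads are spread evenly over both strands; values close to 0 or 1 would suggest a strand bias, in which case a strand-based refinement is less likely to be meaningful.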