I need to combine two BED files into one, which is somewhat tricky, at least for me.
The first set of intervals looks as follows:
chr1 5022835 5023172 sorted_merged_peak_1 867
chr1 7088778 7088978 sorted_merged_peak_2 1041
chr1 36710020 36710149 sorted_merged_peak_3 805
The second set of intervals contains single bp stretches that fall within the boundaries of those in the first set:
chr1 5022838 5022839
chr1 7088780 7088781
chr1 36710022 36710023
The output I am looking for (4 columns) will contain the first set of intervals in columns 1-3, with the single bp foci in the fourth column, next to the interval from set 1 that they fall within. For some of the intervals in the first set there are multiple single bp foci in the second set and I would like that represented in the final 4 column interval. I'm not sure if this is entirely clear; let me know if there is a way to do this. Thanks!
So you want the first 3 columns to represent the intervals in set 1, and column 4 to show the foci from set 2 that are located in each interval? How do you want to show them? Can you give an example output?
Yes, that's exactly it. I want each focus added as a single bp location to the fourth column of set 1, specifically next to the interval it falls within.
So what do you mean by this: "For some of the intervals in the first set there are multiple single bp foci in the second set and I would like that represented in the final 4 column interval."?
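Here is a minimal Python sketch of one possible reading of the request. It rests on several assumptions that are not confirmed in the thread: each focus is reported by its BED start coordinate, multiple foci in one interval are joined with commas, and an interval with no foci gets "." in column 4. The function names parse_bed and attach_foci are made up for illustration, and the second focus at 5022900 is invented to show the multiple-foci case.

```python
from collections import defaultdict


def parse_bed(text):
    """Parse whitespace-separated BED lines, keeping only chrom, start, end."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        records.append((fields[0], int(fields[1]), int(fields[2])))
    return records


def attach_foci(intervals, foci):
    """Return (chrom, start, end, foci) rows, one per interval in set 1.

    A focus counts as inside an interval when it is on the same chromosome
    and lies fully within [start, end). Column 4 holds the comma-separated
    start coordinates of those foci, or "." if there are none (assumption).
    """
    foci_by_chrom = defaultdict(list)
    for chrom, start, end in foci:
        foci_by_chrom[chrom].append((start, end))

    rows = []
    for chrom, start, end in intervals:
        hits = sorted(
            f_start
            for f_start, f_end in foci_by_chrom.get(chrom, [])
            if start <= f_start and f_end <= end
        )
        column4 = ",".join(str(h) for h in hits) if hits else "."
        rows.append((chrom, start, end, column4))
    return rows


set1 = """\
chr1 5022835 5023172 sorted_merged_peak_1 867
chr1 7088778 7088978 sorted_merged_peak_2 1041
chr1 36710020 36710149 sorted_merged_peak_3 805
"""

# The line at 5022900 is hypothetical, added to show two foci in one interval.
set2 = """\
chr1 5022838 5022839
chr1 5022900 5022901
chr1 7088780 7088781
chr1 36710022 36710023
"""

for row in attach_foci(parse_bed(set1), parse_bed(set2)):
    print("\t".join(str(field) for field in row))
# chr1	5022835	5023172	5022838,5022900
# chr1	7088778	7088978	7088780
# chr1	36710020	36710149	36710022
```

This compares every interval against every focus on the same chromosome, which is fine for small files. For large inputs a sorted sweep or a dedicated interval tool would likely be a better fit.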