Hi,
I have two BED files, as follows:
bed_A
scaffold11296 36365 36414
scaffold11296 36471 36526
bed_B
scaffold11296 36302 36334 -
scaffold11296 36303 36334 +
scaffold11296 36339 36370 +
scaffold11296 36340 36369 +
scaffold11296 36365 36395 -
scaffold11296 36366 36394 -
scaffold11296 36367 36395 -
scaffold11296 36368 36395 -
scaffold11296 36394 36414 -
scaffold11296 36471 36502 +
scaffold11296 36483 36516 +
scaffold11296 36495 36526 +
scaffold11296 40892 40909 +
scaffold11296 40892 40909 +
scaffold11296 40892 40909 +
scaffold11296 40892 40909 +
I would like to count the "+"-oriented reads, if any, that fall within the ranges specified in "bed_A". I expect output like the following:
scaffold11296 36365 36414 0
scaffold11296 36471 36526 3
Any guidance would be really useful. Thanks in advance.
One way, via bedmap and bedops:
Another way, via a bash script that just uses bedops:
The first approach, via bedmap, will likely run somewhat faster. It is also (to my eyes) a lot more readable.
On the other hand, the second approach, which works on each line of the reference file (A.bed), gives you more control. With this approach, for instance, you could count overlaps of both forward- and reverse-stranded elements in one pass (as noted in the answer to your forum post):
Yet another UNIX-based way to do this would be:
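The original command for this approach is not preserved here. As a minimal sketch of a plain-UNIX alternative, assuming only awk is available and that the sample data from the question is saved as A.bed and B.bed (these file names, and the output file counts.txt, are assumptions), something like the following counts "+" reads per interval. It reports two numbers for each range: reads fully contained in it, and reads overlapping it by at least one base, treating coordinates as half-open as BED does.

```shell
# Sample data from the question (file names are assumptions).
cat > A.bed <<'EOF'
scaffold11296 36365 36414
scaffold11296 36471 36526
EOF

cat > B.bed <<'EOF'
scaffold11296 36302 36334 -
scaffold11296 36303 36334 +
scaffold11296 36339 36370 +
scaffold11296 36340 36369 +
scaffold11296 36365 36395 -
scaffold11296 36366 36394 -
scaffold11296 36367 36395 -
scaffold11296 36368 36395 -
scaffold11296 36394 36414 -
scaffold11296 36471 36502 +
scaffold11296 36483 36516 +
scaffold11296 36495 36526 +
scaffold11296 40892 40909 +
scaffold11296 40892 40909 +
scaffold11296 40892 40909 +
scaffold11296 40892 40909 +
EOF

awk -v OFS='\t' '
  # First file (B.bed): remember only the "+" reads.
  NR == FNR {
    if ($4 == "+") { n++; chrom[n] = $1; bstart[n] = $2 + 0; bend[n] = $3 + 0 }
    next
  }
  # Second file (A.bed): compare every stored read against this interval.
  {
    astart = $2 + 0; aend = $3 + 0
    contained = 0; overlapping = 0
    for (i = 1; i <= n; i++) {
      if (chrom[i] != $1) continue
      # Read lies entirely inside the interval.
      if (bstart[i] >= astart && bend[i] <= aend) contained++
      # Read shares at least one base with the interval (half-open coordinates).
      if (bstart[i] < aend && bend[i] > astart) overlapping++
    }
    print $1, $2, $3, contained, overlapping
  }
' B.bed A.bed > counts.txt

cat counts.txt
```

The fourth column is the contained count and the fifth is the overlap count. This nested loop compares every interval with every "+" read, so it is only sensible for small files; a sorted, streaming tool such as bedmap should scale better.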
But your example output does not seem correct:
scaffold11296 36365 36414
overlaps with two "+" regions from file B (36339-36370 and 36340-36369).