Entering edit mode
6.5 years ago
endrebak
▴
970
In bedtools intersect means "for each read in A, for each read in B that overlaps, find the part of A that overlaps with a read in B".
So that if you have 4 reads in A and 6 reads in B and all overlap you get 4*6 results.
GenomicRanges intersect works differently; for both A and B it first clusters overlapping reads into one read and then does the intersection operation. Is it possible to get GenomicRanges intersect to work like bedtools?
Here is the input with the expected output:
head tests/f2.bed tests/f3.bed
==> tests/f2.bed <==
chr1 1 2 f 0 +
chr1 6 7 f 0 -
==> tests/f3.bed <==
chr1 3 6 h 0 +
chr1 4 7 h 0 -
chr1 5 7 h 0 -
chr1 8 9 h 0 +
biocore-home ~/c/pyranges (master DU=) bedtools intersect -a tests/f3.bed -b tests/f2.bed
chr1 6 7 h 0 -
chr1 6 7 h 0 -
Try
findOverlaps
. You can then useGenomicRanges::reduce
on a 'combined' GRanges object made from the combined set of reads (usingc()
works for that IIRC). With the optionrevmap=TRUE
, you can backtrack to find which rows of the combined GRanges correspond to that region, and so that particular overlap.If you set up a reprex I can show you how this might work.
Okay, thanks. I am not that interested in the actual way of doing it, I was just wondering if it was possible without hand rolling a solution. It seems like most genomicranges operations are set-like.
Great, just submitted an answer to your query
Dear, I am sorry to reopen this question, I want to run bedtools intersect like operation on genomicranges in R, but I am actually not clear about the process you mentioned. The input and output I excepted is similar with this question. Could you please give me more information? Best. Zhang
Yes, it is possible.