You can use BEDOPS and set operations to solve this problem for three generic files. However, it is unclear from your overlap criteria how you are getting the end coordinate of 34. Note that the region chr1:33-34 is not common to all three sample input files, as presented, being found only in file2.bed.
In any case, here is how you can solve this problem:
$ bedops --merge file2.bed | bedmap --echo-map file1.bed - | bedops --intersect - file3.bed
chr1 30 33
chr1 200 300
Let's break down how these three commands work together.
(1) The bedops statement uses the --merge operator to merge elements in file2.bed into contiguous (non-overlapping) regions. This result is piped into the bedmap statement.
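To make the merge step concrete, here is a minimal sketch of what merging sorted intervals means, written with awk. It is only an illustration of the idea, not how BEDOPS implements --merge. The function name merge_sorted_bed and the input coordinates are made up for this example, and the input is assumed to be sorted, three-column, tab-delimited BED.

```shell
# Hypothetical helper: merge overlapping or adjoining intervals read from stdin.
# Assumes sorted, tab-delimited, three-column BED input.
merge_sorted_bed() {
  awk 'BEGIN { OFS = "\t" }
       NR == 1 { cur_chrom = $1; cur_start = $2 + 0; cur_end = $3 + 0; next }
       # Same chromosome and start inside (or touching) the current region: extend it.
       $1 == cur_chrom && $2 + 0 <= cur_end { if ($3 + 0 > cur_end) cur_end = $3 + 0; next }
       # Otherwise emit the finished region and start a new one.
       { print cur_chrom, cur_start, cur_end
         cur_chrom = $1; cur_start = $2 + 0; cur_end = $3 + 0 }
       END { if (NR > 0) print cur_chrom, cur_start, cur_end }'
}

# Made-up input: the first two elements overlap and collapse into chr1 25 34.
printf 'chr1\t25\t30\nchr1\t28\t34\nchr1\t200\t300\n' | merge_sorted_bed
```

The single pass works only because the input is sorted, which is the same assumption the BEDOPS tools make.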
(2) The bedmap statement uses the --echo-map operator to report all contiguous regions in the merged file2.bed (the "map" file) which overlap elements in file1.bed (the "reference" file) by one or more bases:
$ bedops --merge file2.bed > file2.merged.bed
$ bedmap --echo-map file1.bed file2.merged.bed
chr1 25 34

chr1 200 300
(The second line is blank, because there is no region in the merged file2.bed which overlaps chr1:100-120 from file1.bed.)
(3) This result is piped into the last bedops statement, which uses the --intersect operator to intersect those two non-empty regions chr1:25-34 and chr1:200-300 with regions in file3.bed.
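As a rough illustration of the intersection step, the sketch below computes the bases shared by two interval lists with awk. It is not the BEDOPS implementation. The helper name intersect_sorted_bed is hypothetical, the first list reuses the two regions reported by bedmap, and the second list is made up for the example (it is not the actual content of file3.bed). Both inputs are assumed to be sorted, already merged, and non-empty.

```shell
# Hypothetical helper: print the bases common to two sorted, merged BED files.
# Assumes tab-delimited, three-column input and a non-empty first file.
intersect_sorted_bed() {
  awk 'BEGIN { OFS = "\t" }
       # First file: remember every region.
       NR == FNR { chrom[NR] = $1; start[NR] = $2 + 0; stop[NR] = $3 + 0; n = NR; next }
       # Second file: compare each region against the remembered ones.
       { for (i = 1; i <= n; i++) {
           if (chrom[i] != $1) continue
           lo = (start[i] > $2 + 0) ? start[i] : $2 + 0
           hi = (stop[i] < $3 + 0) ? stop[i] : $3 + 0
           if (lo < hi) print $1, lo, hi
         } }' "$1" "$2"
}

workdir=$(mktemp -d)
# Regions reported by the bedmap step.
printf 'chr1\t25\t34\nchr1\t200\t300\n' > "$workdir/mapped.bed"
# Made-up stand-in for a third file.
printf 'chr1\t30\t33\nchr1\t150\t400\n' > "$workdir/other.bed"
intersect_sorted_bed "$workdir/mapped.bed" "$workdir/other.bed"
rm -rf "$workdir"
```

This compares every pair of regions, so it is quadratic in the number of regions. It shows the idea only and does not scale the way a sorted single-pass tool does.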
The final answer consists of bases that are common to file1.bed, file2.bed and file3.bed.
Note: The only assumption these tools make is that all input BED files are sorted. This allows BEDOPS apps to run very fast and with a low memory profile, compared with alternative toolkits which do not require sorted input (or which have only recently added sorting requirements after publication of BEDOPS). For your example inputs, this is not an issue. For the general case, we provide the sort-bed application to prepare the BED inputs if their sort state is unknown.
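If you are unsure whether an input is sorted, something along these lines can be put in front of the pipeline. This is a sketch: the wrapper name sort_bed_stream is hypothetical, and the fallback to plain sort when sort-bed is not on the PATH is my own stand-in, not part of BEDOPS. The unsorted sample coordinates are the regions mentioned above, shuffled.

```shell
# Hypothetical wrapper: sort a BED stream by chromosome, then start, then end.
sort_bed_stream() {
  if command -v sort-bed >/dev/null 2>&1; then
    # BEDOPS sort-bed, reading from standard input.
    sort-bed -
  else
    # Assumed stand-in when BEDOPS is not installed.
    LC_ALL=C sort -k1,1 -k2,2n -k3,3n
  fi
}

# Shuffled sample regions, sorted before further processing.
printf 'chr1\t200\t300\nchr1\t25\t34\nchr1\t100\t120\n' | sort_bed_stream
```

With files on disk, the equivalent is to run sort-bed on each input and redirect the output to a new file before calling bedops or bedmap.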
It would be nice if you changed the tag to something appropriate for your post, like bedtools or mergebed.
Can I also merge the overlapping positions, say the start and end positions, if they are within a range of 0-50?
Please use comments under answers to ask further questions, rather than posting questions as answers.
Sorry, I overlooked the "merge overlapping" part in your question. I guess Sukhdeep's reply does exactly what you require.
The example files given above are BED files with chromosome number, start and end position, with three lines in each file. I did not know how to post a separate example box in this post.