Entering edit mode
6.0 years ago
ste.lu
▴
80
Hi All,
I have two tables so structured:
Table1:
chr1 1 16007 8899
chr1 16008 24571 294334
chr1 24572 27981 192966
chr1 27982 30429 708562
chr1 30430 32153 610786
chr1 32154 32774 837824
chr1 32775 37752 177990
chr1 37753 38369 9374
chr1 38370 38791 367385
chr1 38792 39255 552088
...
Table 2:
chr1 1 111 13177 2005
chr1 112 223 9221 7796
chr1 224 340 2625 5833
chr1 341 461 14128 1124
chr1 462 580 3211 12051
chr1 581 701 9551 1563
chr1 702 825 10751 6891
chr1 826 939 3301 11244
chr1 940 1059 8072 4022
chr1 1060 1173 380 5584
...
what I need to do is to group and merge all the data in "table 2" to in the categories of "table 1". The relevant columns are 1:3.
if table2(x;y) is a subset of table1 (i;j), then make a table 3 where all the entry (x;y) are merged according to (i:j). For instance, in this example, all the entry of table2 are a subset of the first entry of table 1. In the end, I'd like to have:
Table3:
chr1 1 16007
Maybe you can use bedtools in R for this, like described here.
Bedtools was the easier!!
and then filter table 3 for the repetitive entries
Great solution! I recommended to do it in R because you tagged "R".
Have a look at
GenomicRanges/IRanges
and its intersection/overlapping functions. You simply have to check for overlaps, and if TRUE, then return the entry of table1. The vignettes contain plenty of examples.