Bedtools merge but only if intersection exists
24 months ago
jrose ▴ 30

Instead of using the -wa or -wb option of bedtools intersect, I want the union of all regions that have some intersection.

So if my input looks like this (first row is BED 1, second row is BED 2):

ccccc        ccccc   cc   ccccccccc
   cccc                 ccccccccccccccccccc

then I want the output to be:

ccccccc                 ccccccccccccccccccc

Is there an operation like this?

I can run bedtools intersect with -wa and then with -wb on the same input, which gives me two files containing only regions that have an intersection, and then take their union, but I was wondering whether something like this is already implemented.
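The two-pass idea described above could be sketched roughly as follows. This is a minimal sketch, not an existing bedtools feature: it swaps in the -u option of bedtools intersect (report each A entry once if it overlaps anything in B) for the -wa/-wb passes, and the function name overlap_union is hypothetical. It assumes plain BED input where only the first three columns matter.

```shell
# Hypothetical helper: union of all intervals (from either file) that
# overlap at least one interval in the other file.
# Usage: overlap_union first.bed second.bed
overlap_union() {
  a=$1
  b=$2
  {
    # Intervals of A that overlap something in B (each reported once)
    bedtools intersect -u -a "$a" -b "$b"
    # Intervals of B that overlap something in A (each reported once)
    bedtools intersect -u -a "$b" -b "$a"
  } | cut -f1-3 \
    | sort -k1,1 -k2,2n \
    | bedtools merge -i -
}
```

bedtools merge expects input sorted by chromosome and then by start position, which is why the sort sits between the two steps.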


As you say, it is just a pipe of two commands; I do not see a downside to that:

bedtools intersect -a first.bed -b second.bed -wa -wb | bedtools merge -i -

See answer.


Ah, OK. I did not know you could combine -wa and -wb. Thanks.


This requires some awk; see the answer below.


Thank you all

24 months ago
ATpoint 85k

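Assuming three-column BED files, you could do the following. The awk command is needed because -wa -wb produces six columns per line after the intersect (the A interval followed by the B interval), so each line has to be split back into two three-column records: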

% cat first.bed 
chr1    1   5
chr1    10  20
chr1    30  40

% cat second.bed 
chr1    4   8
chr1    15  18
chr1    100 200

% bedtools intersect -a first.bed -b second.bed -wa -wb
chr1    1   5   chr1    4   8
chr1    10  20  chr1    15  18

% bedtools intersect -a first.bed -b second.bed -wa -wb | awk 'BEGIN {OFS="\t"} {print $1, $2, $3 "\n" $4, $5, $6}'
chr1    1   5
chr1    4   8
chr1    10  20
chr1    15  18

# and finally all in one plus merge:
% bedtools intersect -a first.bed -b second.bed -wa -wb | awk 'BEGIN {OFS="\t"} {print $1, $2, $3 "\n" $4, $5, $6}' | sort -k1,1 -k2,2n | bedtools merge -i -
chr1    1   8
chr1    10  20

BEDtools version 2.30.0
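To make the role of the sort and merge steps concrete, here is a rough plain-awk sketch of what merging sorted intervals amounts to. It is an illustration only, not how bedtools merge is implemented; the function name merge_sorted_bed is hypothetical, and it assumes three-column input already sorted by chromosome and start, with touching (book-ended) intervals treated as mergeable.

```shell
# Hypothetical helper: merge overlapping or touching intervals from
# sorted 3-column BED on stdin. Illustration only, not bedtools' code.
merge_sorted_bed() {
  awk 'BEGIN { OFS = "\t" }
       # First record: start the current merged interval
       NR == 1 { cur_chrom = $1; cur_start = $2; cur_end = $3; next }
       # Same chromosome and start within the current interval: extend it
       $1 == cur_chrom && $2 + 0 <= cur_end + 0 {
         if ($3 + 0 > cur_end + 0) cur_end = $3
         next
       }
       # Otherwise: emit the finished interval and start a new one
       { print cur_chrom, cur_start, cur_end
         cur_chrom = $1; cur_start = $2; cur_end = $3 }
       # Emit the last interval, if there was any input
       END { if (NR > 0) print cur_chrom, cur_start, cur_end }'
}

# Example: the four records produced by the awk reshaping step above
printf 'chr1\t1\t5\nchr1\t4\t8\nchr1\t10\t20\nchr1\t15\t18\n' | merge_sorted_bed
```

Because only the current interval is remembered, an unsorted stream would leave overlapping records unmerged, which is why the sort comes before the merge in the pipeline.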


Actually my bedtools v2.27.1 returns a 3 column bed file with -wa -wb


I am on 2.30.0.

24 months ago

Get all elements of first that have one or more overlaps with elements in second, and vice versa, and then merge those two sets:

% bedops --merge <(bedops -e 1 first.bed second.bed) <(bedops -e 1 second.bed first.bed)
chr1    1   8
chr1    10  20

Each of these is a process substitution that calculates overlaps:

<(bedops -e 1 first.bed second.bed)

and:

<(bedops -e 1 second.bed first.bed)

You can use process substitutions in place of files with BEDOPS tools.
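BEDOPS tools expect their inputs to be sorted with sort-bed, so for files of unknown order a wrapper along these lines may help. This is a sketch under assumptions: the function name bedops_overlap_union is hypothetical, and it uses temporary files instead of process substitutions only so that it also runs in a plain POSIX shell.

```shell
# Hypothetical helper: sort both inputs with sort-bed, keep the elements
# of each file that overlap the other by at least 1 bp, then merge them.
# Usage: bedops_overlap_union first.bed second.bed
bedops_overlap_union() {
  tmpdir=$(mktemp -d) || return 1
  # BEDOPS set operations assume sort-bed order
  sort-bed "$1" > "$tmpdir/a.bed"
  sort-bed "$2" > "$tmpdir/b.bed"
  # Elements of A overlapping B, and elements of B overlapping A
  bedops -e 1 "$tmpdir/a.bed" "$tmpdir/b.bed" > "$tmpdir/a_hits.bed"
  bedops -e 1 "$tmpdir/b.bed" "$tmpdir/a.bed" > "$tmpdir/b_hits.bed"
  # Union of the two overlap sets, flattened into merged intervals
  bedops --merge "$tmpdir/a_hits.bed" "$tmpdir/b_hits.bed"
  rm -rf "$tmpdir"
}
```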
