Bedtools: compare multiple BED files to one BED file?
10.0 years ago
BIOTIN ▴ 50

I've been comparing 40 BED files against one BED file using the intersectBed -a -b command. Is there a command in Bedtools that can compare multiple BED files at once?

Say I have 40 BED files and one particular BED file. I want to identify the regions in the 40 BED files that overlap the particular one, i.e. a 40-to-1 comparison.

Is there a fast way to compare them without typing the commands one by one, like intersectBed -a -b, intersectBed -a -c, intersectBed -a -d, intersectBed -a -e...?

sequencing alignment genome • 6.8k views
10.0 years ago

One fast option is to use BEDOPS:

$ bedops --element-of 1 A.bed B.bed C.bed ... > answer.bed

You can use lots of inputs efficiently. The input files just have to be sorted.

The above command reports all elements of A.bed that overlap elements of B.bed, C.bed, etc. by at least one base. (Note that --intersect would instead report only the intervals common to all of the inputs.)

Let's say you want to go the other direction efficiently. You can use BEDOPS with UNIX pipes and redirect standard output from one command to the next:

$ bedops --everything B.bed C.bed ... | bedops --element-of 1 - A.bed > answer.bed

This does a multiset union of all the elements in B.bed, C.bed, etc. and passes these to an --element-of operation against A.bed.

The result file reports all elements of B.bed, C.bed, etc. that overlap A.bed.

The difference between these two directions is in which sets of elements get reported in the overlap. In the first case, elements of A.bed are reported. In the second case, elements of B.bed C.bed etc. are reported. Generally, this is not a symmetric operation.
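
To make the asymmetry concrete, here is a toy sketch that mimics the two directions with plain awk instead of BEDOPS. The file names and intervals are made up for illustration, and the `overlapping` helper is hypothetical. It applies a simple any-overlap test on half-open intervals and does no sorting or merging, so treat it as an illustration of the idea rather than a replacement for the commands above.

```shell
#!/bin/sh
# Toy data (made up): A.bed is the single reference file,
# B.bed and C.bed stand in for the 40 files.
tmp=$(mktemp -d)
printf 'chr1\t100\t200\nchr1\t500\t600\n' > "$tmp/A.bed"
printf 'chr1\t150\t160\nchr1\t170\t180\n' > "$tmp/B.bed"
printf 'chr1\t900\t950\n'                 > "$tmp/C.bed"

# overlapping REF QUERY...: print each REF line that overlaps at least one
# line of the QUERY files by one base or more (half-open BED coordinates).
# Assumes REF is not empty.
overlapping() {
  ref=$1; shift
  awk -F'\t' '
    NR == FNR { chr[NR] = $1; s[NR] = $2 + 0; e[NR] = $3 + 0; line[NR] = $0; n = NR; next }
    { for (i = 1; i <= n; i++)
        if ($1 == chr[i] && ($2 + 0) < e[i] && ($3 + 0) > s[i]) hit[i] = 1 }
    END { for (i = 1; i <= n; i++) if (hit[i]) print line[i] }
  ' "$ref" "$@"
}

# Direction 1: elements of A that overlap anything in B or C.
a_side=$(overlapping "$tmp/A.bed" "$tmp/B.bed" "$tmp/C.bed")

# Direction 2: elements of B and C (pooled) that overlap anything in A.
cat "$tmp/B.bed" "$tmp/C.bed" > "$tmp/BC.bed"
bc_side=$(overlapping "$tmp/BC.bed" "$tmp/A.bed")

printf 'elements of A reported:\n%s\n' "$a_side"
printf 'elements of B and C reported:\n%s\n' "$bc_side"
```

With this toy input, the first direction reports the single A element chr1:100-200, while the second reports the two B elements that fall inside it.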

If you have a lot of files to sort, a quick bash one-liner can take care of this:

$ for fn in *.bed; do sort-bed "${fn}" > "${fn%.*}.sorted.bed"; done

Some use GNU sort to do sorting of BED files, but BEDOPS sort-bed is usually faster.
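
Before handing files to bedops, it can help to check which ones are already sorted. The sketch below assumes a `sort` that supports `-c` and `-s` (GNU and BSD sort do), and that sort-bed order amounts to byte-wise chromosome order followed by numeric start and end. The `is_sorted_bed` helper name and the demo files are made up for this example.

```shell
#!/bin/sh
# is_sorted_bed FILE: succeed if FILE is ordered by chromosome (byte-wise),
# then by start, then by end (both numeric).
# Hypothetical helper, not part of BEDOPS.
is_sorted_bed() {
  LC_ALL=C sort -c -s -k1,1 -k2,2n -k3,3n "$1" 2>/dev/null
}

# Made-up demo files: one in order, one not.
tmp=$(mktemp -d)
printf 'chr1\t10\t20\nchr1\t30\t40\nchr2\t5\t15\n' > "$tmp/good.bed"
printf 'chr1\t30\t40\nchr1\t10\t20\n'              > "$tmp/bad.bed"

for fn in "$tmp"/*.bed; do
  if is_sorted_bed "$fn"; then
    echo "sorted:   ${fn##*/}"
  else
    echo "unsorted: ${fn##*/} (run sort-bed on it first)"
  fi
done
```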

10.0 years ago

Recent bedtools versions accept several files after -b, and the shell expands the wildcard into that list, so finding the regions where the 40 BED files overlap that one particular BED file is as simple as:

bedtools intersect -a particular.bed -b all40*bed > all.40vs1.compared.bed

If you want a separate comparison for each of the 40 files against that particular one, then you'll have to loop:

for file in all40*bed; do
  bedtools intersect -a particular.bed -b "$file" > "compared.$file"
done
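
The same loop can be wrapped in a small function that copes with unusual file names and with a wildcard that matches nothing. This is only a sketch: `compare_each` and the `BEDTOOLS` variable are names invented here (the variable just lets you point at a different bedtools binary). The `-u` option, which reports each entry of `-a` at most once, is optional; drop it if you want one output line per overlap.

```shell
#!/bin/sh
# compare_each REF FILE...: intersect REF against each FILE separately and
# write one result per input to compared.<name> in the current directory.
# BEDTOOLS is a hypothetical override for the bedtools executable.
compare_each() {
  ref=$1; shift
  for file in "$@"; do
    [ -f "$file" ] || continue          # skip an unmatched glob pattern
    [ "$file" = "$ref" ] && continue    # do not compare the reference with itself
    "${BEDTOOLS:-bedtools}" intersect -u -a "$ref" -b "$file" \
      > "compared.${file##*/}"
  done
}

# Usage: compare_each particular.bed all40*bed
```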

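Using GNU parallel (quote the command so that the redirection runs once per file instead of being applied by the calling shell):

parallel 'bedtools intersect -a particular.bed -b {} > compared.{}' ::: all40*bed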