Question

What is the difference between using "-f 0.5 -r" and "-f 0.5 -F 0.5" in bedtools intersect?

8.5 years ago

tako • 0

I am comparing two datasets with bedtools intersect to see how closely they match. From the online documentation, I understand that the "-r" option requires the overlap fraction given with "-f" to be reciprocal. To me, this sounds like running "-f" and "-F" together with the same value. However, I get vastly different results from "-f 0.5 -r" and "-f 0.5 -F 0.5". Can anybody explain the difference?

To give more details, the option "-f 0.5 -r" gives thousands of results, while "-f 0.5 -F 0.5" only gives a dozen or so results.

bedtools intersect • 2.1k views

ADD COMMENT • link updated 20 months ago by Ram 44k • written 8.5 years ago by tako • 0

-f 0.5 -r is the same as -f 0.5 -F 0.5. The difference is that with -F you can set a different overlap threshold for each file, for example -f 0.5 -F 0.7, which was not possible with -r.

-r simply applies the -f fraction to file B as well, whereas -F lets you specify a different fraction for B.
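
To make the two conditions concrete, here is a minimal Python sketch of how the documented semantics read. It is not bedtools' actual implementation: the function names are made up for illustration, intervals are half-open (start, end) tuples as in BED, and it assumes the default behaviour where both fractions must be satisfied (that is, without -e).

```python
def overlap_bp(a, b):
    """Number of bases shared by two half-open (start, end) intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def passes_f_r(a, b, f):
    """Sketch of '-f f -r': overlap covers at least fraction f of A and of B."""
    ov = overlap_bp(a, b)
    return ov > 0 and ov >= f * (a[1] - a[0]) and ov >= f * (b[1] - b[0])


def passes_f_F(a, b, f, F):
    """Sketch of '-f f -F F': at least fraction f of A and fraction F of B."""
    ov = overlap_bp(a, b)
    return ov > 0 and ov >= f * (a[1] - a[0]) and ov >= F * (b[1] - b[0])


a = (100, 200)  # 100 bp
b = (150, 400)  # 250 bp, shares 50 bp with a

print(passes_f_r(a, b, 0.5))       # False: 50/100 passes, 50/250 does not
print(passes_f_F(a, b, 0.5, 0.5))  # False: same condition as above
print(passes_f_F(a, b, 0.5, 0.1))  # True: the looser threshold on B passes
```

Under these assumptions the first two predicates are identical, so any difference in real output would have to come from something else, such as other options on the command line or the bedtools version in use.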

Hope I did not confuse you more.

If you do see a difference, please post an example; it helps with troubleshooting.

ADD REPLY • link 8.5 years ago by GouthamAtla 12k

Added more info. The results are not the same.

ADD REPLY • link 8.5 years ago by tako • 0

You need to post an example, perhaps the lines that differ between -f -r and -f -F.

ADD REPLY • link 8.5 years ago by GouthamAtla 12k

Mmmm... I wouldn't risk my neck on this, but are you getting more overlaps when you specify -f 0.5 -F 0.5? One explanation that springs to mind: when -f 0.5 -F 0.5 is set, maybe A fulfils the condition but B does not, and the record is still printed, whereas with "-f 0.5 -r" both A and B must meet the condition for the record to be printed.

Anyway, the best way to address the problem is to create some test data and check it yourself :)
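
As a starting point for such a test, the following Python sketch writes two tiny BED files and prints the two commands to compare. The intervals, names, and file names are invented for illustration. One pair (a1/b1) satisfies the 50% threshold only on the A side, and the other (a2/b2) satisfies it on both sides, so the output should show directly how each flag combination treats a one-sided match.

```python
import os
import tempfile

# Hypothetical toy intervals: (chrom, start, end, name)
A_ROWS = [
    ("chr1", 100, 200, "a1"),
    ("chr1", 500, 600, "a2"),
]
B_ROWS = [
    ("chr1", 150, 400, "b1"),  # 50 bp overlap: 50% of a1, but only 20% of b1
    ("chr1", 520, 610, "b2"),  # 80 bp overlap: 80% of a2 and about 89% of b2
]


def write_bed(path, rows):
    """Write rows as tab-separated BED lines."""
    with open(path, "w") as fh:
        for row in rows:
            fh.write("\t".join(map(str, row)) + "\n")


outdir = tempfile.mkdtemp()
a_path = os.path.join(outdir, "a.bed")
b_path = os.path.join(outdir, "b.bed")
write_bed(a_path, A_ROWS)
write_bed(b_path, B_ROWS)

# Print the two commands to run by hand and compare.
for flags in ("-f 0.5 -r", "-f 0.5 -F 0.5"):
    print(f"bedtools intersect -a {a_path} -b {b_path} {flags} -wa -wb")
```

If the a1/b1 pair shows up in one output but not the other, that pinpoints where the two option sets diverge on your installation.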