I have a BED file similar to this:
1 47554 1341771 2
1 1341771 16237999 2
1 29269871 29386071 1
The last column is the copy number calculated by another program. You may notice that the first two intervals have the same copy number (they are separate segments because of a different log2R, but the call is the same in both).
I need to merge these book-ended regions, but only if the copy number is the same.
So, for the case above, the result would be:
1 47554 16237999 2
1 29269871 29386071 1
But if the copy number were different between the two lines, for example:
1 47554 1341771 2
1 1341771 16237999 3
they should not get merged. As far as I can read, bedtools groupby does not do what I want, and neither does bedops.
Is such a thing possible without going crazy and doing it all manually?
I guess you can take inspiration from here and then try to abstract from it: https://bioinformatics.stackexchange.com/questions/3523/bed-file-merge-book-end-features-only-if-same-name-in-column-4
From a quick look it seems to do already what I want, I'll experiment a bit with it.
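For reference, the merge rule described in the question can be sketched in a few lines of plain Python. This is only an illustration, not the approach from the linked answer: `merge_bookended` is a made-up name, and the sketch assumes the input is whitespace-separated, already sorted by chromosome and start, and that "book-ended" means the end of one interval equals the start of the next exactly (overlapping intervals are left alone).

```python
def merge_bookended(lines):
    """Merge consecutive BED records that touch (end == next start)
    and carry the same value in column 4 (here, the copy number).

    Assumes the records are already sorted by chromosome and start.
    """
    merged = []
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        chrom, start, end, cn = fields[0], int(fields[1]), int(fields[2]), fields[3]
        if merged:
            prev_chrom, prev_start, prev_end, prev_cn = merged[-1]
            # Extend the previous record only when it is book-ended with
            # this one and both have the same copy number.
            if prev_chrom == chrom and prev_end == start and prev_cn == cn:
                merged[-1] = (chrom, prev_start, end, cn)
                continue
        merged.append((chrom, start, end, cn))
    return merged


# The example from the question
example = """1 47554 1341771 2
1 1341771 16237999 2
1 29269871 29386071 1"""

for record in merge_bookended(example.splitlines()):
    print(*record, sep="\t")
```

To run it on a real file, something like `merge_bookended(open("segments.bed"))` should work, where `segments.bed` is a placeholder name. Because only the previous record is compared, an unsorted file would need to be sorted first (for example with `sort -k1,1 -k2,2n`).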