I would like to find overlapping regions (an overlap of even 1 bp counts), along with the number of fragments that overlap one another in each region.
Input: A.bed
chr1 10 20
chr1 30 50
chr1 35 43
chr1 38 49
chr1 40 50
chr10 15 20
chr10 18 25
chr10 25 30
chr2 20 60
chr2 58 70
chr2 59 75
Here is what I have tried. First, I merged the intervals using bedtools merge
and got the following table, which includes the number of regions that were merged.
output1: column 4 is the number of fragments merged into each set of coordinates
chr1 10 20 1
chr1 30 50 4
chr10 15 30 3
chr2 20 75 3
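For readers who want to check the merge step without bedtools, here is a minimal Python sketch that reproduces output1 from the rows of A.bed. The exact bedtools command is not shown in the post (something like `bedtools merge -i A.bed -c 1 -o count` would be one way to get the count column), so this is only an illustration of the logic. The function name `merge_with_counts` is made up for this example, and it assumes that intervals which merely touch (for example chr10 18-25 and 25-30) are merged, as the table above suggests.

```python
# Rows of A.bed as (chrom, start, end) tuples.
fragments = [
    ("chr1", 10, 20), ("chr1", 30, 50), ("chr1", 35, 43), ("chr1", 38, 49),
    ("chr1", 40, 50), ("chr10", 15, 20), ("chr10", 18, 25), ("chr10", 25, 30),
    ("chr2", 20, 60), ("chr2", 58, 70), ("chr2", 59, 75),
]


def merge_with_counts(intervals):
    """Merge overlapping or touching intervals and count the fragments in each.

    Hypothetical helper for illustration; it is not part of bedtools.
    """
    merged = []
    # Sorting by (chrom, start, end) puts each chromosome's intervals together.
    for chrom, start, end in sorted(intervals):
        last = merged[-1] if merged else None
        if last is not None and last[0] == chrom and start <= last[2]:
            # Overlaps (or touches) the current merged region: extend it.
            last[2] = max(last[2], end)
            last[3] += 1
        else:
            # Starts a new merged region with a single fragment.
            merged.append([chrom, start, end, 1])
    return [tuple(region) for region in merged]


for region in merge_with_counts(fragments):
    print(*region)
```

Note that column 4 here is the number of fragments chained together into a region, which is not necessarily the number that overlap at any single position.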
Then, using genomeCoverageBed -i A.bed -g hg19.main.chroms.sizes -bga
I could get the coverage of each segment, but the result I want is buried among several other values:
output2:
chr1 0 10 0
chr1 10 20 1
chr1 20 30 0
chr1 30 35 1
chr1 35 38 2
chr1 38 40 3
chr1 40 43 4
chr1 43 49 3
chr1 49 50 2
chr1 50 249250621 0
chr10 0 15 0
chr10 15 18 1
chr10 18 20 2
chr10 20 30 1
chr10 30 135534747 0
chr2 0 20 0
chr2 20 58 1
chr2 58 59 2
chr2 59 60 3
chr2 60 70 2
chr2 70 75 1
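The coverage table above can be reproduced with a simple sweep over interval starts and ends. The sketch below is an assumption-laden illustration, not what genomeCoverageBed does internally: the function name `coverage_segments` is hypothetical, and because no chromosome sizes are used it reports only the non-zero segments (roughly what `-bg` gives instead of `-bga`).

```python
from collections import defaultdict

# Rows of A.bed as (chrom, start, end) tuples.
fragments = [
    ("chr1", 10, 20), ("chr1", 30, 50), ("chr1", 35, 43), ("chr1", 38, 49),
    ("chr1", 40, 50), ("chr10", 15, 20), ("chr10", 18, 25), ("chr10", 25, 30),
    ("chr2", 20, 60), ("chr2", 58, 70), ("chr2", 59, 75),
]


def coverage_segments(intervals):
    """Return (chrom, start, end, depth) for every stretch with depth > 0.

    Hypothetical helper for illustration; it is not part of bedtools.
    """
    # For each chromosome, record the net change in depth at each position.
    events = defaultdict(lambda: defaultdict(int))
    for chrom, start, end in intervals:
        events[chrom][start] += 1
        events[chrom][end] -= 1

    segments = []
    for chrom in sorted(events):
        depth = 0
        prev = None
        for pos in sorted(events[chrom]):
            if prev is not None and depth > 0:
                last = segments[-1] if segments else None
                if last and last[0] == chrom and last[2] == prev and last[3] == depth:
                    # Same depth as the adjacent segment: extend it.
                    segments[-1] = (chrom, last[1], pos, depth)
                else:
                    segments.append((chrom, prev, pos, depth))
            depth += events[chrom][pos]
            prev = pos
    return segments


for segment in coverage_segments(fragments):
    print(*segment)
```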
I would like to know how I can link the two tables together, since every coordinate in output2
should fall within a region of output1,
and then keep only the rows where column 4 is the same in both tables.
Final output:
chr1 10 20 1 chr1 10 20 1
chr1 30 50 4 chr1 40 43 4
chr2 20 75 3 chr2 59 60 3
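One way to read the request is: for each merged region in output1, find the output2 segments that lie inside it, and keep the pairs where the depth equals the merged count. Below is a minimal Python sketch of that join, with both tables typed in by hand. The function name `link_tables` is hypothetical, the zero-depth rows of output2 are left out because they can never match a count of 1 or more, and the chr2 region is taken as 20-75 as listed in output1.

```python
# output1: merged regions as (chrom, start, end, fragment_count).
merged = [
    ("chr1", 10, 20, 1),
    ("chr1", 30, 50, 4),
    ("chr10", 15, 30, 3),
    ("chr2", 20, 75, 3),
]

# output2: non-zero coverage segments as (chrom, start, end, depth).
coverage = [
    ("chr1", 10, 20, 1), ("chr1", 30, 35, 1), ("chr1", 35, 38, 2),
    ("chr1", 38, 40, 3), ("chr1", 40, 43, 4), ("chr1", 43, 49, 3),
    ("chr1", 49, 50, 2), ("chr10", 15, 18, 1), ("chr10", 18, 20, 2),
    ("chr10", 20, 30, 1), ("chr2", 20, 58, 1), ("chr2", 58, 59, 2),
    ("chr2", 59, 60, 3), ("chr2", 60, 70, 2), ("chr2", 70, 75, 1),
]


def link_tables(merged_regions, coverage_rows):
    """Pair each merged region with the contained segments of equal depth.

    Hypothetical helper for illustration; a quadratic scan is fine for small
    tables but would be slow on genome-scale input.
    """
    rows = []
    for m_chrom, m_start, m_end, count in merged_regions:
        for c_chrom, c_start, c_end, depth in coverage_rows:
            inside = (
                c_chrom == m_chrom and m_start <= c_start and c_end <= m_end
            )
            if inside and depth == count:
                rows.append(
                    (m_chrom, m_start, m_end, count, c_chrom, c_start, c_end, depth)
                )
    return rows


for row in link_tables(merged, coverage):
    print(*row)
```

The chr10 region drops out because three fragments were merged there but no position is covered by all three. With bedtools itself, something along the lines of `bedtools intersect -a output1.bed -b output2.bedgraph -wa -wb | awk '$4 == $8'` might give the same rows; the file names are placeholders and this command has not been checked against the data above.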