I have a fragments file ( fragments_chr.bed ) like this :
chr1 0 7506
chr1 7512 10943
chr1 10949 13169
chr1 13175 20064
chr1 20070 28272
chr1 28278 29959
chr1 29965 36598
chr1 36604 37512
chr1 37518 40359
chr1 40365 48795
chr1 48801 50660
chr1 50666 52359
chr1 52365 52850
chr1 52856 54447
chr1 54453 54521
chr1 54527 54816
chr1 54822 64009
chr1 64015 66453
chr1 66459 66724
chr1 66730 66943
chr1 66949 70664
chr1 70670 74343
chr1 74349 77084
chr1 77090 83123
chr1 83129 87901
chr1 87907 90967
and the results of my analysis ( counts.bed ) in counts like this
chr1 7506 7512 346
chr1 37537 37543 458
chr1 77108 77115 844
chr1 83121 83129 1629
chr1 83148 83155 1779
I would like to create a new file in order to collect all read counts into my fragments. By using bedtools intersect i ve noticed that i am losing some counts mainly those which are at the begging fragment of each chromosome. When i am running bedtools intersect like this
bedtools intersect -a fragments_chr.bed -b counts.bed -wa -wb > intersect1_1_w4cseq_test.txt
I have this result :
chr1 37518 40359 chr1 37537 37543 458
chr1 77090 83123 chr1 77108 77115 844
chr1 77090 83123 chr1 83121 83129 1629
chr1 83129 87901 chr1 83148 83155 1779
In that case the count in chr1 7506 7512 346
is missing.
Any suggestion how to solve it or any other way to do it ?
Thank you
Dimitris
YEs sorry for the mistake i forgot the -wa -wb i will add it . Ok how can we do it in order to appear ? Because even if this doesnt overlap we have 2 fragments in the fragments_chr.bed file with thoose coordinates.
and it should be somewhere right ? Otherwise we are loosing counts ..Do you know any other way apart from intersect in order to add my counts in those ranges of chr positions ?
bed files are zero based.@ OP
I see no way how this should work. As Yujin Kim says this two fragments have no overlap with your
counts.bed
.Take care that the bed file uses half-open intervals, which means that the start position (column 2) is included but the end position (column 3) is not.
Let's have a look at the first line of
fragments_chr.bed
:This means, the fragment includes the position 0 - 7505, but not 7506.
And the first line of
counts.bed
:This means the region start at 7506 (and so 1 position after your fragment) and includes all positions up to 7511.
fin swimmer
So this count doesnt belong anywhere ? it cant go either to chr1 7512 10943 or to chr1 0 7506. In this way we can not see with intersect the numbers of reads per fragment. Or the numbers of reads per fragment are only those which means some of your mapped reads are not included in the fragment.
in case of doubt, add
-loj
option to bedtools function above. It would print all the positions with and without overlap.