Question

Difficult filtering scenario of BED file

0

5.4 years ago

rbronste ▴ 420

I have the following type of bed file:

chr1    3466104 3466105 78.785015287889 0.230646383987888   0.7
chr1    3466544 3466545 82.3541791208803    0.175581730261755   0.7
chr1    4970447 4970448 91.0596827874371    0.362146513124993   0.49
chr1    4970842 4970843 45.4574216797264    0.36031912058372    0.49
chr1    4971428 4971429 84.8969412649955    0.354286786755231   0.49
chr1    5022730 5022731 47.3724470774805    0.382788612756936   0.19
chr1    5022861 5022862 68.2243267545723    0.185953039798782   0.19
chr1    5022921 5022922 94.7996405683824    0.175074529891139   0.19
chr1    5023027 5023028 81.3859081476251    0.181175853077686   0.19
chr1    5023102 5023103 95.0062597300414    0.137145856537167   0.19

I would like to filter on column 4 within a range, say 70-90, while keeping in the final file only those rows whose column B values are within, say, 500 bp of one another. Any ideas? Thanks!

awk sed grep • 1.0k views

1

Unix tools like awk, sed, and grep are great, and someone may come up with a solution using them. But if it takes you too long to come up with a solution, you probably need a scripting language such as Python.

ADD REPLY • link 5.4 years ago by WouterDeCoster 48k

1

What is column B? Do you mean the distance between the coordinate intervals? Please give a representative example of the expected output.

ADD REPLY • link 5.4 years ago by ATpoint 88k

0

Column B is the start coordinate, yes, but I don't want the distance between B and C to be 500. I want the distance between adjoining start coordinates in column B not to exceed 500, while column 4 is limited to some range.

ADD REPLY • link 5.4 years ago by rbronste ▴ 420
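
One possible reading of the request above, sketched in Python. This is not an answer confirmed by the original poster, and it rests on several assumptions: the file is sorted by chromosome and start (for example with `sort -k1,1 -k2,2n`), the column 4 range is inclusive, and the 500 bp test is applied after the column 4 filter, so a row is kept only if the previous or next surviving row on the same chromosome starts within 500 bp of it. The function name `filter_bed` and its parameters `lo`, `hi`, and `max_gap` are made up for illustration.

```python
def filter_bed(lines, lo=70.0, hi=90.0, max_gap=500):
    """Return rows with lo <= column 4 <= hi that have a surviving
    neighbour on the same chromosome starting within max_gap bp.

    Assumes the input is sorted by chromosome, then by start."""
    # Step 1: keep rows whose 4th column falls inside [lo, hi].
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        if lo <= float(fields[3]) <= hi:
            rows.append(fields)

    # Step 2: among the surviving rows, keep those whose start
    # (column 2) is within max_gap of the previous or next survivor.
    kept = []
    for i, row in enumerate(rows):
        chrom, start = row[0], int(row[1])
        for j in (i - 1, i + 1):
            if 0 <= j < len(rows) and rows[j][0] == chrom:
                if abs(int(rows[j][1]) - start) <= max_gap:
                    kept.append("\t".join(row))
                    break  # one close neighbour is enough
    return kept


# The ten rows from the question.
example = """\
chr1 3466104 3466105 78.785015287889 0.230646383987888 0.7
chr1 3466544 3466545 82.3541791208803 0.175581730261755 0.7
chr1 4970447 4970448 91.0596827874371 0.362146513124993 0.49
chr1 4970842 4970843 45.4574216797264 0.36031912058372 0.49
chr1 4971428 4971429 84.8969412649955 0.354286786755231 0.49
chr1 5022730 5022731 47.3724470774805 0.382788612756936 0.19
chr1 5022861 5022862 68.2243267545723 0.185953039798782 0.19
chr1 5022921 5022922 94.7996405683824 0.175074529891139 0.19
chr1 5023027 5023028 81.3859081476251 0.181175853077686 0.19
chr1 5023102 5023103 95.0062597300414 0.137145856537167 0.19
"""

result = filter_bed(example.splitlines())
for row in result:
    print(row)
```

Under these assumptions, the example data yields two rows, the ones starting at 3466104 and 3466544 (440 bp apart). The rows at 4971428 and 5023027 pass the column 4 filter but have no surviving neighbour within 500 bp. If the distance should instead be measured between adjacent rows of the unfiltered file, the two steps would need to be swapped.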