4.8 years ago
rbronste
I have the following type of BED file:
chr1 3466104 3466105 78.785015287889 0.230646383987888 0.7
chr1 3466544 3466545 82.3541791208803 0.175581730261755 0.7
chr1 4970447 4970448 91.0596827874371 0.362146513124993 0.49
chr1 4970842 4970843 45.4574216797264 0.36031912058372 0.49
chr1 4971428 4971429 84.8969412649955 0.354286786755231 0.49
chr1 5022730 5022731 47.3724470774805 0.382788612756936 0.19
chr1 5022861 5022862 68.2243267545723 0.185953039798782 0.19
chr1 5022921 5022922 94.7996405683824 0.175074529891139 0.19
chr1 5023027 5023028 81.3859081476251 0.181175853077686 0.19
chr1 5023102 5023103 95.0062597300414 0.137145856537167 0.19
I would like to filter on column #4 within a range, say 70-90, while keeping in the final file only those rows whose column B values are within, let's say, 500 bp of one another. Any ideas? Thanks!
Unix tools like awk, sed, and grep are great, and someone may come up with a solution using them. But if it takes you too long to come up with a solution, you probably need a scripting language such as Python.
What is column B? You mean the distance between the coordinate intervals? Please give a representative output example.
Column B is the start coordinate, yes, but I don't want the distance between B and C to be 500. I want the distance between the start coordinates (column B) of adjoining intervals to not exceed 500, while also restricting the rows to some range in column 4.
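One possible reading of this can be sketched in Python. This is only a sketch under several assumptions: the file is whitespace-delimited and sorted by chromosome and start; the column 4 filter is applied first; and "adjoining" means the previous or next surviving row on the same chromosome. The function name `filter_bed` and the parameters `lo`, `hi`, and `max_gap` are made up for illustration.

```python
def filter_bed(lines, lo=70.0, hi=90.0, max_gap=500):
    """Return rows whose column 4 lies in [lo, hi] and whose start is within
    max_gap bp of a neighbouring surviving row on the same chromosome.

    Assumes the input is sorted by chromosome and start coordinate.
    """
    # Step 1: keep only rows whose column 4 value is inside the range.
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        if lo <= float(fields[3]) <= hi:
            rows.append(fields)

    # Step 2: keep a row only if the previous or next surviving row is on
    # the same chromosome and starts no more than max_gap bp away.
    kept = []
    for i, fields in enumerate(rows):
        chrom, start = fields[0], int(fields[1])
        near = False
        for j in (i - 1, i + 1):
            if 0 <= j < len(rows) and rows[j][0] == chrom:
                if abs(int(rows[j][1]) - start) <= max_gap:
                    near = True
        if near:
            kept.append("\t".join(fields))
    return kept


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file names): python filter_bed.py < in.bed > out.bed
    for row in filter_bed(sys.stdin):
        print(row)
```

With the ten example rows above and the defaults (70-90, 500 bp), four rows pass the column 4 filter (starts 3466104, 3466544, 4971428, 5023027), and only the first two remain after the distance check, since they are 440 bp apart. If the distance should instead be measured between neighbouring rows before the score filter, the two steps would need to be swapped.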