Is there any tool that can group reads in a BED file that lie within a few bases of each other?
For instance,
Chr1 1023080 1023114 XYZ 806 +
Chr1 1023081 1023115 XYZ 50 +
Chr1 1023083 1023117 ABC 3 +
Chr1 1023085 1023119 cbd 90 +
I would like to group the reads that are within 4 bases of each other and report the results as follows:
Chr1 1023080 1023117 xyz 859 +
Chr1 1023085 1023119 cbd 90 +
Is it possible using bedtools or awk in a short script? The name column doesn't really matter. Any directions would be most welcome. Thanks!
I don't follow your example. 1) Why is the second record in your output distinct from the first? 2) Why are the coordinates of the first output record not
Chr1:1023080-1023117?
Yes, the first record should be 1023117; that was a typo! The last record in the file is 4 bases away, and I would like to group only the reads that fall within a four-base range.
Hope I answered your question!
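A possible starting point, written in Python rather than awk, is sketched below. It rests on several assumptions that are not stated in the thread: the file is BED6 and already sorted by chromosome and start, a read joins a group when its start is at most `max_dist` bases past the start of the group's first read, the merged end is the largest end in the group, the scores are summed, and the name and strand come from the first read. The function names `parse_bed6` and `group_by_start` are made up for this example. Anchoring on the first read's start is the interpretation that reproduces the desired output above. An overlap-based merge such as `bedtools merge` would probably collapse all four example reads into one record, since they all overlap.

```python
from typing import Iterable, Iterator, List, Tuple

# One BED6 record: chrom, start, end, name, score, strand
Bed6 = Tuple[str, int, int, str, int, str]


def parse_bed6(line: str) -> Bed6:
    """Split one whitespace-separated BED6 line into typed fields."""
    chrom, start, end, name, score, strand = line.split()[:6]
    return chrom, int(start), int(end), name, int(score), strand


def group_by_start(records: Iterable[Bed6], max_dist: int = 4) -> Iterator[Bed6]:
    """Group reads whose start is within max_dist of the group's first start.

    Assumes the records are sorted by chromosome and start.
    A change of chromosome or strand always opens a new group.
    """
    current: List = []
    for chrom, start, end, name, score, strand in records:
        same_group = (
            bool(current)
            and chrom == current[0]
            and strand == current[5]
            and start - current[1] <= max_dist
        )
        if same_group:
            current[2] = max(current[2], end)  # extend the group to the furthest end
            current[4] += score                # sum the scores (assumed aggregation)
        else:
            if current:
                yield tuple(current)
            current = [chrom, start, end, name, score, strand]
    if current:
        yield tuple(current)


# Example lines copied from the question
example = [
    "Chr1 1023080 1023114 XYZ 806 +",
    "Chr1 1023081 1023115 XYZ 50 +",
    "Chr1 1023083 1023117 ABC 3 +",
    "Chr1 1023085 1023119 cbd 90 +",
]

for rec in group_by_start(parse_bed6(line) for line in example):
    print("\t".join(str(field) for field in rec))
```

On the four example reads this yields two records, Chr1 1023080 1023117 with score 859 and Chr1 1023085 1023119 with score 90. If "within four bases" should instead be measured from the previous read rather than from the first read of the group, the comparison against `current[1]` would need to track the last start seen, and the example would then merge into a single record.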