Hi everybody, maybe this is more of a programming question, but any advice would be very handy. I'm trying to select the rows where the 4th column shows a convergence point (a - followed by a +) in this data set:
chr1 1275000 1284999 +
chr1 1285000 1294999 -
chr1 1295000 1304999 -
chr1 1385000 1394999 -
chr1 1415000 1424999 -
chr1 1425000 1434999 +
chr1 1435000 1444999 +
chr1 1715000 1724999 +
chr1 1725000 1734999 -
chr1 1735000 1744999 -
chr1 1745000 1754999 -
chr1 1795000 1804999 -
chr1 1805000 1814999 +
chr1 1815000 1824999 -
chr1 1865000 1874999 -
Expected:
chr1 1415000 1424999 -
chr1 1425000 1434999 +
chr1 1795000 1804999 -
chr1 1805000 1814999 +
I would like to use R, but I have no idea how to start. Any suggestions? Thanks!
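For reference, here is a minimal awk sketch of this kind of filter. It is not necessarily the code from the answer discussed in the comments below. It assumes the rows are saved in a whitespace-delimited file (the name data.txt and the function name convergence_rows are made up for illustration) and that "convergence" simply means a - row immediately followed by a + row, without checking that the coordinates are contiguous. It keeps the previous row in pr and the previous sign in ps, and prints both rows whenever the sign flips from - to +.

```shell
# Save the example rows to a file (data.txt is a hypothetical name).
cat > data.txt <<'EOF'
chr1 1275000 1284999 +
chr1 1285000 1294999 -
chr1 1295000 1304999 -
chr1 1385000 1394999 -
chr1 1415000 1424999 -
chr1 1425000 1434999 +
chr1 1435000 1444999 +
chr1 1715000 1724999 +
chr1 1725000 1734999 -
chr1 1735000 1744999 -
chr1 1745000 1754999 -
chr1 1795000 1804999 -
chr1 1805000 1814999 +
chr1 1815000 1824999 -
chr1 1865000 1874999 -
EOF

# Print each "-" row that is directly followed by a "+" row, plus that "+" row.
# pr = previous row, ps = previous sign (column 4).
convergence_rows() {
  awk 'ps == "-" && $4 == "+" { print pr; print }
       { pr = $0; ps = $4 }' "$1"
}

convergence_rows data.txt
```

On the example data this should print the four rows listed under "Expected" above. Because pr and ps are overwritten on every row, a + row can never be reused as the start of another pair.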
It works as expected, thank you very much. awk is very helpful for solving this kind of problem, but it takes me a while to learn...
@Lila M: Make sure to check out the updated answer. I forgot to reset pr and ps (previous row and previous sign) after printing. The updated version works correctly now.
Thank you for the update!