Hi, I have two BED files. Each has four columns, and the fourth column holds a unique ID for each row. Each file has thousands of rows. I want to combine the rows of the two files that share the same ID, so the final output should look like this:
chr1 10028 10029 chr14 68314662 68314663 J00118:253:HJ2FTBBXX:3:2213:8491:13394
bed 1:
chr1 10028 10029 J00118:253:HJ2FTBBXX:3:2213:8491:13394
...
bed 2:
chr14 68314662 68314663 J00118:253:HJ2FTBBXX:3:2213:8491:13394
...
That's really succinct!
... and the simplest I would go for. Considering that join output provides the ID in the first column, here's a minimal modification to exactly match the desired output:
You can use the formatting option of join, -o FORMAT, to achieve the same result ;-)
Good to know. Thank you Pierre.
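For anyone reading along, here is a minimal sketch of how -o can reproduce the layout requested in the question. It is only an illustration under a few assumptions: the file names (bed1.bed, bed2.bed) are hypothetical, the files are recreated from the two rows shown in the question, and the columns are separated by spaces.

```shell
# Use the same byte-wise collation for sort and join.
export LC_ALL=C

# Hypothetical sample files, rebuilt from the rows in the question.
tmp=$(mktemp -d)
printf 'chr1 10028 10029 J00118:253:HJ2FTBBXX:3:2213:8491:13394\n' > "$tmp/bed1.bed"
printf 'chr14 68314662 68314663 J00118:253:HJ2FTBBXX:3:2213:8491:13394\n' > "$tmp/bed2.bed"

# join expects both inputs to be sorted on the join field (column 4 here).
sort -k4,4 "$tmp/bed1.bed" > "$tmp/bed1.sorted"
sort -k4,4 "$tmp/bed2.bed" > "$tmp/bed2.sorted"

# -1 4 -2 4 : join on column 4 of each file.
# -o        : print columns 1-3 of file 1, columns 1-3 of file 2, then the ID.
result=$(join -1 4 -2 4 -o 1.1,1.2,1.3,2.1,2.2,2.3,1.4 \
    "$tmp/bed1.sorted" "$tmp/bed2.sorted")
echo "$result"
# prints: chr1 10028 10029 chr14 68314662 68314663 J00118:253:HJ2FTBBXX:3:2213:8491:13394

rm -rf "$tmp"
```

If the real files are tab-separated, as BED files usually are, join would also need -t with a literal tab character (for example -t "$(printf '\t')") so that the output stays tab-separated.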
Hey, what if I add a 5th column to each file and still want to join on the same 4th column while preserving the 5th column in the final result? Thanks!
Pierre's answer would still work. It'll output columns 4, 1-3 and 5 of the first file, plus 1-3 and 5 of the second file. As Pierre mentioned, you may modify the column layout using the -o option. Here's an example that may help you understand how join's output format works.
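A minimal sketch of the five-column case, comparing join's default layout with a custom -o layout. The file names and the fifth-column values (score_a, score_b) are hypothetical placeholders, and the files are assumed to be space-separated.

```shell
# Use the same byte-wise collation for sort and join.
export LC_ALL=C

tmp=$(mktemp -d)
# score_a and score_b are made-up fifth-column values for illustration only.
printf 'chr1 10028 10029 J00118:253:HJ2FTBBXX:3:2213:8491:13394 score_a\n' > "$tmp/bed1.bed"
printf 'chr14 68314662 68314663 J00118:253:HJ2FTBBXX:3:2213:8491:13394 score_b\n' > "$tmp/bed2.bed"

# Both inputs must be sorted on the join field (column 4).
sort -k4,4 "$tmp/bed1.bed" > "$tmp/bed1.sorted"
sort -k4,4 "$tmp/bed2.bed" > "$tmp/bed2.sorted"

# Default layout: the join field first, then the remaining fields of
# file 1, then the remaining fields of file 2.
default_out=$(join -1 4 -2 4 "$tmp/bed1.sorted" "$tmp/bed2.sorted")
echo "$default_out"
# prints: J00118:253:HJ2FTBBXX:3:2213:8491:13394 chr1 10028 10029 score_a chr14 68314662 68314663 score_b

# Custom layout: each entry is FILE.FIELD, and 0 stands for the join field.
custom_out=$(join -1 4 -2 4 -o 1.1,1.2,1.3,1.5,2.1,2.2,2.3,2.5,0 \
    "$tmp/bed1.sorted" "$tmp/bed2.sorted")
echo "$custom_out"
# prints: chr1 10028 10029 score_a chr14 68314662 68314663 score_b J00118:253:HJ2FTBBXX:3:2213:8491:13394

rm -rf "$tmp"
```

The order of the entries passed to -o is the order of the output columns, so any column can be moved or dropped by editing that list.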