Question

How to sort files

3.1 years ago

Nelo ▴ 20

Hi

I have two files of IDs. Both contain the same IDs, but in a different order:

test1_ID                          test2_ID
 17547                             14568
 18643                             18643
 14568                             17547
 12407                             47984
 47984                             12407

I want to sort test2_ID so that it follows the order of test1_ID, using the command line.

command sort • 1.1k views

updated 3.1 years ago by shenwei356 8.7k • written 3.1 years ago by Nelo ▴ 20

score 0 · Answer 1 · 2021-11-05

csvtk sort supports sorting by user-defined levels, so one file can supply the order for the other:

-k, --keys strings keys (multiple values supported). sort type supported, "N" for natural order, "n" for number, "u" for user-defined order and "r" for reverse. e.g., "-k 1" or "-k A:r" or "-k 1:nr -k 2" (default [1])

-L, --levels strings user-defined level file (one level per line, multiple values supported). format: <field>:<level-file>. e.g., "-k name:u -L name:level.txt"

csvtk sort -H -k 1:u -L 1:test1_ID.txt test2_ID.txt
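If csvtk is not installed, a similar user-defined order can be sketched with plain awk. This is a different technique (a two-pass lookup), not what csvtk does internally. The file names are taken from the command above; the second column of test2_ID.txt is hypothetical and is only there to show that whole rows move with their ID.

```shell
# Hypothetical sample data without headers (matching the -H flag above).
workdir=$(mktemp -d)
printf '%s\n' 17547 18643 14568 12407 47984 > "$workdir/test1_ID.txt"
printf '%s\t%s\n' 14568 a 18643 b 17547 c 47984 d 12407 e > "$workdir/test2_ID.txt"

# First file read (NR == FNR): store every row of test2 under its ID.
# Second file read: walk test1 in its own order and print the stored row.
reordered=$(awk 'NR == FNR { row[$1] = $0; next }
                 ($1 in row) { print row[$1] }' \
            "$workdir/test2_ID.txt" "$workdir/test1_ID.txt")

printf '%s\n' "$reordered"
rm -r "$workdir"
```

IDs that appear in test1 but not in test2 are silently skipped, and the whole of test2 is held in memory, so this suits small to medium files.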

score 0 · Answer 2 · 2021-11-05

In R, I would do the following.

Given two files, a and b:

a
  test1_ID value
1    17547     1
2    18643     2
3    14568     3
4    12407     4
5    47984     5
b
  test2_ID value
1    14568     2
2    18643     3
3    17547     4
4    47984     5
5    12407     6
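# Sort both data frames by ID (descending) so that their rows line up.
b <- b[order(b$test2_ID, decreasing = TRUE), ]
a <- a[order(a$test1_ID, decreasing = TRUE), ]
# To keep a's original order instead: b <- b[match(a$test1_ID, b$test2_ID), ]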

score 0 · Answer 3 · 2021-11-05

You could do it like this, but you lose the header:

join -1 2 -2 1 -t $'\t' -o 1.1,2.1 \
    <(awk 'BEGIN{OFS="\t"}{print NR,$1}' file1 | sort -t $'\t' -k2,2) \
    <(sort file2) \
    | sort -t $'\t' -k1,1g \
    | awk 'BEGIN{FS="\t"}{print $2}'

Here we add an index column to file1, join the two files on the common values, sort by the index column, and finally drop the index again.
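The same index, join, sort, strip sequence can be sketched so that the header survives, by setting the first line aside and re-attaching it at the end. This is only a sketch under assumptions: the sample contents below are copied from the question, both files are assumed to have exactly one header line, and temporary files replace the process substitution so it also runs in a plain POSIX shell.

```shell
# Hypothetical sample data: file1 gives the wanted order, file2 is reordered.
tmp=$(mktemp -d)
tab=$(printf '\t')
printf '%s\n' test1_ID 17547 18643 14568 12407 47984 > "$tmp/file1"
printf '%s\n' test2_ID 14568 18643 17547 47984 12407 > "$tmp/file2"

# Number the data rows of file1 (header skipped), then sort by ID for join.
tail -n +2 "$tmp/file1" | awk -v OFS="$tab" '{ print NR, $1 }' \
    | sort -t "$tab" -k2,2 > "$tmp/indexed"
tail -n +2 "$tmp/file2" | sort > "$tmp/sorted2"

# Join on the ID, restore file1's order via the index, then drop the index.
body=$(join -1 2 -2 1 -t "$tab" -o 1.1,2.1 "$tmp/indexed" "$tmp/sorted2" \
       | sort -t "$tab" -k1,1n \
       | cut -f2)

# Put file2's header back in front of the reordered rows.
reordered=$(head -n 1 "$tmp/file2"; printf '%s\n' "$body")
printf '%s\n' "$reordered"
rm -r "$tmp"
```

The index is a plain integer, so `sort -n` is enough here; the `g` key used above is a GNU extension for general numeric values.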