Question

How to arrange column one below another from side by side in a matrix?

0

Entering edit mode

4.5 years ago

sankadinesh ▴ 20

I have a distance matrix file (Excel format) like this. I want to arrange columns to one below another from side by side. I have attached example file. I will apply your suggestions to larger dataset.

https://drive.google.com/file/d/1Lyms3Vpe9mVwLuIqk50JaS3wh3tACN97/view?usp=sharing

Please give your inputs and suggestions. Thanks in advance

Regards, Dinesh S L

sequence gene sequencing R • 1.2k views

ADD COMMENT • link updated 4.5 years ago by rpolicastro 13k • written 4.5 years ago by sankadinesh ▴ 20

0

Entering edit mode

Use awk, not R. Save the file as a TSV and work on that. For each column, save that column's header in a variable and for each non blank row in that column, print the header-variable followed by the "cell" content.

Try the above yourself and let us know if you have specific questions.

ADD REPLY • link 4.5 years ago by Ram 44k

score 1 · Answer 1 · 2020-08-22

1

Entering edit mode

4.5 years ago

Tm ★ 1.1k

Assuming you have tsv file.

This should work for file with no header in all columns:

awk  'BEGIN {ORS=""} { for (i=1; i<=NF ; i++) {my_dict[i]=my_dict[i]$i"\n"}} END { for (k in my_dict) {print my_dict[k]}}' matrix.txt

However, if there is header in all columns and you want to skip it then you can use it with small modification:

awk 'BEGIN {ORS=""} NR>1 { for (i=1; i<=NF ; i++) {my_dict[i]=my_dict[i]$i "\n"}} END { for (k in my_dict) print my_dict[k] }' matrix.txt

It basically loop over the array's keys and extract the corresponding values.

ADD COMMENT • link 4.5 years ago by Tm ★ 1.1k

0

Entering edit mode

Your solution worked like charm. However the code starts to read from column 23 instead of 1. I have to cut and paste to get it in right order. Thanks a lot Dinesh

ADD REPLY • link 4.5 years ago by sankadinesh ▴ 20

0

Entering edit mode

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one answer if they all work.

Upvote|Bookmark|Accept

ADD REPLY • link 4.5 years ago by Ram 44k

score 1 · Answer 2 · 2020-08-22

Using awk is definitely the way to go if this is all you need to do, but I figured I would post a little R solution since you asked, and in case you or anyone wanted it.

Example data.

df <- structure(list(X1 = c(34L, 2L, 88L, 90L, 52L), X2 = c(NA, 74L, 
80L, 67L, 44L), X3 = c(NA, NA, 65L, 46L, 73L), X4 = c(NA, NA, 
NA, 73L, 71L), X5 = c(NA, NA, NA, NA, 50L)), class = "data.frame", row.names = c(NA, 
-5L))

> df
  X1 X2 X3 X4 X5
1 34 NA NA NA NA
2  2 74 NA NA NA
3 88 80 65 NA NA
4 90 67 46 73 NA
5 52 44 73 71 50

tidyverse solution

library("tidyverse")
library("readxl")

# To read in your excel file.
# df <- read_excel("path/to/file")

df <- df %>%
  pivot_longer(everything(), names_to="column", values_to="value") %>%
  drop_na %>%
  arrange(column)

You should then have a data.frame from wide to long format.

> df
# A tibble: 15 x 2
   column   val
   <chr>  <int>
 1 X1        34
 2 X1         2
 3 X1        88
 4 X1        90
 5 X1        52
 6 X2        74
 7 X2        80
 8 X2        67
 9 X2        44
10 X3        65
11 X3        46
12 X3        73
13 X4        73
14 X4        71
15 X5        50