Transpose the rows of two files into columns and paste them alternately into a new file using shell scripting
2.5 years ago
Sarah • 0

I have 2 files, each with 10144 rows and 105 columns. They are two matrices of sequencing TPM values for 105 samples, with gene names in the first column: 10144 genes (rows) by 105 samples (columns).

I want the new file to have 105 rows (the columns transposed into rows), with the genes as columns. The same gene from both files should be placed alternately (side by side), separated by tabs, giving 20288 columns. For example:

A2ml1(file1)\tA2ml1(file2)\tA3galt2(file1)\tA3galt2(file2)...

Please help

UNIX shell linux • 776 views
2.5 years ago

input:

$ tail -n+1 *.txt
==> file1.txt <==
gene1   2   4
gene2   3   5
gene14  9   10

==> file12.txt <==
gene2   3   5
gene1   2   4
gene14  9   10

==> file2.txt <==
gene1   6   8
gene14  7   8
gene2   5   7

output:

$ awk '{$1=$1" "FILENAME}1' *.txt  | sort -k1,1V -k2,2V | awk '{$1=$1"_("$2")";$2=""}1' | rs -T

gene1_(file1.txt)    gene1_(file2.txt)    gene1_(file12.txt)   gene2_(file1.txt)    gene2_(file2.txt)    gene2_(file12.txt)   gene14_(file1.txt)   gene14_(file2.txt)   gene14_(file12.txt)
2                    6                    2                    3                    5                    3                    9                    7                    9
4                    8                    4                    5                    7                    5                    10                   8                    10
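
In case rs is not installed (it is a BSD utility and may be missing on some Linux systems), the same idea can be sketched with only awk and sort. This is a rough sketch, not a tested solution for the real data: it assumes tab-separated input, gene names in column 1, no header row, and unique gene names within each file. The function name interleave_transpose is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: tag each row with its source file, sort by gene and
# then by file, and transpose so samples become rows and gene(file) labels
# become columns.
# Assumptions: tab-separated input, gene names in column 1, no header row.
interleave_transpose() {
    # Step 1: insert the file name as a second field after the gene name.
    awk -F'\t' -v OFS='\t' '{ gene = $1; $1 = gene OFS FILENAME; print }' "$1" "$2" |
    # Step 2: sort by gene, then by file name (byte order, locale-independent).
    LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2 |
    # Step 3: build the gene(file) label and transpose the whole matrix in memory.
    awk -F'\t' '
        {
            cell[NR, 1] = $1 "(" $2 ")"
            for (i = 3; i <= NF; i++) cell[NR, i - 1] = $i
            if (NF - 1 > nf) nf = NF - 1
        }
        END {
            for (i = 1; i <= nf; i++) {
                line = cell[1, i]
                for (r = 2; r <= NR; r++) line = line "\t" cell[r, i]
                print line
            }
        }'
}
```

Usage would be something like `interleave_transpose file1.txt file2.txt > merged.txt`. The transpose step holds the whole matrix in memory, which should be fine for roughly 20288 x 105 values.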
2.5 years ago

Use datamash transpose, together with sort and join.
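
One way this suggestion might be put together is sketched below; it is a guess at the intended pipeline, not a confirmed recipe. It assumes tab-separated files, gene names in column 1, no header row, the same genes in both files, and the same number of sample columns in each. The function name join_interleave_transpose is hypothetical, and an awk transpose is used as a fallback when datamash is not installed.

```shell
#!/bin/sh
# Sketch: sort both files by gene, join them, split each joined line back
# into one row per file, then transpose.
# Assumptions: tab-separated input, gene names in column 1, no header row,
# equal numbers of sample columns in both files.
join_interleave_transpose() {
    tab=$(printf '\t')
    a_sorted=$(mktemp) && b_sorted=$(mktemp) || return 1
    # join needs both inputs sorted on the join field in the same collation.
    LC_ALL=C sort -t "$tab" -k1,1 "$1" > "$a_sorted"
    LC_ALL=C sort -t "$tab" -k1,1 "$2" > "$b_sorted"
    LC_ALL=C join -t "$tab" "$a_sorted" "$b_sorted" |
    # Each joined line is: gene, values from file 1, values from file 2.
    # Emit two rows per gene so the two files end up in adjacent columns.
    awk -F'\t' -v a="$1" -v b="$2" '{
        n = (NF - 1) / 2
        row_a = $1 "(" a ")"; row_b = $1 "(" b ")"
        for (i = 2; i <= n + 1; i++) {
            row_a = row_a "\t" $i
            row_b = row_b "\t" $(i + n)
        }
        print row_a; print row_b
    }' |
    if command -v datamash >/dev/null 2>&1; then
        datamash transpose
    else
        # Fallback transpose in awk when datamash is unavailable.
        awk -F'\t' '
            { for (i = 1; i <= NF; i++) cell[NR, i] = $i; if (NF > nf) nf = NF }
            END {
                for (i = 1; i <= nf; i++) {
                    line = cell[1, i]
                    for (r = 2; r <= NR; r++) line = line "\t" cell[r, i]
                    print line
                }
            }'
    fi
    rm -f "$a_sorted" "$b_sorted"
}
```

Note that join silently drops genes present in only one file, so it may be worth comparing the number of output columns against the expected 20288.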
