I have two text files containing the abundances of genes in samples. One of the files measures a greater variety of genes than the other, so the two cannot simply be concatenated side by side. Instead, I'm trying to join each line of the smaller file with the line of the larger file that has the identical gene index, dropping genes that appear only in the larger file, such that:
>df1
Sample1 Sample2 Sample3
Gene1 0.001 0.002 0.003
Gene2 0.001 0.002 0.003
Gene3 0.001 0.002 0.003
>df2
Sample4 Sample5 Sample6
Gene1 0.001 0.002 0.003
Gene1.1 0.001 0.002 0.003
Gene2 0.001 0.002 0.003
Gene2.1 0.001 0.002 0.003
Gene3 0.001 0.002 0.003
>df1and2
Sample1 Sample2 Sample3 Sample4 Sample5 Sample6
Gene1 0.001 0.002 0.003 0.001 0.002 0.003
Gene2 0.001 0.002 0.003 0.001 0.002 0.003
Gene3 0.001 0.002 0.003 0.001 0.002 0.003
Suggestions in Python or Bash are both welcome. Thank you!
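One possible approach is sketched below, using only the Python standard library. It assumes both files are whitespace-delimited, that the header line lists only sample names (with no label above the gene column), and that gene names are unique within each file. The helper names `read_table` and `join_on_gene` are hypothetical, and the example tables are inlined as strings rather than read from disk.

```python
import io

def read_table(handle):
    """Return (sample_names, {gene: values}) from a whitespace-delimited table.

    Assumes the first line holds sample names only and every later
    line starts with the gene name.
    """
    header = next(handle).split()
    rows = {}
    for line in handle:
        fields = line.split()
        if fields:  # skip blank lines
            rows[fields[0]] = fields[1:]
    return header, rows

def join_on_gene(small, large):
    """Keep genes present in both tables, in the order of the smaller table."""
    small_header, small_rows = small
    large_header, large_rows = large
    merged = [small_header + large_header]
    for gene, values in small_rows.items():
        if gene in large_rows:  # genes found only in one table are dropped
            merged.append([gene] + values + large_rows[gene])
    return merged

# Inline copies of the example tables from the question.
df1_text = """Sample1 Sample2 Sample3
Gene1 0.001 0.002 0.003
Gene2 0.001 0.002 0.003
Gene3 0.001 0.002 0.003
"""
df2_text = """Sample4 Sample5 Sample6
Gene1 0.001 0.002 0.003
Gene1.1 0.001 0.002 0.003
Gene2 0.001 0.002 0.003
Gene2.1 0.001 0.002 0.003
Gene3 0.001 0.002 0.003
"""

df1 = read_table(io.StringIO(df1_text))
df2 = read_table(io.StringIO(df2_text))
merged = join_on_gene(df1, df2)

for row in merged:
    print(" ".join(row))
```

For real files, `io.StringIO(...)` would be replaced by `open(path)` with the actual file names. If pandas is available, loading both tables with the gene column as the index and calling `df1.join(df2, how="inner")` should give the same kind of result.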
I've removed tags such as genetics, genes and bioinformatics. The last tag makes no sense - EVERY QUESTION here is related to bioinformatics.