3.2 years ago
munaj86
Hi,
I have two files that I would like to merge on a shared key column, which is the first column in both files. I only want to print the lines whose key appears in both files, with each match merged onto a single line.
The first file:
tss log2fc prim recu
chr1:11869 0 0 0
chr1:12010 0 0 0
chr1:29570 0.24435624 0.79209 0.940127
The second file:
<PromoterChr>:1000 <PromoterChr> <PromoterStart> <PromoterEnd> <GeneID> <TranscriptID>
chr1:11869 chr1 10869 12869 ENSG00000223972.5 ENST00000456328.2
chr1:12010 chr1 11010 13010 ENSG00000223972.5 ENST00000450305.2
chr1:29570 chr1 28570 30570 ENSG00000227232.5 ENST00000488147.1
The desired output:
<PromoterChr>:1000 <PromoterChr> <PromoterStart> <PromoterEnd> <GeneID> <TranscriptID> tss log2fc prim recu
chr1:11869 chr1 10869 12869 ENSG00000223972.5 ENST00000456328.2 chr1:11869 0 0 0
chr1:12010 chr1 11010 13010 ENSG00000223972.5 ENST00000450305.2 chr1:12010 0 0 0
I've seen various posts with join commands that do this, but they print everything without discarding the unmatched lines. I want to discard the unmatched lines and merge only the lines whose first column matches.
Any suggestions?
Thanks,
The posted expected output is incorrect given your joining criteria. There are several tools that do the required job. Refer to the tsv-join or csvjoin functions from their respective packages. If you want only selected columns in the output, try this:
Is this applicable if my files are .txt files?
Yes, if they are exactly in the same format as the data you posted.
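For anyone who would rather script this than install a join tool, here is a minimal Python sketch of the same inner join. It assumes both files are whitespace-delimited, have a header line, and carry the key in the first column. The function name `inner_join` is made up for illustration and does not come from any of the tools mentioned above. Note that chr1:29570 is present in both posted files, so it is kept in the result.

```python
def inner_join(left_lines, right_lines):
    """Join two whitespace-delimited tables on their first column.

    Only keys present in both tables are kept (an inner join).
    The first non-empty line of each table is treated as its header.
    """
    left_rows = [line.split() for line in left_lines if line.strip()]
    right_rows = [line.split() for line in right_lines if line.strip()]

    # Index the right table by key. If a key repeats, the last row wins
    # (an assumption of this sketch; the posted data has unique keys).
    right_by_key = {row[0]: row for row in right_rows[1:]}

    merged = [left_rows[0] + right_rows[0]]  # combined header
    for row in left_rows[1:]:
        match = right_by_key.get(row[0])
        if match is not None:  # unmatched keys are dropped
            merged.append(row + match)
    return merged


if __name__ == "__main__":
    # Rows copied from the post: annotation file first, values second.
    annotation = """\
<PromoterChr>:1000 <PromoterChr> <PromoterStart> <PromoterEnd> <GeneID> <TranscriptID>
chr1:11869 chr1 10869 12869 ENSG00000223972.5 ENST00000456328.2
chr1:12010 chr1 11010 13010 ENSG00000223972.5 ENST00000450305.2
chr1:29570 chr1 28570 30570 ENSG00000227232.5 ENST00000488147.1
"""
    values = """\
tss log2fc prim recu
chr1:11869 0 0 0
chr1:12010 0 0 0
chr1:29570 0.24435624 0.79209 0.940127
"""
    for row in inner_join(annotation.splitlines(), values.splitlines()):
        print("\t".join(row))
```

To run it on real files, pass open file handles instead of the inline strings, for example `inner_join(open("annotation.txt"), open("values.txt"))`, where both file names are placeholders. Plain .txt files work as long as the columns are separated by spaces or tabs.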