Paste columns into one line on the command line
8.0 years ago
newbiebio ▴ 80

I have a very large txt file about genes. It contains columns for gene name, Entrez gene ID, chromosome, start, end, and sequence. I want to join the gene name, Entrez ID, chromosome, start, and end columns into a single header, with the sequence on a new line. Basically, I want to convert the txt file to FASTA format with a Linux command, e.g.:

>SAMD11_148398_chr1_879534_879961
GGTTGC

I tried an online converter, but my file is too large for it, so a command-line solution would be better.

Linux • 2.1k views

Did you try anything? If so, please show us; people here can help correct your code. If not, try awk.

I tried some online tools, but they failed. I was thinking of using awk, and Pierre just gave me exactly what I wanted.

I would suggest adding an example of the input and output lines, so that people can more easily figure out what you are expecting :)

Thanks to both of you. Very efficient.

Please use ADD REPLY/ADD COMMENT when responding to existing posts to keep threads logically organized.

Remember to accept one (or more) answers as correct (use the check-mark symbol against the answer).

8.0 years ago
awk '{printf(">%s_%s_%s_%s_%s\n%s\n",$1,$2,$3,$4,$5,$6);' input.txt > out.txt

Thank you, Pierre. I used the code. It works.

Pierre, isn't a closing } required?

I wonder how it worked for the OP without closing the {.
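
The action block does need its closing brace: as posted, awk implementations should generally stop with a syntax error, so the OP presumably added it when running the command. Here is a minimal corrected sketch. It assumes whitespace-separated columns in the order name, Entrez ID, chromosome, start, end, sequence, and no header row. The sample line and file names are made up for illustration.

```shell
# Hypothetical one-line sample in the six-column layout described in the question
printf 'SAMD11 148398 chr1 879534 879961 GGTTGC\n' > input.txt

# Join columns 1-5 with underscores as the FASTA header,
# then print column 6 (the sequence) on its own line
awk '{printf(">%s_%s_%s_%s_%s\n%s\n", $1, $2, $3, $4, $5, $6)}' input.txt > out.txt

cat out.txt
```

If the file is tab-delimited and a field could contain spaces, adding -F'\t' to the awk call should make it split on tabs only.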

8.0 years ago
Tao ▴ 540

If your input format is like ">SAMD11_148398_chr1_879534_879961 GGTTGC", then use the following command:

awk '{print $1"\n"$2}' input_file > output_file

But if your input is like "SAMD11 148398 chr1 879534 879961 GGTTGC" and you want to convert it to two lines:

>SAMD11_148398_chr1_879534_879961
GGTTGC

Use Pierre's answer!
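
To make the first case concrete, here is a small sketch that runs the two-field command on a one-line sample. The file names are placeholders, and it assumes there is no space between > and the gene name. With a space there, > alone would become $1 and the fields would shift by one.

```shell
# Hypothetical sample: header and sequence already on one line, separated by a space
printf '>SAMD11_148398_chr1_879534_879961 GGTTGC\n' > joined.txt

# Print the header ($1) and the sequence ($2) on separate lines
awk '{print $1"\n"$2}' joined.txt > joined.fa

cat joined.fa
```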

I used Pierre's answer. It works. Thank you also.
