Sequence on a separate line

0

7.7 years ago

Bulbul Ahmed ▴ 20

I have a FASTA file in this format (header and sequence on one line):

>accession1     GGGGAGCTACGGCAGCGGCGGCGGGGTGCTGCCGCTGGCGTCGCTTAA
>accession2     TTCCGGTAGAAAATCCATTATTGCCAATGGAAGAAGTGA

How can I convert it into the format below (sequence on a separate line), using a Perl script or any other way?

>accession1     
GGGGAGCTACGGCAGCGGCGGCGGGGTGCTGCCGCTGGCGTCGCTTAA
>accession2     
TTCCGGTAGAAAATCCATTATTGCCAATGGAAGAAGTGA

RNA-Seq Perl • 2.9k views

ADD COMMENT • link updated 7.7 years ago by GenoMax 150k • written 7.7 years ago by Bulbul Ahmed ▴ 20

1

Substitute the tab or space with a newline; use Unix tr.

ADD REPLY • link 7.7 years ago by Ashutosh Pandey 12k

0

Which command should I use on Red Hat?

ADD REPLY • link 7.7 years ago by Bulbul Ahmed ▴ 20

2

cat yourinput | tr '\t' '\n' > youroutput.fa

Although we can't see which whitespace is between your accession identifier and the actual sequence.
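
To see which whitespace is actually there, and to cover the case where it is spaces rather than a tab, something like the following sketch may help. The sample sequences and the file names (input.txt, output.fa, output_spaces.fa) are made up for illustration, and the tr -s variant assumes that collapsing any blank lines in the file is acceptable.

```shell
# Hypothetical sample input: header and sequence separated by one tab.
workdir=$(mktemp -d)
printf '>accession1\tGGGGAGCT\n>accession2\tTTCCGGTA\n' > "$workdir/input.txt"

# Make the separator visible: od -c prints a tab as \t and a space as a blank.
head -n 1 "$workdir/input.txt" | od -c

# Tab-only case, as in the command above (reading the file directly, no cat).
tr '\t' '\n' < "$workdir/input.txt" > "$workdir/output.fa"

# If the separator turns out to be spaces (or a mix), translate both spaces
# and tabs to newlines, then squeeze each run of newlines into a single one.
printf '>accession1     GGGGAGCT\n' | tr -s ' \t' '\n\n' > "$workdir/output_spaces.fa"

cat "$workdir/output.fa" "$workdir/output_spaces.fa"
```

Note that tr replaces every tab (or space) in the file, so this only fits if the header itself contains no whitespace.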

ADD REPLY • link 7.7 years ago by WouterDeCoster 47k

0

Thank you so much, sir. I will try this; hopefully it will work.

ADD REPLY • link 7.7 years ago by Bulbul Ahmed ▴ 20

0

Maybe sed -r 's#\s+#\n#' input >output then?
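
A small sketch of how that behaves, assuming GNU sed (as shipped on Red Hat) and a made-up two-record input; the file names are placeholders. It uses [[:space:]]+, the POSIX spelling of \s+, and relies on the fact that without a g flag only the first run of whitespace on each line is replaced.

```shell
workdir=$(mktemp -d)
# Hypothetical input: several spaces between the identifier and the sequence.
printf '>accession1     GGGGAGCT\n>accession2     TTCCGGTA\n' > "$workdir/input.txt"

# -E enables extended regular expressions (GNU sed also accepts -r).
# \n in the replacement is supported by GNU sed; other implementations may differ.
# No trailing g flag, so only the first run of whitespace per line is replaced.
sed -E 's/[[:space:]]+/\n/' "$workdir/input.txt" > "$workdir/output.fa"

cat "$workdir/output.fa"
```

Because the pattern matches a whole run of whitespace, this works whether the separator is one tab or several spaces.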

ADD REPLY • link 7.7 years ago by Ram 45k

0

Bah, I prefer:

sed -r 's|\s+|\n|' input >output

ADD REPLY • link 7.7 years ago by WouterDeCoster 47k

0

So, a different delimiter?

ADD REPLY • link 7.7 years ago by Ram 45k

0

Exactly ;-)

[just some slight Friday night trolling]

ADD REPLY • link 7.7 years ago by WouterDeCoster 47k

0

Strictly speaking, this is not really bioinformatics.

ADD REPLY • link 7.7 years ago by Ram 45k

2

I don't know... it seems like an awful lot of bioinformatics is just reformatting text files :)

Personally, in this case, I would copy and paste into Notepad++, which allows search/replace of \t with \n. But then I had never seen "tr" before, so I learned something from the thread!

ADD REPLY • link 7.7 years ago by Brian Bushnell 20k

1

tr is good, but I use it more for squeezing consecutive whitespace (tr -s) or for quick deletion (tr -d) than for replacement. I prefer sed for all replace operations, as it gives finer-grained control.
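
For illustration, a rough sketch of those three uses on made-up strings; the variable names are hypothetical, and the sed line is just one example of the kind of per-line control meant here.

```shell
# Squeeze each run of spaces into a single space.
squeezed=$(printf 'a    b   c\n' | tr -s ' ')
echo "$squeezed"

# Delete every space outright.
deleted=$(printf 'A C G T\n' | tr -d ' ')
echo "$deleted"

# sed can limit a replacement to selected lines: here spaces are removed
# only on lines that do not start with ">", so the header is left alone.
cleaned=$(printf '>acc 1\nAC GT\n' | sed '/^>/!s/ //g')
echo "$cleaned"
```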

ADD REPLY • link 7.7 years ago by Ram 45k

0

I have a FASTA file in this format (header and sequence on one line):

Then it's not a FASTA. While it's not a bioinformatics question per se, the OP is at least using a file with sequence information.