5.3 years ago
Angie11
Hello, does anyone know of a tool to convert a large .txt file (~850MB) into a FASTA file?
I converted a FASTA sequence file into a .txt file using fasta_formatter from the FASTX tools on Linux, so that I could combine the accession numbers with descriptions from another .txt file using a join command. The result is now ~850MB and I need to convert it back to a FASTA file.
Thank you, Angie
Can you give an example of what the file looks like?
Formatted by @RamRS:
Original content:
Bacteroides genus root|cellular organisms|Bacteria|Bacteroidetes/Chlorobi group|Bacteroidetes|Bacteroidia|Bacteroidales|Bacteroidaceae|Bacteroides no rank|no rank|superkingdom|superphylum|phylum|class|order|family|genus10_GL0085768 [gene] locus=scaffold30815_5:6223:7791:- [Complete] codon-table.11MNSEIERRRTFAIIAHPDAGKTSLTEKLLLFGGQIQVAGAVKSNKIKKTATSDWMDIEKQRGISVTTSVMEFDYNDYKINILDTPGHQDFAEDTYRTLTAVDSVIIVVDGAKGVETQTRKLMEVCRMRNTPVIIFVNKMDREAKDPFDLLDELEEELIINVRPLTWPIESGPRFKGVYNLYEHKLNLFQPSKQVVTEKVEVDINTEELDNQIGAPLAEKLRGELELVDGVYPEFNVEEYLKGEMAPVFFGSALNNFGVQELLDTFVEIAPSPRPTKTEEREVEPDEPKFTGFVFKITANIDPNHRSCIAFCKICSGKFSRNTPYYHVRHDKTMRFSSPTQFMAQRKTTVDEAWAGDIIGLPDNGTFKIGDTLTEGEKLHFRGIPSFSPEMFKYIENADPMKQKQLAKGIDQLMDEGVAQLFINQFNGRKIIGTVGQLQFEVIQYRLENEYNAKCRWEPISLYKACWVESDDPEELEAFKKRKYQYMAKDREGRDVFLADSNYVLQMAQMDFKHIKFHFTSEF 1/1Bacteroides vulgatus species root|cellular organisms|Bacteria|Bacteroidetes/Chlorobi group|Bacteroidetes|Bacteroidia|Bacteroidales|Bacteroidaceae|Bacteroides|Bacteroides vulgatus no rank|no rank|superkingdom|superphylum|phylum|class|order|family|genus|species10_GL0085769 [gene] locus=scaffold30815_5:7798:8655:- [Complete] codon-table.11MKNILVTGANGQLGNEMRVLSAEYKEYTCFFTDVAELDICDEQAVMTFVKENNIHVIVNCAAYTAVDKAEDDIELCTKLNKNAVSYLAKAAEANWGEFIQISTDYVFDGTKHLPYNEGDVPCPNSVYGKTKLAGETNALEYCKKTMIIRTAWLYSTFGNNFVKTMLRLGKEKETLGVVFDQIGTPTYARDLARAIFTAIYKGVVPGVYHFSDEGVCSWYDFTKAIHRIAGITTCKVSPLHTNEYPAKAPRPHYSVLDKTKIKTTYNIEIPHWEESLEACIKELNA
@RamRS, What command did you use to reformat this? Thank you!
The code option in the formatting bar (the 10101 button).
I see, thank you. I was wondering if there is a command in Linux to reformat a file that is in the messy format above into 2 lines per record, like RamRS did? Essentially, do what the 10101 button does, but through the command line, on a large txt file.
input:
output:
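One possible way to go from the joined text file back to FASTA is a short streaming script, so the ~850MB file never has to sit in memory. This is only a sketch under assumptions the thread does not confirm: that each record is a single line, that fields are tab-separated, and that the sequence is the last field. The function name `tab_to_fasta` and the parameters `sep`, `seq_col` and `width` are made up for illustration; change `sep` and `seq_col` to match what your join command actually produced.

```python
import io


def tab_to_fasta(src, dst, sep="\t", seq_col=-1, width=60):
    """Read one delimited record per line from src and write FASTA to dst.

    Assumptions (not confirmed by the thread): fields are separated by
    `sep`, and the sequence sits in column `seq_col` (default: last).
    All remaining fields are joined with spaces to form the header.
    Returns the number of records written.
    """
    count = 0
    for lineno, line in enumerate(src, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        fields = line.split(sep)
        if len(fields) < 2:
            raise ValueError(f"line {lineno}: expected at least 2 fields")
        seq = fields.pop(seq_col)
        header = " ".join(f for f in fields if f)
        dst.write(f">{header}\n")
        # Wrap the sequence at `width` characters per line.
        for i in range(0, len(seq), width):
            dst.write(seq[i:i + width] + "\n")
        count += 1
    return count


# Small in-memory demo with made-up records.
demo_in = io.StringIO("id1\tdesc one\tMNSEIE\nid2\tdesc two\tMKNILV\n")
demo_out = io.StringIO()
tab_to_fasta(demo_in, demo_out, width=4)
print(demo_out.getvalue(), end="")
```

For the real file, open the input and output with `open(path)` and `open(path, "w")` and pass those two file objects instead of the `StringIO` ones. Because the loop reads line by line, memory use stays small regardless of file size.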