How to split a multi-FASTA file into multiple files
5.1 years ago
praasu ▴ 40

Hi,

I would like to split a multi-FASTA file into separate files, one per sequence, where each file is named after the sequence header without the ">" symbol.

For example:

>contig1
ATACTCTAATTATTA
>contig2
ATACTCTAATTATTA
>contig3
ATACTCTAATTATTA
>contig4
ATACTCTAATTATTA

The files should be named contig1, contig2, contig3, and so on.

The number of contigs is approximately 20,000.

I found a couple of awk scripts, but they don't seem to work. I apologize if this question is redundant.

Thank you

R sequence RNA-Seq Assembly genome • 910 views
5.1 years ago
ATpoint 85k

There you go:

cat test.fa 
>contig1
ATACTCTAATTATTA
>contig2
ATACTCTAATTATTA
>contig3
ATACTCTAATTATTA
>contig4
ATACTCTAATTATTA

## basically as in: https://stackoverflow.com/questions/11818495/split-a-fasta-file-and-rename-on-the-basis-of-first-line
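## close() each finished file so awk does not hit the open-file limit with ~20,000 contigs
awk '/^>/ {if (OUT) close(OUT); OUT=substr($0,2) ".fa"}; OUT {print >OUT}' test.fa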

ls
contig1.fa  contig2.fa  contig3.fa  contig4.fa  test.fa
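
If awk is not an option, the same idea can be sketched in Python. This is only a minimal sketch, not part of the original answer: the function name split_fasta and its parameters out_dir and suffix are made up for illustration. It assumes the record ID is the first whitespace-separated word of the header, replaces "/" in the ID so it stays a valid file name, and keeps only one output file open at a time, which matters with ~20,000 contigs. Records with duplicate IDs would overwrite each other.

```python
from pathlib import Path


def split_fasta(fasta_path, out_dir=".", suffix=".fa"):
    """Write each record of a multi-FASTA file to its own file.

    Hypothetical helper: returns the created paths in input order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    out = None
    try:
        with open(fasta_path) as handle:
            for line in handle:
                if line.startswith(">"):
                    # Close the previous record's file before opening the next,
                    # so only one output file is open at any time.
                    if out is not None:
                        out.close()
                    fields = line[1:].split()
                    if not fields:
                        raise ValueError("FASTA header without an ID")
                    # Assumption: the first word of the header is the ID.
                    name = fields[0].replace("/", "_")
                    path = out_dir / f"{name}{suffix}"
                    out = open(path, "w")
                    written.append(path)
                # Lines before the first header are skipped (out is None).
                if out is not None:
                    out.write(line)
    finally:
        if out is not None:
            out.close()
    return written
```

Calling split_fasta("test.fa") on the example above should create contig1.fa to contig4.fa in the current directory; pass suffix="" to get file names that match the headers exactly.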