multiple fasta to single fasta
3.2 years ago
setschmann ▴ 10

I have a huge reference genome with a lot of contigs. It looks something like this:

>aalba5_s00000010
TTGTCTGCTTCACAGTACAGCTAGAAAATTATGAATTCATTTCCCCACATCAAGCAACCCCTGCTTATTC
>aalba5_s00000011
ACTTGGAATGGGATCTTGTTGGGGGGCCAACAGAACCATAAGGGCAATGGCTGCAATCTTTGATAAGATC
>aalba5_s00000012
TGTAGCAAACAGCTACGGAAAAATTTTAAAAATTTTCGAAATTTAAATCTGGGGTTCCCTTTCCTGTGTA
GATGTATTCCCTTTTTAAAGGTTTTCCTAGGACTTGCAGTCATTAATGAGACGTCTTCTCATGATATCCT
AATTTTTGGAAGATGCCTCCTACATCAGGAATCTTTGCTGCCACTTGTCTCTTTCATCAGCCAGATGTCT

How can I split this so that each contig gets its own file, named after the contig (for example aalba5_s00000010.fa) and containing its sequence?

Thanks for the help.

fasta reference visualisation genome • 1.1k views

You can try the Python code below on your file:

from bioinfokit.analys import Fasta
Fasta.split_fasta(file='seq.fasta', n=3)

Replace n with the number of sequences in your file. Read more here: https://www.reneshbedre.com/blog/filereaders.html#split-fasta-file-into-multiple-fasta-files
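
If you need each output file named after its contig, as the question asks, a plain-Python sketch with no extra dependencies could look like the one below. The function name split_fasta_by_record and the output directory argument are my own choices for illustration, not part of any library. It assumes every contig ID (the header text up to the first whitespace) is unique and safe to use as a file name, for example that it contains no slashes.

```python
from pathlib import Path


def split_fasta_by_record(fasta_path, out_dir="."):
    """Write each record of a multi-FASTA file to <record_id>.fa in out_dir.

    Hypothetical helper: assumes record IDs are unique and valid file names.
    Returns the list of paths written, in input order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    out = None  # handle of the contig file currently being written
    try:
        with open(fasta_path) as handle:
            for line in handle:
                if line.startswith(">"):
                    # A new header closes the previous contig file.
                    if out is not None:
                        out.close()
                    fields = line[1:].split()
                    if not fields:
                        raise ValueError("FASTA header without an ID")
                    # Record ID = header text up to the first whitespace.
                    target = out_dir / f"{fields[0]}.fa"
                    out = open(target, "w")
                    written.append(target)
                    out.write(line.rstrip() + "\n")
                elif out is not None and line.strip():
                    # Sequence line: drop stray whitespace, keep line wrapping.
                    out.write(line.strip() + "\n")
    finally:
        if out is not None:
            out.close()
    return written
```

Calling split_fasta_by_record("reference.fa", "contigs") (both names are placeholders) would then write contigs/aalba5_s00000010.fa, contigs/aalba5_s00000011.fa and so on. Only one output file is open at a time, so the number of contigs should not run into open-file limits.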
