Hello, I'm a molecular biology undergraduate interested in bioinformatics.
I started a project for which I need as many HIV-1 sequences as I can get. I downloaded all the sequences available from HIV.lanl, but they come as a single huge multi-FASTA file, and I need each sequence in a separate file for my analysis. Downloading the sequences one by one would take forever (there are about 12K of them). I tried the EMBOSS split tool, but uploading 400 MB files crashes every time at around 10%. I searched for manual ways to do it, but with my little-to-no programming skill (yet) I couldn't figure it out on my own.
I came across this Python script, https://github.com/ramsainanduri/split_multi_fasta, but despite playing with it for a few days I couldn't figure out how to edit it so that it takes my HIV1db.fasta as input and splits it. (I'm a Windows user, I have installed Python, and I know how to open the script in the shell.) I have checked the similar posts on the forum too, but none of them breaks it down for someone with little or no programming background.
If someone could take the time and effort to explain it as if to a complete beginner, I would really appreciate it, especially the part about how to set the path of the input file :(
I find it hard to believe that you did not find a suitable answer here on the forum.
Some examples:
In general, search for "split multifasta file", "convert multi to single fasta file", ...
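For reference, here is a minimal sketch of what most of those answers boil down to, in plain Python with no extra packages. It is not the linked GitHub script, just an illustration: the function name `split_multifasta`, the numbered output file names, and the folder paths in the usage comment are all made up for this example. It reads the file line by line, so a 400 MB input should not need to fit in memory.

```python
import re
from pathlib import Path


def split_multifasta(input_path, output_dir):
    """Write each record of a multi-FASTA file to its own .fasta file.

    Returns the number of records written.
    (Hypothetical helper for illustration, not part of any library.)
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)  # create the folder if needed

    count = 0
    out = None  # file currently being written, if any
    try:
        with open(input_path, "r") as handle:
            for line in handle:
                if line.startswith(">"):
                    # A header line starts a new record: close the previous file.
                    if out is not None:
                        out.close()
                    count += 1
                    # Use the first word of the header as the sequence name.
                    words = line[1:].split()
                    seq_id = words[0] if words else "record"
                    # Replace characters that are not safe in file names.
                    safe = re.sub(r"[^A-Za-z0-9._-]", "_", seq_id)
                    # The running number keeps names unique even if IDs repeat.
                    out = open(output_dir / f"{count:05d}_{safe}.fasta", "w")
                if out is not None:
                    out.write(line)  # header and sequence lines go to the same file
    finally:
        if out is not None:
            out.close()
    return count


# Example call (the folders are assumptions, adjust them to your machine).
# The r"..." form keeps Windows backslashes from being read as escape codes.
# n = split_multifasta(r"C:\Users\me\Desktop\HIV1db.fasta",
#                      r"C:\Users\me\Desktop\HIV1_split")
# print(n, "sequences written")
```

To set the input path, uncomment the last lines and replace the first string with the full location of your HIV1db.fasta. The second string is the folder that will receive the individual files.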