Hi,
I have no experience with SQL. But I am interested in gaining skills with databases and I may have an opportunity to do so now.
I have a HUGE BLAST output - almost 1TB. I know it probably depends on what I want to do, but should I put my BLAST results into a database?
What are the advantages of this, or when would one want to do this?
Thanks
As you said, first think over what you want to do with these hits, and then decide whether to keep them in a database.
Dumb question, but what is your BLAST output format? Look up https://molevol.mbl.edu/wiki/index.php/BLAST_UNIX_Tutorial . The default output format is really wordy; for large queries the tabular format (-outfmt 6 in BLAST+) is far more compact and much easier to parse.
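To show why the tabular format is easier to work with, here is a minimal Python sketch that reads it line by line. It assumes the standard 12-column layout of BLAST+ tabular output; the `Hit` type, the `parse_tabular` function and the two sample lines are made up for illustration and are not from your data.

```python
import csv
import io
from typing import Iterator, NamedTuple


class Hit(NamedTuple):
    """One row of 12-column tabular BLAST output (hypothetical helper type)."""
    qseqid: str
    sseqid: str
    pident: float
    length: int
    mismatch: int
    gapopen: int
    qstart: int
    qend: int
    sstart: int
    send: int
    evalue: float
    bitscore: float


def parse_tabular(handle) -> Iterator[Hit]:
    """Yield one Hit per data line, skipping blank and '#' comment lines."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        yield Hit(
            row[0], row[1], float(row[2]),
            int(row[3]), int(row[4]), int(row[5]),
            int(row[6]), int(row[7]), int(row[8]), int(row[9]),
            float(row[10]), float(row[11]),
        )


# Invented sample lines, only to make the sketch runnable.
SAMPLE = (
    "q1\ts1\t98.5\t200\t3\t0\t1\t200\t5\t204\t1e-100\t370\n"
    "q1\ts2\t80.0\t150\t30\t1\t10\t160\t20\t170\t1e-30\t120\n"
)

hits = list(parse_tabular(io.StringIO(SAMPLE)))
strong = [h for h in hits if h.evalue < 1e-50]
```

Because the function is a generator, it never holds the whole file in memory, which matters at your file size. For a real file you would pass an open file handle instead of the `io.StringIO` sample.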
As for saving your BLAST results in a database, look up OrthoMCL (orthologous gene search) or Trinotate. Each one blasts a genome (~20000 sequences) against a very large reference database (~1M sequences), and processes the output for scoring (OrthoMCL) or correlation with other sources of information (Trinotate).
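If you want a feel for what "putting hits in a database" buys you, here is a rough sketch using SQLite through Python's built-in `sqlite3` module. This is not how OrthoMCL or Trinotate do it internally; the table name `hits`, the column names, the helper functions and the sample rows are all my own assumptions. The point is that once the rows are loaded and indexed, questions like "best hit per query" become a single query instead of a rescan of the file.

```python
import sqlite3

# Invented rows in the 12-column tabular layout (not real data).
SAMPLE_ROWS = [
    ("q1", "s1", 98.5, 200, 3, 0, 1, 200, 5, 204, 1e-100, 370.0),
    ("q1", "s2", 80.0, 150, 30, 1, 10, 160, 20, 170, 1e-30, 120.0),
    ("q2", "s3", 91.0, 100, 9, 0, 1, 100, 1, 100, 1e-40, 150.0),
]


def build_db(rows, path=":memory:"):
    """Create a 'hits' table (hypothetical schema), load rows, index by query."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE hits (
            qseqid TEXT, sseqid TEXT, pident REAL, aln_length INTEGER,
            mismatch INTEGER, gapopen INTEGER,
            qstart INTEGER, qend INTEGER, sstart INTEGER, send INTEGER,
            evalue REAL, bitscore REAL
        )
        """
    )
    conn.executemany(
        "INSERT INTO hits VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)", rows
    )
    # The index is what makes per-query lookups fast on a large table.
    conn.execute("CREATE INDEX idx_hits_qseqid ON hits (qseqid)")
    conn.commit()
    return conn


def best_hits(conn):
    """Return (query, subject, bitscore) for the top-scoring hit of each query."""
    return conn.execute(
        """
        SELECT h.qseqid, h.sseqid, h.bitscore
        FROM hits AS h
        JOIN (
            SELECT qseqid, MAX(bitscore) AS best
            FROM hits
            GROUP BY qseqid
        ) AS b ON h.qseqid = b.qseqid AND h.bitscore = b.best
        ORDER BY h.qseqid
        """
    ).fetchall()


conn = build_db(SAMPLE_ROWS)
top = best_hits(conn)
```

With a file path instead of `":memory:"` the database persists on disk. Whether SQLite copes well with something close to 1TB is a separate question that I would test on a subset first.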
For simple scripting with manageable amounts of data, I just use the Linux 'join' command. Configuring a database gets old quickly. You can also look up 'makeblastdb' and 'blastdb', but a BLAST database is in this sense just an index/header/sequences triplet of files, which is different from SQL.
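For comparison, the kind of thing 'join' does can be sketched in a few lines of Python. This is only an approximation of an inner join on the first field, not a reimplementation of the command: unlike 'join' it does not need sorted input, but it does keep the second table in memory. The function name and both sample tables are invented for illustration.

```python
import csv
import io


def join_on_first_field(left, right):
    """Inner-join two tab-separated streams on their first column."""
    # Index the right-hand table by key; it has to fit in memory.
    lookup = {}
    for row in csv.reader(right, delimiter="\t"):
        if row:
            lookup.setdefault(row[0], []).append(row[1:])
    # Stream the left-hand table and emit one merged row per match.
    for row in csv.reader(left, delimiter="\t"):
        if not row:
            continue
        for extra in lookup.get(row[0], []):
            yield row + extra


# Invented sample tables: query/subject/bitscore and query/annotation.
HITS = "q1\ts1\t370\nq2\ts3\t150\n"
ANNOTATIONS = "q1\tkinase\nq3\ttransporter\n"

joined = list(join_on_first_field(io.StringIO(HITS), io.StringIO(ANNOTATIONS)))
```

Here only `q1` appears in both tables, so it is the only row that comes out, with the annotation appended to the hit.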