I need to store ~7 million of unique amino acid sequences in MySQL. Till now, I was storing sequence as TEXT type. Is there any other way of coding protein sequence in MySQL so it will take less space?
EDIT
Mentioned table of proteins is webserver backed. One of the functionalities of webserver is blast search, so all proteins are going to be compiled into blast db, anyway.
Do you think querying blastdb by fastacmd will be more reasonable than storing all data in MySQL? I haven't tried that so far but fastacmd is quite fast.
Webserver is running under Apache with jQuery and mod-python.
If all you want to do is to retrieve a sequence by name, fastacmd is fine. At the same time, I do not see a particular problem to store 7 million sequences in MySQL as long as you do not try to index the sequences.
If all you want to do is to retrieve a sequence by name, fastacmd is fine. At the same time, I do not see a particular problem to store 7 million sequences in MySQL as long as you do not try to index the sequences. Nonetheless, this may not be the best strategy.