Hi,
I have ~9.5 TB of data files. Each file contains a network in a Gephi-readable format. It is numerical data in ASCII format. I want to be able to work on the compressed files directly, to run some quantitative surveys on each file and some other analyses.
I would like to get the data down to below 4 TB so it can at least be stored on a single HD. I would also like something that is fast, as I will be working with the data continuously. So far I have found lzop (default compression) to be the best. Does anyone know anything better for this? Or any advice for working with data like this?
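To give a concrete idea of what I mean by working on the compressed files, this is roughly the kind of per-file survey I have in mind. It is only a sketch: it assumes lzop is on the PATH and treats each file as a whitespace-separated edge list, which may not match the actual Gephi format.

```python
import subprocess

def survey_compressed_edgelist(path):
    """Stream-decompress an lzop file and count nodes/edges without
    writing the decompressed data to disk.

    Assumes `lzop` is installed and the file is a whitespace-separated
    edge list (one edge per line); adjust the parsing for the real format.
    """
    nodes = set()
    edges = 0
    # `lzop -dc` decompresses to stdout, so nothing touches the disk
    with subprocess.Popen(["lzop", "-dc", path], stdout=subprocess.PIPE) as proc:
        for raw in proc.stdout:
            parts = raw.split()
            if len(parts) < 2:
                continue
            nodes.update(parts[:2])  # first two columns = source, target
            edges += 1
    return len(nodes), edges

if __name__ == "__main__":
    n, m = survey_compressed_edgelist("example_network.lzo")  # placeholder file name
    print(f"{n} nodes, {m} edges")
```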
Thanks, R
Never knew about graph databases. Amazing, thank you.
One of the limitations of Neo4j is that you only get one database per installation. The common practice when you have multiple graphs is to store them all together and use a flag or property to differentiate them.
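Roughly what that looks like with the official Neo4j Python driver, as a sketch only: the connection details, the `Node` label, and the `graph_id` property name are placeholders you would pick yourself, and `execute_write` assumes a recent driver version.

```python
from neo4j import GraphDatabase  # pip install neo4j

# Placeholder connection details -- adjust for your installation.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def load_edge(tx, graph_id, source, target):
    # Every node and relationship carries a graph_id property, so many
    # networks can live side by side in the single database and queries
    # can filter on graph_id to stay within one network.
    tx.run(
        "MERGE (a:Node {id: $source, graph_id: $graph_id}) "
        "MERGE (b:Node {id: $target, graph_id: $graph_id}) "
        "MERGE (a)-[:CONNECTS {graph_id: $graph_id}]->(b)",
        graph_id=graph_id, source=source, target=target,
    )

with driver.session() as session:
    # e.g. one call per line of an edge list, tagged with the file it came from
    session.execute_write(load_edge, "network_042", "n1", "n2")

driver.close()
```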
So just something simple like concatenating the node definitions together and the edge lists together, then combining the two? Could the flag be an extra column added to each network file with the file ID or something? Or does Neo4j have a way of setting flags?
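For the extra-column idea, I'm picturing roughly this kind of preprocessing. Just a sketch: it assumes the node and edge files are CSVs and that the file name can serve as the ID; the `nodes_*.csv` / `edges_*.csv` patterns are made up.

```python
import csv
import glob
import os

def combine(pattern, out_path):
    """Tag every row with the file it came from, then append everything
    to one combined table (one for nodes, one for edges)."""
    with open(out_path, "w", newline="") as out_file:
        writer = csv.writer(out_file)
        for path in glob.glob(pattern):
            graph_id = os.path.splitext(os.path.basename(path))[0]
            with open(path, newline="") as in_file:
                for row in csv.reader(in_file):
                    writer.writerow(row + [graph_id])  # extra column = file ID

combine("nodes_*.csv", "all_nodes.csv")
combine("edges_*.csv", "all_edges.csv")
```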