Not specifically a bioinformatics question, but an issue that comes up often in bioinformatics.
Say I generate a very large .tsv- or .csv-formatted table, e.g. variant annotations, and want to deliver it to non-bioinformatics collaborators for review. Beyond a couple hundred MB or a few hundred thousand rows, Microsoft Excel begins to have trouble opening and working with the file (and it cannot hold more than 1,048,576 rows per sheet at all).
Are there any simple alternatives to help make data like this easier for desktop users to deal with?
You can try XLSB (Excel Binary Workbook), which Excel loads noticeably faster than text-based formats. When files get really large and complex, we instead put the data behind external web applications that let scientists search, query, visualize, and run simple statistics.
I've never heard of XLSB before. How do you get your .tsv/.csv data into that format? Google searches only turn up ways to convert out of it into other formats. From what I can tell, the only way to get data into XLSB is to open the file in Excel first, which is exactly what I'm trying to avoid. Is there another way?
No, not that I'm aware of. The source file needs to be opened in Excel, either by hand or scripted with pywin32. Once the data are in XLSB, the consensus I've heard is that the files read noticeably faster.
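To illustrate the pywin32 route: the sketch below drives an installed copy of Excel over COM to open a .csv and re-save it as .xlsb. This only works on Windows with Excel installed; the function name `csv_to_xlsb` is just illustrative, not part of any library. `FileFormat=50` is Excel's `xlExcel12` constant, which corresponds to the binary .xlsb format.

```python
import os

def csv_to_xlsb(csv_path, xlsb_path):
    """Convert a .csv to .xlsb by scripting Excel via COM (Windows only).

    Requires Excel and pywin32 (`pip install pywin32`). Paths must be
    absolute, since Excel resolves them in its own working directory.
    """
    import win32com.client  # imported here so the module loads off-Windows

    excel = win32com.client.Dispatch("Excel.Application")
    excel.Visible = False
    excel.DisplayAlerts = False  # suppress "overwrite existing file?" prompts
    try:
        wb = excel.Workbooks.Open(os.path.abspath(csv_path))
        # 50 = xlExcel12, the Excel Binary Workbook (.xlsb) file format
        wb.SaveAs(os.path.abspath(xlsb_path), FileFormat=50)
        wb.Close(SaveChanges=False)
    finally:
        excel.Quit()
```

Called in a loop, this converts a whole directory of delivery files without anyone opening Excel interactively, though it still needs a Windows machine with Excel on it somewhere in the pipeline.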