Not specifically a bioinformatics question, but an issue that comes up often in bioinformatics.
Say I generate a very large .tsv- or .csv-formatted table, e.g. variant annotations, and want to deliver it to non-bioinformatics collaborators for review. Beyond a couple hundred MB or a few hundred thousand rows, Microsoft Excel begins to have trouble opening and working with the file (and it cannot hold more than 1,048,576 rows per sheet at all).
Are there any simple alternatives to help make data like this easier for desktop users to deal with?
You can try XLSB (Excel Binary Workbook), which Excel loads noticeably faster than text-based formats. When files get really large and complex, we instead put the data behind external web applications that let scientists search, query, visualize, and run simple statistics.
I've never heard of XLSB before. How do you get your .tsv/.csv data into that format? Google searches only turn up ways to convert out of it into other formats. From what I can tell, the only way to get data into XLSB is to open the file in Excel first, which is exactly what I'm trying to avoid. Is there another way?
No, not that I'm aware of. The source file needs to be opened in Excel, either by hand or scripted with pywin32. Once the data are in XLSB, the consensus I've heard is that the files read noticeably faster.
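To illustrate the pywin32 route: the sketch below drives an installed copy of Excel over COM to open a .csv and re-save it as .xlsb. This only works on Windows with Excel installed; the function name `csv_to_xlsb` is just illustrative, not part of any library. `FileFormat=50` is Excel's `xlExcel12` constant, which corresponds to the binary .xlsb format.

```python
import os

def csv_to_xlsb(csv_path, xlsb_path):
    """Convert a .csv to .xlsb by scripting Excel via COM (Windows only).

    Requires Excel and pywin32 (`pip install pywin32`). Paths must be
    absolute, since Excel resolves them in its own working directory.
    """
    import win32com.client  # imported here so the module loads off-Windows

    excel = win32com.client.Dispatch("Excel.Application")
    excel.Visible = False
    excel.DisplayAlerts = False  # suppress "overwrite existing file?" prompts
    try:
        wb = excel.Workbooks.Open(os.path.abspath(csv_path))
        # 50 = xlExcel12, the Excel Binary Workbook (.xlsb) file format
        wb.SaveAs(os.path.abspath(xlsb_path), FileFormat=50)
        wb.Close(SaveChanges=False)
    finally:
        excel.Quit()
```

Called in a loop, this converts a whole directory of delivery files without anyone opening Excel interactively, though it still needs a Windows machine with Excel on it somewhere in the pipeline.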