Dear all,
Something I've never come across before has cropped up, and I'd be very grateful for your opinions.
I've been asked to analyse some human RNA-seq data, but because of the nature of the data, I need to ensure it is stored on a designated encrypted hard drive supplied by the partner organisation.
Before I agree to them purchasing a drive large enough to hold the data, does anyone foresee any issues with this? My feeling is that it will significantly increase processing time, but I'm not sure by how much.
The workload: ~100 patient data files with ~40,000,000 paired-end (PE) reads each -> mapping, counting, DE analysis
PS - I'm experienced in all of the analysis procedures, my query relates only to the impact that an encrypted drive (rather than my usual SSD setup) would have.
Best wishes,
D
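On sizing the drive, a back-of-envelope estimate may help before anything is purchased. The sketch below is only illustrative: the read length (100 bp), the header size per FASTQ record (~50 bytes) and the gzip compression ratio (~4x) are all assumptions not stated in the question, and "40,000,000 PE reads" is taken to mean 40 million read pairs, i.e. 80 million records per sample. Alignment files and other intermediates would come on top of this.

```python
def estimate_fastq_bytes(n_samples, pairs_per_sample, read_len,
                         header_bytes=50, gzip_ratio=4):
    """Rough FASTQ storage estimate (raw bytes, gzip-compressed bytes).

    header_bytes and gzip_ratio are assumed typical values, not measurements.
    """
    # One FASTQ record: header line, sequence line, "+" line, quality line.
    # Sequence and quality lines are read_len characters plus a newline each;
    # the "+" line is 2 bytes including its newline.
    record_bytes = header_bytes + 2 * (read_len + 1) + 2
    records = n_samples * pairs_per_sample * 2  # two mates per pair
    raw = records * record_bytes
    return raw, raw // gzip_ratio


if __name__ == "__main__":
    raw, compressed = estimate_fastq_bytes(100, 40_000_000, 100)
    print(f"uncompressed: {raw / 1e12:.2f} TB")
    print(f"gzip (assumed ratio): {compressed / 1e12:.2f} TB")
```

Swapping in the real read length and a compression ratio measured on one sample would give a more trustworthy figure.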
If you are approved to handle the data, and it will be on an approved hard drive that only you will access, does it need to remain encrypted at all times? Is it not sufficient to have anonymised data?
Along this line, if you care about security/privacy, you need to examine the whole pipeline, e.g. where will the unencrypted data be processed? Will it still be secure there? What about intermediate data files?
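On the intermediate files point, one partial measure is to make sure scratch space also lives on the encrypted volume. A minimal Python sketch follows; `make_scratch_dir` and the mount point in the usage example are hypothetical names chosen for illustration. It only affects code that honours `TMPDIR` or Python's `tempfile` module. Many tools have their own temporary-directory option that would need to be set separately, and this does nothing about swap space or data held in memory.

```python
import os
import tempfile
from pathlib import Path


def make_scratch_dir(secure_root, prefix="rnaseq_"):
    """Create a scratch directory under secure_root and route temp files there.

    secure_root is assumed to be a directory on the encrypted volume.
    """
    root = Path(secure_root)
    root.mkdir(parents=True, exist_ok=True)
    scratch = Path(tempfile.mkdtemp(prefix=prefix, dir=root))
    # Child processes that respect TMPDIR will write their temp files here.
    os.environ["TMPDIR"] = str(scratch)
    # Clear the cached default so tempfile re-reads TMPDIR on its next use.
    tempfile.tempdir = None
    return scratch
```

Usage would look something like `make_scratch_dir("/mnt/encrypted/scratch")` at the top of a pipeline driver script, where the path is a placeholder for wherever the encrypted drive is mounted.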
I'll add that there is an increasing number of 'cryptography-compatible' genomic tools which can handle encrypted streams etc., but it's by no means widespread at this stage.
Might be worth doing some googling and seeing if there are encryption-safe tools you can use that already exist.
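On the original question of how much slower things would be: it probably depends on the drive, its interface and the encryption method, so measuring may be more reliable than guessing. Below is a rough sketch that times a sequential read of a large file; the function name and chunk size are arbitrary choices for illustration. Running it on the same file copied to the usual SSD and to an encrypted drive of the proposed type would give a crude comparison. Note that the operating system's page cache can inflate the result if the file was read recently, so a file larger than RAM, or a freshly mounted drive, gives a fairer number.

```python
import sys
import time


def read_throughput_mb_s(path, chunk_size=8 * 1024 * 1024):
    """Sequentially read a file and return the observed throughput in MiB/s."""
    total = 0
    start = time.perf_counter()
    # buffering=0 avoids Python-level buffering; the OS page cache still applies.
    with open(path, "rb", buffering=0) as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        return float("inf")
    return total / (1024 * 1024) / elapsed


if __name__ == "__main__":
    # Pass one or more file paths, e.g. the same FASTQ on each drive.
    for path in sys.argv[1:]:
        print(f"{path}: {read_throughput_mb_s(path):.1f} MiB/s")
```

Mapping is often CPU-bound rather than I/O-bound, so even a clearly slower read speed may translate into a smaller difference in total pipeline time; timing one sample end to end on each setup would settle it.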