Bioinformatics; encrypted hard drive?
5.3 years ago
d.p.tonge ▴ 20

Dear all,

Something I've never come across before has cropped up, and I'd be very grateful for your opinions.

I've been asked to analyse some human RNA-seq data, but due to the nature of this, I need to ensure it is stored on a designated encrypted hard drive supplied by the partner organisation.

Before I agree to them purchasing a drive large enough to hold the data, does anyone foresee any issues with this? My feeling is that it will significantly increase processing time, but I'm not sure by how much.

Workload: ~100 patient data files with ~40,000,000 paired-end (PE) reads each, followed by mapping, counting and differential expression (DE) analysis.

PS - I'm experienced in all of the analysis procedures, my query relates only to the impact that an encrypted drive (rather than my usual SSD setup) would have.

Best wishes,

D

RNA-Seq encryption • 1.2k views

If you are approved to handle the data, and it will be on an approved hard drive that only you will access, does it need to remain encrypted at all times? Is it not sufficient to have anonymised data?


Along this line, if you care about security/privacy, you need to examine the whole pipeline, e.g. where will the unencrypted data be processed? Will it still be secure there? What about intermediate data files?


I'll add that there is an increasing number of 'cryptography-compatible' genomic tools that can handle encrypted streams etc., but it's by no means widespread at this stage.

Might be worth doing some googling and seeing if there are encryption-safe tools you can use that already exist.

5.3 years ago
JC 13k

If the encryption is applied at the drive's mount point, I have never noticed a big difference when running any analysis on encrypted data. I'm not sure whether this is your case, but if you need to decrypt -> analyze -> encrypt each sample, you can definitely expect some additional time for this.

5.3 years ago

If I understand your question correctly, the data simply needs to be stored on an encrypted hard drive. If that's the case, then the latency from the intermediate encryption and decryption steps should be almost negligible. If at all possible, you should definitely try to get an encrypted SSD, but other than that, you should be okay. Most Intel processors since 2010 include an AES instruction set (AES-NI), and newer computers can have dedicated crypto coprocessors.
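If you want to check whether the analysis machine advertises hardware AES support, here is a minimal sketch. It assumes a Linux host that exposes /proc/cpuinfo, and the helper name has_aes_flag is just something made up for illustration. On other operating systems you would need to consult the CPU's spec sheet instead.

```python
from pathlib import Path


def has_aes_flag(cpuinfo_text: str) -> bool:
    """Return True if a 'flags' (x86) or 'Features' (ARM) line lists 'aes'."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() in ("flags", "Features"):
            # Flags are whitespace-separated tokens; match 'aes' exactly.
            if "aes" in value.split():
                return True
    return False


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")  # assumption: Linux host
    if cpuinfo.exists():
        print("AES instructions reported:", has_aes_flag(cpuinfo.read_text()))
    else:
        print("No /proc/cpuinfo here; check the CPU spec sheet instead.")
```

A True result only says the CPU offers the instructions; whether the drive's encryption layer actually uses them depends on how the drive is encrypted.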

Newer fancy enterprise-level drives can have neat features like cache-line optimization so you may even get a speedup in processing, but this is 100% speculation on my part. The only true way of determining the computational tradeoffs is really to profile.
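As a starting point for that profiling, here is a rough sketch that times a sequential write and read-back of a temporary file in a given directory, so you can run it once on the encrypted drive and once on your usual SSD. The function name time_write_read and its default sizes are illustrative assumptions, not anything standard. Note that the read figure may be inflated by the operating system's page cache, so treat the numbers as indicative only; timing a real mapping run on one sample is the better test.

```python
import os
import sys
import tempfile
import time


def time_write_read(directory: str, size_mb: int = 256, chunk_mb: int = 4) -> dict:
    """Write then read back a temp file in `directory`; return throughput in MB/s."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)  # incompressible test data
    n_chunks = max(1, size_mb // chunk_mb)
    total_mb = n_chunks * chunk_mb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as fh:
            for _ in range(n_chunks):
                fh.write(chunk)
            fh.flush()
            os.fsync(fh.fileno())  # make sure the data really reaches the drive
        write_s = time.perf_counter() - start

        start = time.perf_counter()
        with open(path, "rb") as fh:
            while fh.read(len(chunk)):
                pass
        read_s = time.perf_counter() - start
    finally:
        os.remove(path)  # always clean up the test file
    return {
        "total_mb": total_mb,
        "write_mb_s": total_mb / max(write_s, 1e-9),
        "read_mb_s": total_mb / max(read_s, 1e-9),
    }


if __name__ == "__main__":
    # Pass the directories to compare on the command line.
    for mount in sys.argv[1:]:
        if os.path.isdir(mount):
            print(mount, time_write_read(mount))
        else:
            print(mount, "is not a directory, skipping")
```

Running it with two directories as arguments, one on each drive, gives a quick side-by-side comparison before committing to the purchase.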
