To Cloud or Not to Cloud - Privacy implications of genetic data sharing
"Data is the new oil" has been the slogan of recent years, explaining the tremendous rise in power, and the data hunger, of tech companies such as Facebook. Emails, contacts, social interactions, online activity, payments, hobbies, interests, location profiles - everything is collected. However, there are increasing concerns about current practices for handling these vast treasures of data, particularly third-party sharing, exploitation for unethical purposes and leakage of insufficiently secured data. These practices leave customers feeling uncomfortable and spur initiatives advocating for more privacy and security regulations.
Meanwhile, consumer genetic testing companies are advertising a range of low-priced products offering personalised health or ancestry information from nothing more than a bit of your saliva. Many are wondering: wouldn't one of those kits be a good present for friends, relatives or themselves? Working in the Bio-IT field, I feel that many are underestimating the amount of highly personal information that can be gained from genetic testing, and are therefore overlooking the consequences of sharing their genetic data with these companies. With that in mind, let's take a look at what genetic information can reveal about you.
![enter image description here][3]
I don't think the Venter facial identification article was widely viewed as "successful". It received a lot of flak and, as I remember, there was a key member at HLI who begged to be left off this paper.
Is this intended as a discussion forum or simply a link to a blog/article type thing?
The intention was to discuss it. That’s why I chose the title: To the Cloud or Not to the Cloud
Do you have experiences/feelings here? In a lot of research projects, the anonymisation is done by just hiding the names. That seems not to be enough.
Yep, that's fine :) I just wasn't sure from the post whether the intention was for discussion here specifically.
I don't do any work that requires anonymisation personally, but my 2 cents would be that the cloud makes a lot of sense for people without access to proper resources or who have variable demands, and personally, I think sufficient anonymisation should be enough - but I don't make the rules. I guess the question is about data security, not just whether you can link it to an individual. If you uploaded someone's genome and the cloud server was hacked, that data is no longer private, even if it can never be linked back to the individual, so there is probably an extra burden of responsibility.
I would have thought, inside the EU at least, that GDPR etc. may have something to say about this, but whether it's extended to genomic data yet (if ever), and not just financial/ID stuff, I'm not sure myself.
There were a few recent papers about streaming genomic data in encrypted states for a number of common tools, so perhaps this is the answer, but it's still not 100% bulletproof, I suppose.
I suppose, if I were reviewing a grant which proposed to use patient data analysed in a cloud environment, I would expect the authors to have used all reasonable means to protect that data, e.g. encryption (certainly for upload and download, to avoid man-in-the-middle issues), anonymisation, RSA-secured servers, tools that work on encrypted data (which might be reasonable in the not too distant future), and a minimum number of users/people with access, to avoid social engineering, accidents etc.
Some of that could be overkill, but from what little I know, people don't tend to mess about with this kind of stuff - it's serious business. I certainly wouldn't have any trouble keeping a straight face expecting people to anonymise, secure their servers, and probably encrypt stored data wherever possible.
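To make the "anonymise" step a little more concrete, here is a minimal sketch of pseudonymising sample labels with a keyed hash before anything leaves the local machine. It uses only Python's standard `hmac`, `hashlib` and `secrets` modules; the function name `pseudonymise`, the names and the file names are all made up for illustration. Note that this only hides the labels. It does nothing about the genotype data itself, which is the harder problem raised elsewhere in this thread.

```python
import hashlib
import hmac
import secrets


def pseudonymise(sample_id: str, key: bytes) -> str:
    """Return a keyed SHA-256 digest (hex) standing in for sample_id."""
    return hmac.new(key, sample_id.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical secret key; the assumption is that it is kept locally,
# separate from the data that goes to the cloud.
key = secrets.token_bytes(32)

# Made-up records: donor name -> file to be uploaded.
records = {"Jane Doe": "sample_A.vcf", "John Roe": "sample_B.vcf"}

# Replace each name with its pseudonym before upload.
pseudonymised = {pseudonymise(name, key): path for name, path in records.items()}

for pseudo_id, path in pseudonymised.items():
    print(pseudo_id[:12], path)
```

Using a keyed hash rather than a plain one means that someone who only sees the uploaded labels cannot simply hash a list of candidate names and compare; they would also need the key.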
There are two separate issues here.
1. Using the cloud for computing/storage, irrespective of the kind of data you are working with.
2. Ways of securing that data by some means (anonymization, encryption etc.).

The first is the way of the future for many applications, whether we like it or not.
As long as there is a second sample (collected under the right circumstances with proper legal authorization), anyone can be identified. Even if you choose not to participate in cloud sharing, enough of your relatives may. So you can't expect to remain completely anonymous.
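A toy sketch of why a second sample is enough: if genotypes are stored under pseudonymous IDs, anyone holding a second sample from the same person can look for the stored genotype that agrees with it. Everything here is invented for illustration (the IDs, the eight-site genotypes coded as 0/1/2 alternate-allele counts, and the helper names `concordance` and `best_match`). Real matching would have to deal with genotyping error, missing sites and relatives, which this ignores.

```python
def concordance(a, b):
    """Fraction of sites at which two genotype vectors agree."""
    if len(a) != len(b):
        raise ValueError("genotype vectors must cover the same sites")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def best_match(query, database):
    """Return (id, score) of the stored genotype most similar to query."""
    scored = ((pid, concordance(query, geno)) for pid, geno in database.items())
    return max(scored, key=lambda item: item[1])


# Made-up "anonymised" database: pseudonymous ID -> genotype vector.
database = {
    "pseudo_001": (0, 1, 2, 0, 1, 1, 0, 2),
    "pseudo_002": (2, 1, 0, 0, 2, 1, 1, 0),
    "pseudo_003": (1, 1, 2, 2, 0, 0, 1, 1),
}

# Made-up second sample, collected from a known individual.
second_sample = (2, 1, 0, 0, 2, 1, 1, 0)

match_id, score = best_match(second_sample, database)
print(match_id, score)
```

With these invented values the second sample agrees with `pseudo_002` at every site, so the pseudonym no longer protects that record.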