Genetic data privacy: can we protect it?

By Eva Istsenko

In the past few years, genetic-testing companies such as 23andMe and Ancestry have become increasingly popular. Their main service compares your DNA with already sequenced DNA from 14,000 individuals across 45 populations and produces an easy-to-read graphical percentage report. They also offer broader health analyses that include predispositions to various conditions and carrier status reports.

In July 2018, the pharmaceutical company GlaxoSmithKline signed a $300 million deal with 23andMe for a four-year project, gaining access to more than 5 million genetic samples. The goal of this collaboration is “novel drug development” (Brodwin, 2018). The deal raised privacy and anonymity concerns, because genomic data differs from other types of information and therefore requires different protection. Although 23andMe clients consented to their data being used in medical research, it is unlikely that many of them knew exactly what this meant or how much money would be made from it. Furthermore, when clients sent off their samples, their blood relatives, who share part of the same genetic information, most likely did not agree to their data being collected.

Our genome sequence consists of about 3 billion bases, of which 2 to 4 million differ from person to person (genomic variants), giving us our specific characteristics. These variants can be either single bases (single nucleotide polymorphisms, or SNPs) or longer sequences. Of course, a larger chunk of sequenced DNA will tell you more than a handful of SNPs, but with the growing volume of other kinds of data (economic, electronic, other health data), full anonymity is increasingly hard to achieve. When several types of data are combined and different analytical techniques are applied, the possibility of re-identification is high. Research published in 2004 showed that an individual can be identified from public SNP data given just 30 to 80 of their known SNPs, an important result that demonstrated how little information is needed to target a specific person. Moreover, genomic data remains largely unchanged over a long period of time and stays relevant both for individuals and across generations, growing in value over time. New techniques appear and add to the richness of the data, but this does not make ‘old data’ worthless, so there is no time limit constraining genome-based research (Finnegan, 2017). However, it is important to note that one kit costs $49 to $69 on average, so any investigation based on the 23andMe dataset would represent only middle- and upper-class populations, once again excluding poorer members of society from genetic treatment opportunities.
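To get a feel for why a few dozen SNPs can be enough, here is a rough back-of-the-envelope sketch in Python. It is not the method used in the 2004 research; it is a simplified illustration that assumes the SNPs are inherited independently, that genotype frequencies follow the textbook Hardy-Weinberg proportions, and that every SNP has the same minor allele frequency. The function names and the world population figure of 8 billion are my own assumptions for the example.

```python
import math


def genotype_match_probability(minor_allele_freq: float) -> float:
    """Chance that two unrelated people share the same genotype at one SNP.

    Assumes Hardy-Weinberg proportions: the three possible genotypes occur
    with frequencies q*q, 2*p*q and p*p, where p is the minor allele
    frequency and q = 1 - p.
    """
    p = minor_allele_freq
    q = 1.0 - p
    genotype_freqs = (q * q, 2 * p * q, p * p)
    return sum(f * f for f in genotype_freqs)


def expected_coincidental_matches(num_snps: int, population_size: float,
                                  minor_allele_freq: float = 0.5) -> float:
    """Expected number of people who match a given SNP profile by chance.

    Assumes all SNPs are independent and share one allele frequency,
    which real genomes do not strictly satisfy.
    """
    per_snp = genotype_match_probability(minor_allele_freq)
    return population_size * per_snp ** num_snps


def snps_needed(population_size: float, minor_allele_freq: float = 0.5) -> int:
    """Smallest number of SNPs for which at most one chance match is expected."""
    per_snp = genotype_match_probability(minor_allele_freq)
    return math.ceil(math.log(population_size) / -math.log(per_snp))


if __name__ == "__main__":
    world_population = 8e9  # assumed round figure, for illustration only
    for maf in (0.5, 0.1):
        print(f"allele frequency {maf}: "
              f"{snps_needed(world_population, maf)} SNPs needed")
    print("expected chance matches with 30 SNPs:",
          expected_coincidental_matches(30, world_population))
```

Under these simplified assumptions, common variants narrow a profile down to one person very quickly, while rarer variants carry less information each, so more of them are needed. Real genomes break the independence assumption (neighbouring SNPs are often inherited together), so the numbers should be read as an intuition aid rather than a precise estimate.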

It is ethically problematic to reveal someone’s identity without their permission, and especially to combine genomic data with other datasets. Such combined data can reveal hospital admissions, family health history, internet searches, financial status and so on, possibly leading to bias in employment, insurance and lending, as well as general stigmatization and embarrassment. This poses a challenging task for lawmakers: keeping a balance between two parties of interest, private (individuals who prefer to keep their data unidentified) and public (commercial entities invested in medical research). We can also say that it is in the best interest of individuals to voluntarily share their data with the public because they, as part of the public, will benefit as well. On the other hand, it is a matter of public concern to keep private information safe, since this contributes to a well-functioning, trusting society (Finnegan, 2017).

Statistics published by the Wellcome Trust UK showed that 75% of respondents would be willing to give their data for medical research, 22% would be unwilling and 4% don’t know. The survey showed that young people are more willing to share their genomic data, perhaps because they have fewer life experiences they would prefer to keep private (Finnegan, 2017). Attitudes shift constantly with the current political situation, the state of health care and new technology, so policy needs to be updated continually.

There is no easy way to solve this, but measures can be taken. Firstly, we shouldn’t rely on anonymisation alone; further safeguards should be imposed, such as legal sanctions or limited access. Secondly, there should be greater transparency: the public should be aware of the risks, implications and effectiveness of anonymisation techniques, especially if the research is done with public funds. Finally, there should be political change, perhaps starting with the establishment of a data ethics council that would constantly review and adapt current policies, as well as pursue further research into the potential harms and benefits of genetic data manipulation (Finnegan, 2017).


Finnegan, T. 2017. [online] Available at: <; [Accessed 2 October 2020]

Brodwin, E. 2018. DNA-Testing Company 23andMe Has Signed A $300 Million Deal With A Drug Giant. Here’s How To Delete Your Data If That Freaks You Out. [online] Business Insider. Available at: <; [Accessed 2 October 2020]

Ducharme, J. 2018. A Major Drug Company Now Has Access To 23andMe’s Genetic Data. Should You Be Concerned? [online] Time. Available at: <; [Accessed 2 October 2020]
