Many organisations that collect personal data deploy anonymisation techniques so that they can store data for longer, use it for purposes other than those for which it was collected, or even sell the anonymised data more freely than they otherwise could under data protection laws.
EU data protection laws and regulators accept that 100% anonymisation of data may not be possible, particularly in light of the increasing availability of data online and improvements in computer science and data analytics. The UK's Information Commissioner's Office (ICO) has, for example, explained that the Data Protection Act "does not require anonymisation to be completely risk free" and that organisations that anonymise personal data can disclose that information even if there is a "remote" chance that the data can be matched with other information and lead to individuals being identified.
However, the Article 29 Working Party has said that anonymising techniques will not protect privacy where the underlying data can be linked with other available information about individuals. Therefore organisations must continually check for risks that individuals behind the data can be re-identified after the initial anonymisation measures have been put in place, it said.
"Data controllers should consider that an anonymised dataset can still present residual risks to data subjects," the Working Party said in an opinion on anonymisation techniques (37-page / 779KB PDF). "Indeed, on the one hand, anonymisation and re-identification are active fields of research and new discoveries are regularly published, and on the other hand even anonymised data, like statistics, may be used to enrich existing profiles of individuals, thus creating new data protection issues. Thus, anonymisation should not be regarded as a one-off exercise and the attending risks should be reassessed regularly by data controllers."
Data protection law expert Marc Dautlich of Pinsent Masons, the law firm behind Out-Law.com, said: "Anonymising personal data is much harder than many businesses probably realise. The perception has often been that anonymisation can be a one-off process and that merely encrypting data or otherwise destroying or manipulating fields of data is sufficient to de-personalise data and take it outside the scope of data protection laws. However, the Working Party has made it clear that that is not the case."
"The opinion acts as a readers' guide to the different anonymisation techniques but also explains that the process of anonymising data is dynamic and that organisations need to regularly re-check that their datasets are anonymised, not because the datasets themselves have changed, but because the world around it is evolving," Dautlich said.
The Article 29 Working Party warned of the risks of being able to single out individuals' data from an anonymised dataset, re-identify individuals on the basis of linking data from within a dataset or across different datasets, or deduce with "significant probability" the identity of individuals on the basis of inferences that can be made from viewing data. The various anonymisation techniques that can be deployed to protect against those risks vary in their robustness, it said.
The watchdog said companies can use measures such as randomising data to "remove the strong link between the data and the individual" or generalising it so as to dilute the specificity of the data.
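As a rough illustration of generalisation, the Python sketch below replaces exact ages with ten-year bands and truncates postcodes, then counts how many records can still be singled out. The records, the field names and the helper functions `generalise` and `singled_out` are hypothetical and do not come from the Working Party's opinion; the banding choices are arbitrary assumptions.

```python
from collections import Counter

# Hypothetical records: age and postcode act as quasi-identifiers.
RECORDS = [
    {"age": 34, "postcode": "EC1A 1BB"},
    {"age": 36, "postcode": "EC1A 4HD"},
    {"age": 31, "postcode": "EC1V 9LT"},
    {"age": 52, "postcode": "SW1A 2AA"},
    {"age": 57, "postcode": "SW1P 3BU"},
]


def generalise(record):
    """Replace the exact age with a ten-year band and keep a postcode prefix."""
    low = (record["age"] // 10) * 10
    return {
        "age": f"{low}-{low + 9}",
        # Assumption: the first three characters are coarse enough here.
        "postcode": record["postcode"].split()[0][:3],
    }


def singled_out(records):
    """Count records whose combination of attributes is unique in the dataset."""
    counts = Counter(tuple(sorted(r.items())) for r in records)
    return sum(1 for n in counts.values() if n == 1)


generalised = [generalise(r) for r in RECORDS]
print(singled_out(RECORDS))      # every raw record is unique
print(singled_out(generalised))  # none is unique once generalised
```

A check of this kind only addresses singling out within one dataset. It says nothing about linking with other datasets or about inference, the other two risks the Working Party identified.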
One specific technique it said can be used to anonymise data is 'noise addition', which is the modification of data to make it less precise and therefore less likely to be linked to an individual.
However, the Working Party said that it was wrong to assume that deploying 'noise addition' alone is sufficient to protect privacy. It cited researchers' ability to re-identify most individuals whose data had been published in an apparently anonymised format by video content giant Netflix. The company had disclosed its customers' ratings of more than 18,000 films.
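A minimal Python sketch of noise addition might look like the following. The heights, the plus-or-minus 10 range and the function `add_noise` are illustrative assumptions rather than anything prescribed by the Working Party.

```python
import random


def add_noise(values, max_shift, rng):
    """Shift each value by a random amount in [-max_shift, +max_shift]."""
    return [v + rng.uniform(-max_shift, max_shift) for v in values]


# Hypothetical heights in centimetres; the +/-10 range is an arbitrary choice.
heights_cm = [158.0, 171.5, 183.0, 190.2]
rng = random.Random(2014)  # fixed seed only so the sketch is repeatable
noisy = add_noise(heights_cm, 10.0, rng)

for original, blurred in zip(heights_cm, noisy):
    print(f"{original:.1f} -> {blurred:.1f}")
```

Each person still has their own row after the noise is applied, so any attributes left untouched can still be matched against outside information. That is consistent with the Working Party's warning that noise addition alone should not be assumed to be sufficient.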
The Working Party also flagged up limitations in the effectiveness of other anonymisation techniques, such as 'permutation', which is where some data about individuals is "artificially linked to different data subjects". It said randomly linking certain data to different people will not protect privacy where "logical links exist between different attributes" in a database.
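The limitation can be seen in a short Python sketch of permutation, in which one column's values are shuffled between records. The staff records and the function `permute_column` are hypothetical, and job title and salary are chosen as an example of logically linked attributes.

```python
import random


def permute_column(records, column, rng):
    """Return copies of the records with one column's values shuffled between them."""
    values = [r[column] for r in records]
    rng.shuffle(values)
    return [{**r, column: v} for r, v in zip(records, values)]


# Hypothetical records; job title and salary are logically linked attributes.
staff = [
    {"job": "CEO", "year_of_birth": 1961, "salary": 250_000},
    {"job": "Analyst", "year_of_birth": 1985, "salary": 42_000},
    {"job": "Analyst", "year_of_birth": 1990, "salary": 40_000},
    {"job": "Cleaner", "year_of_birth": 1978, "salary": 21_000},
]

shuffled = permute_column(staff, "salary", random.Random(7))
for row in shuffled:
    print(row)
```

The shuffled table contains exactly the same set of salaries, so anyone who knows that chief executives are usually the highest paid can guess which figure belongs to the CEO, whichever row it now sits in.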
Businesses should decide on a case-by-case basis which specific anonymisation techniques are suitable for them to use, and reach those decisions with the limitations of each technique in protecting privacy in mind, the Working Party said. More than one measure may have to be deployed in many cases, it said.
"The opinion, whilst useful to an extent, fails to provide any concrete answers to some of the vital questions businesses face when anonymising data," Dautlich said. "For example, in the context of the ICO's guidance, there remains uncertainty about what constitutes a 'remote' chance of re-identification. It is also unclear what the boundaries are for checking data is anonymised and the extent and type of testing that needs to be done. Fundamentally, in an environment where more and more data is being made publicly available, organisations remain unclear on what they need to do to fully anonymise data."
The Working Party also warned that pseudonymisation, which is the substitution of one identifying piece of data with a generic or non-identifying value, is no substitute for proper anonymisation. In particular it said that a common mistake among businesses is to use "the same key" for pseudonymising data across different datasets.
"Data controllers often assume that removing or replacing one or more attributes is enough to make the dataset anonymous," it said. "In many cases it can be as easy to identify an individual in a pseudonymised dataset as with the original data. Extra steps should be taken in order to consider the dataset as anonymised, including removing and generalising attributes or deleting the original data or at least bringing them to a highly aggregated level."
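The "same key" problem can be sketched in Python using a keyed hash as the pseudonymisation function. The use of HMAC-SHA256, the keys, the email address and the function `pseudonymise` are all assumptions made for illustration, not a method described in the opinion.

```python
import hashlib
import hmac


def pseudonymise(identifier, key):
    """Replace an identifier with a keyed hash (HMAC-SHA256), truncated for display."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


# Hypothetical keys and identifier, for illustration only.
shared_key = b"one-key-for-everything"
email = "alice@example.com"

# Same key in two datasets: the pseudonyms match, so the records can be joined.
purchases_id = pseudonymise(email, shared_key)
health_id = pseudonymise(email, shared_key)
print(purchases_id == health_id)

# A separate key per dataset removes that particular link.
purchases_id = pseudonymise(email, b"key-for-purchases")
health_id = pseudonymise(email, b"key-for-health")
print(purchases_id == health_id)
```

Using separate keys only removes one route to linking datasets. In both cases each individual keeps a stable, unique pseudonym within a dataset, which is why the Working Party treats the result as pseudonymised rather than anonymised.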
Dautlich said: "Data is either anonymised or it is not. There is no middle ground. The Working Party is clear that pseudonymisation, as opposed to anonymisation, whilst reducing the likelihood of individual identifications, does not take the data outside of the scope of data protection laws."
"Businesses must therefore be aware that pseudonymising data cannot help them overcome restrictions such as around the transfer of personal data outside of the European Economic Area, nor their need to obtain individuals' consent to proceed with processing of the data in certain cases," he said.