VEIL.AI enables the efficient use of sensitive data

The VEIL.AI service enables sensitive data about individuals to be used in, for example, machine learning applications, commercial use cases and various kinds of research without compromising privacy. VEIL.AI processes data so that the value of the collected data sets is preserved as far as possible while ensuring that individuals can no longer be identified. The application can also be used to produce synthetic data.

Maximising data protection and minimising information loss at the same time

The VEIL.AI developers set out to create a solution that would maximise both data protection and usability, while minimising information loss as well as the time and computational capacity needed to process data. Traditional methods usually require data to be generalised to such an extent that its potential for subsequent use decreases considerably. In addition, traditional methods are poorly suited to anonymising dynamic, continuously accumulating data. Use cases where data comes from several sources, for example from several hospitals, are especially challenging for traditional methods.
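To illustrate the trade-off described above, the following minimal Python sketch shows generalisation as it is used in traditional k-anonymity-style approaches. It is not VEIL.AI's method. The toy records, the helper functions `generalise_age` and `smallest_group`, and the chosen range widths are all hypothetical and serve only as an illustration.

```python
from collections import Counter

def generalise_age(age, bin_width):
    """Replace an exact age with the range of width bin_width that contains it."""
    low = (age // bin_width) * bin_width
    return (low, low + bin_width - 1)

def smallest_group(records, bin_width):
    """Size of the smallest group of records that share the same generalised
    age range and the same three-digit postcode prefix."""
    groups = Counter(
        (generalise_age(age, bin_width), postcode[:3])
        for age, postcode in records
    )
    return min(groups.values())

# Hypothetical toy records: (age, postcode)
records = [
    (34, "00100"), (36, "00120"), (38, "00130"),
    (52, "00140"), (57, "00150"), (59, "00160"),
]

for width in (1, 10, 100):
    # Wider ranges give larger groups (more protection) but less precise ages.
    print(width, smallest_group(records, width))
# prints:
# 1 1
# 10 3
# 100 6
```

With exact ages every record is unique. The groups only grow when the age ranges are widened, and each widening discards detail that a later analysis might have needed.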

VEIL.AI uses neural networks to speed up the computationally heavy processes required for de-identification. It is orders of magnitude faster than other existing methods. It also preserves data quality better and supports the anonymisation of data types that could not previously be anonymised.

VEIL.AI offers solutions to companies, public organisations and major research projects

Many large companies collect a great deal of data but make only limited use of it. Analysing such extensive data sets with machine learning methods can provide companies with significant value, for example when developing new innovations and services.

However, many companies see the use of their data resources as too challenging or risky, particularly in light of the requirements of the EU’s General Data Protection Regulation (GDPR). The anonymisation tools offered by VEIL.AI support companies in utilising their valuable data.  

In the public sector, making use of existing data is considered important, for example in management, in business intelligence, as open data or in so-called secondary use. However, data privacy is the paramount requirement in all of these activities. VEIL.AI has already been verified in several activities of this kind.

Research projects must often use data from several sources. To do this, the relevant organisations must usually either share their data with all project partners or select a trusted third party to pool the data. With VEIL.AI, multi-partner projects become easier, as raw data no longer needs to be shared in order to be pooled. Instead, anonymisation can be done within each organisation so that only anonymised data is shared.
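The data flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not VEIL.AI's actual implementation: the function `anonymise_locally`, the record fields and the hospital data are hypothetical, and dropping an identifier and coarsening an age is only a stand-in for whatever anonymisation each organisation really applies.

```python
def anonymise_locally(records, bin_width=10):
    """Runs inside one organisation (hypothetical stand-in step):
    drops the direct identifier and coarsens the exact age into a range."""
    anonymised = []
    for rec in records:
        low = (rec["age"] // bin_width) * bin_width
        anonymised.append({
            "age_range": f"{low}-{low + bin_width - 1}",
            "diagnosis": rec["diagnosis"],
        })
    return anonymised

# Hypothetical raw data that never leaves each hospital
hospital_a = [
    {"patient_id": "A-001", "age": 34, "diagnosis": "asthma"},
    {"patient_id": "A-002", "age": 57, "diagnosis": "diabetes"},
]
hospital_b = [
    {"patient_id": "B-001", "age": 38, "diagnosis": "asthma"},
    {"patient_id": "B-002", "age": 52, "diagnosis": "diabetes"},
]

# Only the locally anonymised outputs are shared and pooled.
pooled = anonymise_locally(hospital_a) + anonymise_locally(hospital_b)

for row in pooled:
    print(row)
```

The point of the design is the order of the steps: anonymisation happens before the data crosses an organisational boundary, so no partner and no third party ever needs to hold the raw records.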

Infographic describing the VEIL.AI process, categories of sensitive data and possible use cases for these data types.

The infographic above shows how personal data can be either pseudonymised or permanently anonymised with the anonymisation tools offered by VEIL.AI. The application can also be used to produce synthetic data. These data categories can then be used in various applications, such as drug development, smart city concepts and building predictive models.

VEIL.AI supports various data types

VEIL.AI has been developed at the University of Helsinki’s Institute for Molecular Medicine Finland (FIMM) under the leadership of Janna Saarela and Timo Miettinen. The current team also includes three developers. Business development is the responsibility of serial tech entrepreneur Tuomo Pentikäinen.

The VEIL.AI developers have extensive experience in working with patient samples and biobank resources.

“Such research projects have required new tools because no suitable ones have been available. That’s why organisations concentrating on medical research, such as FIMM, are several years ahead of the rest of the world in terms of data protection,” says Tuomo Pentikäinen.

Despite the team’s specialisation in medicine, VEIL.AI can process data in any field that uses individual-level data. Recent applications have included location information as well as image and video data. One of VEIL.AI’s spearheads is the production of synthetic data, which is supported by funding from the Novo Nordisk Foundation.

 

VEIL.AI is capable of producing synthetic data that behaves very similarly to the original data. In this graph, real data (green) is compared with the corresponding synthetic data generated by VEIL.AI (yellow).

The commercial potential of VEIL.AI was explored with New Business from Research Ideas funding from Business Finland during 2018-19. More recently, VEIL.AI has received funding from EIT Digital, where the team collaborates closely with SciLifeLab from Sweden and Philips from the Netherlands. A patent application for the core technology of VEIL.AI has been filed.

“FIMM and their VEIL.AI team are working on very novel concepts. We are in a joint research project, where we produce synthetic data and develop metrics and analysis for assessment of quality and usability of synthetic data. This is very important, as synthetic data is expected to help in some of the most burning problems prevalent in data-intensive HealthTech and drug development. The target of the research is to significantly reduce the lead time of R&D, to reduce or even remove the data breach risks and to improve the quality of data,” says Professor Henning Langberg from the Copenhagen HealthTech cluster and the University of Copenhagen.

First published on 23.11.2018, updated on 25.10.2019.