The value of data
In May 2018 the General Data Protection Regulation (GDPR) will come into effect. Applicable to the entire EU, its aim is to protect the individual rights of citizens while guaranteeing free and secure movement of personal data within the EU. The GDPR is not bad, but it is complex, says André Dekker, professor of Clinical Data Science at MAASTRO Clinic. And so, to err on the safe side, medical data are kept under lock and key. So concerned are we about privacy protection that we’ve lost sight of the value of the data. “We need to be careful not to throw the baby out with the bathwater.”
Dekker has a dream. There will come a day when the consultation between doctor and patient on the right course of treatment will be supported by an objective data system. Would it be better to opt for the most intensive treatment, with all the ensuing side effects? Or for the best quality of life? The system will then indicate, immediately and accurately, which treatment will yield the best result. Currently no such method exists, and the average doctor and healthcare worker are drowning in a sea of clinical data (personal data, diagnoses, imaging, DNA data), medical decisions and irrelevant evidence.
As a result, treatment outcomes are actually extremely uncertain, Dekker says. “You may as well toss a coin to see whether a specific cancer patient will still be alive after x years. So doctors don’t know in advance which treatment is best.” The answer, he believes, lies in artificial intelligence. The systems for making connections between structured data are already in place. There’s only one big problem waiting to be solved: clinical data are spread across thousands of hospitals and, thanks to legislation, aren’t easy to share.
The new European data regulation doesn’t do much to help the situation. Dekker is relatively positive about the GDPR, but many terms need clarification. Take the requirement of ‘de-identification’: the anonymisation of data. This is important, because with anonymous data you can do what you want. But the GDPR does not specify how far this process needs to go. “Images and DNA characteristics can be traced back to an individual person. Are these data still anonymous if there’s only one person they refer to?” The regulation contains many such unclear terms. “The risk is that you take a very conservative approach and, especially in the initial phase, allow nothing at all.” That, in his view, would be the worst application of the European regulation.
André Dekker (1974) is professor of Clinical Data Science at Maastricht University. His research focuses on the development of global data-sharing systems for machine learning by personalised models that can predict cancer outcomes after radiotherapy. Dekker is head of the Department of Research and Education at MAASTRO Clinic and leads the GROW-Maastricht University research division of MAASTRO Knowledge Engineering.
As Dekker sees it, the rationale behind the GDPR is faulty. “Asking patients’ consent started as a way of preventing physical harm during medical procedures, or properly justifying them if necessary. That requirement has now also been applied to privacy risks. Which is strange, because the chance of physical harm as a result of a data leak is minimal. It’s of a completely different order than causing a disability.” The balance is skewed towards privacy protection, which makes it easy to lose sight of the value of the data. “Hospitals go into lockdown and there’s no way to learn from one another which treatment works best. Ultimately it’s to the detriment of science, healthcare and society. Is it worth the price?”
Dekker sees the GDPR as a missed opportunity. He’d rather see a law focused on what can – instead of what can’t – be done with the data. Why do cancer researchers get the data they ask for, while healthcare insurers don’t get the data they need to identify fraudsters? “That’s an ethical question, and one more important than merely controlling the data. How do you set up a system in which you can weigh up these kinds of considerations? That would make for a much broader legal basis than what we have now.”
To solve the problem of the unavailability of data, Dekker came up with a concept for data exchange: the Personal Health Train (PHT). The idea is that if the mountain won’t come to Mohammed, Mohammed will have to go to the mountain – that is, instead of bringing the data to the research, the research goes to the data. The research question is sent to ‘visit’ various data stations (e.g. hospitals), collecting the data it needs in coded form. For example, what are the predictors of lung-cancer outcomes? Having completed its journey, the train comes back with an answer: age x sex + 3 x the size of the tumour - 2 x a certain gene. “That way we learn from other people’s data without having to actually move those data.”
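The train’s itinerary can be sketched as a federated computation. In this minimal sketch – with invented station data and a simple linear model standing in for a real PHT analysis, not Dekker’s actual implementation – each hospital computes only small aggregate statistics locally, and only those aggregates travel back to be combined into a single predictive formula:

```python
# Sketch of the Personal Health Train idea: the analysis visits each data
# station; only aggregate statistics, never patient records, leave a station.
# All station data below are invented for illustration.

# Each "station" (e.g. a hospital) holds predictor rows [age, sex, tumour size]
# and an outcome value per patient.
stations = [
    {"X": [[61, 1, 3.2], [54, 0, 1.8], [70, 1, 4.5]], "y": [2.1, 4.0, 1.2]},
    {"X": [[48, 0, 2.0], [66, 1, 3.9]], "y": [3.5, 1.6]},
]

def local_statistics(X, y):
    """Runs inside the station: sufficient statistics X^T X and X^T y for a
    linear model. Only these small matrices board the train."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return xtx, xty

def solve(A, b):
    """Gaussian elimination with partial pivoting for the small aggregated system."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# The "train": visit every station, sum the local statistics, fit globally.
p = 3
total_xtx = [[0.0] * p for _ in range(p)]
total_xty = [0.0] * p
for s in stations:
    xtx, xty = local_statistics(s["X"], s["y"])
    for i in range(p):
        total_xty[i] += xty[i]
        for j in range(p):
            total_xtx[i][j] += xtx[i][j]

# Weights for age, sex and tumour size – the train's "answer".
coefficients = solve(total_xtx, total_xty)
```

Because the summed statistics are mathematically identical to those computed on the pooled dataset, the combined model is the same one a central analysis would produce, yet no patient-level record ever crosses a hospital boundary.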
The advantage of the PHT is that it avoids the privacy issue entirely. Because no data leave the hospital, there is no risk of a privacy breach. And Dekker doesn’t expect the new regulation to pose an obstacle. On the contrary: the GDPR offers opportunities for many more research trains to be developed. “We already see that happening in the Netherlands. And the fact that we’re leading the way in this respect is a win-win situation. We’re setting an example, showing Europe that this is the right approach.”