Using personal data to understand aging
A new six-year research project shows how patient data can help advance scientific research while maintaining personal data privacy.
We know that our bodies deteriorate as we age, which makes aging the single most important risk factor for so-called degenerative diseases. But what we don’t understand is why this is the case. Even for those people who desperately try to avert it, old age comes with frailty, disability, and ultimately death. Most find this prospect so frightening - scientists not excluded - that unravelling the aging process has not been prioritised and remains grossly understudied.
Today, life expectancy keeps increasing, and we all want to live longer and stay healthy as we age. It is therefore key to understand how our bodies accumulate damage over time, become increasingly dysfunctional, and end up in need of repair. We urgently need new knowledge to help us prevent or at least delay these degenerative diseases.
So far, scientific explorations into the biology of aging have predominantly been performed using worms, flies, and rodents. This is not because it would be unethical to perform aging research on humans, but because these experimental models have conveniently short life spans. Outcomes of experiments can be obtained within weeks to months, whereas human beings would have to be studied for decades before conclusions could be drawn.
The practice of using animal models confronts us with a pressing problem: which findings in worms, flies, and rodents also hold for humans? It is far from easy to extrapolate basic knowledge from animals to clinical findings in older people who suffer from disease. The dawn of bioinformatics, which uses computers to handle ‘big data’, may provide us with a way out by making this kind of causal inference possible.
Abundance of data in Denmark
Denmark and other Scandinavian countries have access to an abundance of data on the entire population by linking registers that date back to the 1960s.
Over the past decades, there has also been significant progress in establishing electronic patient records, and numerous patients have left a biological specimen – for instance a blood sample or a tissue biopsy – in one of the national archives.
The immediate purpose of these records and specimens is to serve the individual patient: to make a diagnosis and to guide treatment. But linking biological specimens with the population registers in Denmark, using the unique personal identifier, also provides an unprecedented opportunity for research.
The ability to handle enormous amounts of data using computer science can deliver a comprehensive understanding of human health and disease. It allows us to test the various biological mechanisms operating in humans. Computer science holds the promise of a great new source of information and tools to improve the prevention and treatment of degenerative diseases.
This is exactly what the CHALLENGE-project is about. This year we - the Center of Healthy Aging - initiated a six-year research project, funded by the Novo Nordisk Fonden, that uses personal data from Statistics Denmark along with biological specimens from Rigshospitalet. The consortium will ultimately shed new light on the relationship between aging and disease.
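As a rough illustration of what such linkage involves, the sketch below joins a register extract with specimen records on a shared identifier. Every field name, identifier, and value here is made up for illustration, and the sketch assumes the personal identifier has already been replaced by a pseudonymous key. It does not describe how the Danish registers or the national archives actually store or link their data.

```python
from collections import defaultdict

# Hypothetical register extract: one row per person, keyed by a pseudonymous ID.
register = [
    {"person_id": "p001", "birth_year": 1941, "municipality": "A"},
    {"person_id": "p002", "birth_year": 1950, "municipality": "B"},
    {"person_id": "p003", "birth_year": 1938, "municipality": "A"},
]

# Hypothetical specimen archive: a person may have zero, one, or several samples.
specimens = [
    {"person_id": "p001", "kind": "blood", "year": 2012},
    {"person_id": "p001", "kind": "biopsy", "year": 2015},
    {"person_id": "p003", "kind": "blood", "year": 2010},
]


def link_records(register_rows, specimen_rows):
    """Attach each person's specimens to their register row via person_id."""
    by_person = defaultdict(list)
    for s in specimen_rows:
        by_person[s["person_id"]].append({"kind": s["kind"], "year": s["year"]})

    linked = []
    for row in register_rows:
        # People without any specimen keep their register row with an empty list.
        samples = by_person.get(row["person_id"], [])
        linked.append({**row, "specimens": samples})
    return linked


linked = link_records(register, specimens)
for row in linked:
    print(row["person_id"], row["birth_year"], len(row["specimens"]))
```

The point of the shared key is that one person's register history and their samples can be brought together without matching on names or addresses.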
Ethics is key
Few would argue against using historic data and past patient histories to improve outcomes for patients today. But there are increasing ethical concerns about the use of personal data for scientific research. Such research must be carried out in a way that keeps the intrusion into people’s privacy to a minimum and proportionate to the benefits for individuals and society as a whole.
In an era where the use and abuse of personal data are on the rise, trust in those who explore personal data is eroding. The public is increasingly concerned about a lack of privacy and about losing control of their own data. The use of personal data for scientific exploration is only sustainable when it balances the rights and interests of the individual with those of society as a whole.
The CHALLENGE-consortium strives to increase awareness of how scientific research can be carried out in a way that minimises intrusion into people’s privacy. As part of that ambition, we’ve created the website dataforgood.science to explain why and how we work, and to address the ethical dilemmas that we face as we go along.
There are two categories of personal data: non-sensitive personal data, and sensitive or special categories of personal data. Non-sensitive personal data include names, addresses, dates of birth, and phone numbers. Special categories of personal data are considered sensitive and include racial or ethnic origin, political opinions, religious and philosophical beliefs, trade union membership, health, sexual orientation, and biometric and genetic data.
This distinction is important, as the recently introduced European Union General Data Protection Regulation (GDPR) prohibits, in principle, the processing of sensitive personal data. This prohibition can only be set aside if explicit consent is given by the person, or if national or EU legislation allows for the processing of these data.
The essence of the CHALLENGE-project is therefore that personal data are only handled within the firewalls of Statistics Denmark by authorised experts. The personal data will never leave the offices of Statistics Denmark and will never be shared with others. Only the results of this research will be shared with scientists, doctors, and patients alike. In that way, we can pass on knowledge without passing on individuals’ data.
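One common way to share results without sharing individuals’ data is to release only group-level counts and to withhold groups that are too small. The sketch below illustrates that idea under stated assumptions: the threshold of 5, the function name, and the records are all hypothetical, and nothing here describes the actual disclosure rules applied by Statistics Denmark or the CHALLENGE-consortium.

```python
from collections import Counter

# Hypothetical threshold; real disclosure rules are set by the data custodian.
MIN_CELL_SIZE = 5


def aggregate_for_release(records, group_key, min_cell_size=MIN_CELL_SIZE):
    """Count records per group and suppress groups too small to release."""
    counts = Counter(r[group_key] for r in records)
    released = {}
    for group, n in sorted(counts.items()):
        # None marks a suppressed cell: the count is not released at all.
        released[group] = n if n >= min_cell_size else None
    return released


# Made-up individual-level rows that would stay inside the secure environment.
records = (
    [{"age_band": "60-69", "diagnosis": "x"}] * 12
    + [{"age_band": "70-79", "diagnosis": "x"}] * 7
    + [{"age_band": "90+", "diagnosis": "x"}] * 2
)

table = aggregate_for_release(records, "age_band")
print(table)
```

Only the resulting table would leave the secure environment. The group with two people is withheld because such a small count could point to identifiable individuals.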
As an example, when confronted with a new patient who has developed a degenerative disease, the doctor can compare that single patient’s data with the characteristics and outcomes of a large group of similar past patients and use this comparison to optimise medical treatment.
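A minimal sketch of such a comparison might look as follows. It assumes that "similar" means the same diagnosis and an age within five years of the new patient, and that the outcome is a single number per past patient. Both are simplifying assumptions made for illustration, as are all names and values in the code.

```python
from statistics import mean

# Made-up past patients; the outcome field is a hypothetical measure.
past_patients = [
    {"age": 72, "diagnosis": "d1", "treatment": "A", "good_years": 6.0},
    {"age": 75, "diagnosis": "d1", "treatment": "A", "good_years": 4.0},
    {"age": 70, "diagnosis": "d1", "treatment": "B", "good_years": 7.0},
    {"age": 74, "diagnosis": "d1", "treatment": "B", "good_years": 9.0},
    {"age": 55, "diagnosis": "d1", "treatment": "B", "good_years": 12.0},
    {"age": 73, "diagnosis": "d2", "treatment": "A", "good_years": 3.0},
]


def outcomes_for_similar(new_patient, cohort, age_window=5):
    """Summarise outcomes per treatment among past patients similar to new_patient."""
    similar = [
        p for p in cohort
        if p["diagnosis"] == new_patient["diagnosis"]
        and abs(p["age"] - new_patient["age"]) <= age_window
    ]
    by_treatment = {}
    for p in similar:
        by_treatment.setdefault(p["treatment"], []).append(p["good_years"])
    return {
        t: {"n": len(values), "mean_outcome": mean(values)}
        for t, values in sorted(by_treatment.items())
    }


summary = outcomes_for_similar({"age": 73, "diagnosis": "d1"}, past_patients)
print(summary)
```

A summary like this is descriptive only. Differences between treatment groups may reflect who received which treatment rather than what the treatment did, so it can inform a doctor’s judgment but not replace it.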
The CHALLENGE-way of working shows that scientific exploration of personal data can be performed in an ethically responsible way while individuals keep control of their personal data.