The World Economic Forum and the Government of Finland, in collaboration with Edelman, Splunk and the Patrick J. McGovern Foundation, have developed a new approach to data management that will improve the way extremely sensitive data is used in the City of Helsinki. Helsinki is one of the most advanced cities globally in the adoption and use of digital tools and data.
The Finnish capital is pioneering when it comes to delivering services and accessibility. Collaborating with the Forum, the city authorities have developed an innovative blueprint for using the power of data and analytics to stay one step ahead and proactively improve the lives of residents.
Helsinki has implemented this blueprint, which separates the storage, anonymization and processing of data and assigns each task to a different individual. The objective was to safely provide all city residents with new, personalized and targeted services when they need them, reducing inefficiencies and bureaucracy.
We’re motivated to develop real tools for on-the-ground action because human-centric and society-serving approaches to data are not just a ‘nice to have’, but the foundation for thriving societies.
Sebastian Buckup, Head of Centre for the Fourth Industrial Revolution Network, WEF
This approach has radically simplified the process for parents of preschool-age children who are looking for care in suitable daycare centers across the city. In January 2021, over 5,500 families received text messages suggesting pre-primary education places for their children and nine out of ten families accepted the offer.
New technologies, such as artificial intelligence, internet of things and the metaverse, demand data as the foundational resource for solving systemic challenges, from pandemic response to climate change. Yet despite an abundance of both supply and demand, the evolution from data to insight still presents many challenges.
On the one hand, data often remains isolated within territorial boundaries and corporate environments and is unavailable to benefit people, society and the planet. On the other, the type of governance needed to assure proper oversight, transparency and accountability by those using data is still being understood.
As the data universe expands, it becomes exponentially more complex, requiring solutions that integrate political, economic, social, environmental, technological and, most importantly, human aspects.
During a four-month sprint, new pathways, processes, and tools were created to document the best practice blueprint, which would inform and improve future use. The Helsinki experiment was defined as follows:
- A modular process for utilizing personal data when developing city operations
- A generalized, formal data request template that allows the data controller to make an informed decision on authorizing data use
- A generalized schema for a hybrid cloud computing environment for anonymizing and analyzing personal data. This complies with current regulations and policy on processing sensitive personal data in a cloud environment.
- A service design model for integrating insights from this white paper into practice, as part of data-based proactive services
The main outcomes have been outlined in the report “Empowered Data Societies: A Human-Centric Approach to Data Relationships”.
Outcomes
- The Helsinki Process for Utilizing Personal Data is a management tool created to enable efficient data utilization. It defines stakeholders, roles, goals, and tasks when utilizing data, from the introductory stages to setting up the data environment and sharing results. The process defines the steps of a data-based operations development process. In addition, for each state and transition, the common roles, goals, and tasks are defined. The framework outlines the data utilization process, enables mapping of status and required actions, and improves communication between stakeholders. To support this process, we created a data request template and a service design model to engage customers (service providing teams in Helsinki) and ensure personal data could be utilized in a responsible, human-centric way.
- The Helsinki Template for Data Request was developed to help plan the data utilization process. Within the template, the potential data user describes the analysis (or other use of the data) and the controller describes the data. This tool covers all aspects of a data utilization process required for the controller to make an informed decision on authorizing data use. It functions as a basis for a project plan and documents the permissions granted on data.
- The Helsinki Personal Data Hybrid Cloud Architecture is a new, technology-agnostic blueprint for a GDPR-compliant hybrid cloud. It offers a veil of anonymity for maximum data privacy, allowing efficient utilization of extremely sensitive data. The main principle behind the blueprint is that the storage, anonymization, and processing of data are separated, and that different individuals perform each task. The data centre has a data lake-like unit for storing sets of personal data, and a separate anonymization server is set up for each anonymization task. These servers are temporary and are cleaned and returned to the pool when no longer needed. The anonymized data can then be pushed to the cloud. If weak anonymization is performed, or the level of anonymization cannot be adequately assessed, data processing can be performed on another temporary server for added protection.
- Helsinki Anonymizer is an open-source software toolbox, developed for anonymizing unstructured or weakly structured data. “Anonymization” is a common buzzword in the data industry and various definitions are used. Helsinki follows the Finnish Social Science Data Archive definitions for anonymization. There are tools for GDPR-compliant anonymization of structured data, such as OpenAIRE Amnesia, but the anonymization process for non-structured data (contained in free text fields, voice and images) is more complicated. The Helsinki Anonymizer is a collection of tools and practices for anonymization that are easy to adopt and can be used in a variety of projects.
- The Helsinki data-based service staircase is a modular tool for project management, aimed at developing public services. It is based on the principles and insights presented in this paper and provides a framework for enhancing services using data, by incorporating the principles of empowering the customer – earning and deserving their trust, and serving them proactively. Within this model, data utilization is categorized into five levels: (1) human operator with restricted capacity and access for data utilization (this is where most services are at the moment); (2) local restricted automation such as chatbots and automated handling of forms and applications; (3) federated learning, a decentralized computational scheme that enables complex personalized services with great privacy; (4) centralized modelling (i.e., the standard data utilization process where data is collected in a centralized registry and processed jointly); and (5) research and development or scientific use.
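The separation principle behind the hybrid cloud architecture described above can be illustrated with a small sketch. This is not the city's implementation: the field names, the toy anonymization rule, the sample record and all function names are hypothetical, chosen only to show a temporary server being borrowed from a pool, wiped and returned, with only anonymized data leaving for the cloud.

```python
from contextlib import contextmanager

# Hypothetical field names treated as direct identifiers in this sketch.
DIRECT_IDENTIFIERS = {"name", "address"}


@contextmanager
def temporary_server(pool):
    """Borrow a scratch server from the pool; wipe and return it when done."""
    server = pool.pop()
    try:
        yield server
    finally:
        server.clear()       # clean the server ...
        pool.append(server)  # ... and return it to the pool


def require_distinct_people(storage_admin, anonymizer, analyst):
    """Storage, anonymization and processing must each have a different person."""
    if len({storage_admin, anonymizer, analyst}) < 3:
        raise PermissionError("each task needs a different individual")


def anonymize(records):
    """Toy rule (an assumption): drop identifiers, coarsen birth year to a decade."""
    result = []
    for record in records:
        kept = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        kept["birth_decade"] = kept.pop("birth_year") // 10 * 10
        result.append(kept)
    return result


def run_pipeline(data_lake, dataset, pool, cloud,
                 storage_admin, anonymizer, analyst):
    """Copy raw data to a temporary server and push only anonymized data out."""
    require_distinct_people(storage_admin, anonymizer, analyst)
    with temporary_server(pool) as server:
        server["raw"] = data_lake[dataset]         # stays inside the data centre
        cloud[dataset] = anonymize(server["raw"])  # only this reaches the cloud
    return cloud[dataset]


# Made-up sample data, for illustration only.
data_lake = {"daycare": [{"name": "A. Example", "address": "Street 1",
                          "birth_year": 2016, "district": "Kallio"}]}
pool = [{}, {}]
cloud = {}
result = run_pipeline(data_lake, "daycare", pool, cloud,
                      "custodian", "anonymizer", "analyst")
print(result)
```

Dropping two fields and coarsening one is far weaker than the anonymization the architecture calls for; the sketch only shows where such a step would sit in the flow.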
Each of the five levels is independent of the others, but the risks and benefits increase when allowing data utilization on higher levels. Local models enable faster service, independent of time and location. Federated learning allows multiple users to benefit from one another, using highly complex and personalized models without sharing the data itself. Centralized modelling allows for maximum utilization of a single user’s data, and scientific use allows fast and responsive development of services and democratic decision-making.
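To make the federated learning level more concrete, here is a minimal sketch of federated averaging (FedAvg), one common scheme of this kind. The report does not say which algorithm Helsinki would use, so the choice of FedAvg, the tiny linear model, the learning rate and the made-up client data are all assumptions for illustration. Each client trains on its own data and only model parameters are sent for averaging.

```python
def local_update(weights, data, lr=0.05, epochs=20):
    """Run gradient descent for y ~ w*x + b on one client's private data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Average client parameters, weighted by each client's number of samples."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def train(clients, rounds=50):
    """Coordinator loop: it only ever sees parameters, never the raw data."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights


# Made-up data: two clients whose points all lie on the line y = 2x + 1.
clients = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
]
w, b = train(clients)
print(round(w, 2), round(b, 2))
```

Neither client could fit the line as reliably from its own few points, yet the shared model recovers it without any raw record leaving a client. Real deployments typically add protections such as secure aggregation, because model parameters themselves can leak information.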
Helsinki provides an example of a great start, but the long road to fully empowered data societies is anything but certain. The delicate balance between enabling innovation and respecting rights and agency is constantly being tested, as much through newly proposed regulation as through popular advocacy, corporate practices, and individual decisions.
Many factors contribute to finding the optimal combination: establishing trust between people and organizations (data subjects and data collectors), mixing legislative foundations with respectful and conscientious best practices, looking at complex data journeys and understanding them holistically, and incorporating moral and ethical considerations into the design of data-based solutions. Putting it all together is not easy, but the failure to do so could lead to a data dystopia: eroding trust in everyday digital interactions and minimizing, rather than encouraging, the creation of data. Data which, if transformed into socially minded insights, could be the key so urgently needed for resolving some of today’s most crucial challenges.