Unicorn–Open science for assessing environmental state, human health and regional economy

Pekka Neittaanmäki; Timo Huttula; Juha Karvanen; Tom Frisk; Jouni Tuomisto; Antti Simola; Tero Tuovinen; Janne Ropponen

doi:10.3897/rio.2.e9232

List of participants

Consortium leader (PI):

Prof. Pekka Neittaanmäki, Dept. of Math. Information Technology, JYU.

Group leaders:

Prof. Juha Karvanen, Dept. of Mathematics and Statistics, JYU (Secondary PI)
Prof. Tom Frisk, Pirkanmaa Centre for Economic Development, Transport and the Environment
Adj. prof. Jouni Tuomisto, Dept. of Health Protection, National Institute for Health and Welfare
Adj. prof. Juha Honkatukia, VATT Institute for Economic Research
Adj. prof. Timo Huttula, Finnish Environment Institute (SYKE)

Third parties involved in the project

Associated partners:

Anne Mäkynen, Council of Tampere Region
Laura Höijer, Ministry of the Environment
Jyrki Huikari, Ministry of Social Affairs and Health
Jaana Husu-Kallio, Min. of Agriculture and Forestry
Peter Tattersall, Hahmota Oy
Antti Poikola, Open Knowledge Finland Ry

State of the art and preliminary work

This research project will be a game-changer in political and governmental decision making by enabling the use of scientific information, which is accessed and analyzed in a unique framework containing multimodal data from different sources and most novel data analysis tools to interpret it. The aim of this project is to provide efficient tools that will work for both decision-making processes (Reichert et al. 2015) and research problems. The tools enable data mining from various sources, data analysis and processing, comparison and simulations using the extracted data.

We are in a situation where a lot of good-quality scientific data is openly available in national and international databases (avoindata.fi, www.data.gov, thegovlab.org). There is also a clear societal drive to provide and use open source models and evidence-based decision making, and to support their open use and development.

A major problem is that there is no single physical or virtual place where all this data can be combined, preprocessed, analyzed and visualized easily by researchers for decision makers. The data preprocessing routines are laborious, and it is typical that every researcher and software provider develops their own unique routines for this to facilitate their own work. However, the most important societal needs require collaborative multidisciplinary attempts to solve relevant problems or questions. The processes must be reliable, reproducible, and transparent to support these studies effectively and efficiently. Shared practices, tools, data, working environments and concerted actions are the way forward to improve science and decision support. This is true in all areas, but in this project we will start from environment, human health, and the regional economy, as they are complex and challenging enough to offer a good test bed for general development.

It is not enough that experts push data to politicians. There must be practices for mutual communication: experts must answer policy questions in a defendable and useful way; decision makers must more clearly explain their views using evidence; and there must be ICT tools to support this exchange. The focus is on end-users. This consortium has already developed and tested prototypes of such practices and tools in several projects, and is now ready to apply them in society on a large scale.

The partners of Unicorn are highly competent in their respective fields in handling data at every stage of a decision process and in producing useful, timely, and accurate information for the decision makers. The results of such work have to be understandable and traceable back to original data: no black box solutions. Unicorn can provide that and also critical comments and recommendations during the work.

The consortium relies on the latest methods and tools from statistics, data processing and simulation. Some researchers will focus on the methodology of the Unicorn environment. Mathematical and statistical methods will be tailored and tested inside the environment. Methods for processing information and big data will be considered from theoretical and practical points of view. Skilled people will construct the interfaces between e.g. databases, preprocessing tools, mathematical models, and post-processing tools. The utility of the environment will be demonstrated with case studies. The environment will not be a single product in the traditional sense, but a systematic and coherent set of tools and practices, useful for everyone in different ways.

In this project, we have chosen open policy practice as the basis for the work. Open policy practice is an existing method for decision support (Pohjola 2011). It has been used e.g. at THL in several cases. It has been shown to be a flexible method and implements many important properties of good scientific and policy processes.

The Unicorn environment builds upon already existing projects, models and case studies at the national and international level, complementing and improving on the knowledge generated by them. Most of these projects involve several members of this consortium, which will make interactions between projects easier to establish. For example, Opasnet, the web workspace for performing open assessments and facilitating open policy practice, will be used. The present project also has a close connection with the ongoing CONPAT project funded by the Academy of Finland AKVA research program, where investigation of the health effects and economic implications of microbial and chemical contaminants in the Kokemäenjoki River watershed downstream from the City of Tampere is already underway. Other existing tools will also be utilized in this project, such as www.jarviwiki.fi and www.vesinetti.fi.

In summary, the proposed project produces a new, open virtual work and modeling environment that combines open information from multiple databases and builds up tools for efficient policy studies. This will improve decision-making processes in Finland and other countries and greatly streamline the workflow in multidisciplinary projects. Previously, similar projects have been done in more closed and less reusable settings. Georeferenced data in particular will be utilized in more efficient ways in several case studies.

Open data and models are a megatrend and will change the world. Unicorn directs this trend to paths that are the most beneficial for societal decision making by providing quick, reliable and efficient decision support. This improves the Finnish political and hence economic infrastructure. Significant saving of resources will be manifested with improved data collection, analyses and modeling. Also, the quality and amount of assessments that can be done to support work in e.g. municipalities are significantly increased.

Relation to the work programme

Objectives, Concept and Approach

The objectives of the project are:

To develop a versatile open web environment for decision support and related data storage and modeling.
To develop interfaces between existing data sources and models and the Unicorn environment to facilitate their universal use.
To provide and implement tools for producing new open datasets and models to complement and expand the system.
To develop a user forum and community of researchers, developers, policy makers, and stakeholders for creating shared understanding about policy-related science.
To modernize the working practices of researchers and decision makers e.g. by improving their use of existing data and models.
To test, implement, and demonstrate large, multidisciplinary studies utilizing novel, efficient practices and environments.
We hypothesize that the major challenges related to evidence-based decision making are actually about changing the practices of researchers and decision makers. However, the change is slow and discussion feeble until there are practical tools (such as the Unicorn environment) to support and demonstrate the new practices.

Scientific breakthroughs and progress seen

Unicorn has top modelers and researchers from several different fields of research. In addition, they already possess unique expertise and functionalities that can be directly used for developing the Unicorn environment. In this very consortium, there is a high potential for synergism by combining different data sources, models and platforms, and decision support practices.

The environment will be adaptable to other disciplines, methods will be flexible, and assessments can be checked and re-used anywhere in the world, thus increasing the reliability and penetration of knowledge. It has the potential to become a standard for decision support expert work. The government research institutes such as SYKE, THL and VATT have a specific interest in the long-term maintenance of an environment that supports their main objective. For the publication plan, see the Interaction plan.

Exporting of knowledge of best practices

Finland is famous for producing many crucial information-related products that are now used worldwide as de facto industry standards: Linux operating system, SSH encryption protocol, MySQL database, SMS text message. All these breakthroughs were developed in Finland and distributed freely for anyone to use. These products have changed the global market of their respective fields. They have been game changers that forced everyone to rethink what an operating system or database means. They have also created a rich ecosystem of service providers on top of the key innovation, rather than producing direct sales income.

Our aim is similar. The revenue comes from selling expertise or knowledge services enabled by the Unicorn environment or its derivatives. Free distribution of the main innovation is actually crucial in spreading the idea to worldwide use and in creating global demand for the product. If the Finnish private sector is ready, it will get the first shares of the new market. However, it should be remembered that the societal benefits of implementing open knowledge practices are larger than the incoming cash flow stimulated by them and materialize as quicker and more justifiable societal actions.

Introduction

We will develop the Unicorn environment using flexible exploratory methodology and design science. During the 6 years of the project, we need at least two development stages. The architecture must be modular to provide flexibility with interface generations. The first stage focuses on reusability of components, while the second focuses on efficiency.

An open map service will be a key functionality for using location-based information and showing results on a map. The project will complement existing data with focused experiments. A geographical pilot will be the waters downstream of Tampere, where we focus on water quality, accidental releases, their health effects and risk management, and the regional economy and air quality. The background studies have been made in the CONPAT project funded by the Academy of Finland.

Other pilots are in the Kuopio region in Northern Savo, where we look at the exploitation of natural resources and its unforeseen risks, and in the Kainuu region near the Talvivaara site, where SYKE and JYU have existing aquatic projects (e.g. MINEVIEW). VATT has produced long-run structural economic projections for each NUTS 3 level region of Finland. (The Nomenclature of Territorial Units for Statistics (NUTS) is the EU standard for referencing the subdivisions of countries for statistical purposes; there are currently 19 NUTS 3 level regions in Finland (maakunnat).) The work is implemented with regional computable general equilibrium (CGE) modeling techniques, and it uses dialogue with regional experts to exploit their tacit knowledge, e.g. about working population trends.

Research materials and their management plan

Material management during and after the project will be at the very core of the project. The Unicorn environment has to provide robust and efficient queries across many databases while, at the same time, respecting the restrictions and rules that apply to the available data. Meanwhile, all the studies should be reproducible. Moreover, even though the aim is to use an open approach, privacy settings and authentication are needed during the research phase. The aim is to tackle these difficulties when the requirement specification is implemented.

We will utilize several existing databases and platforms. Opasnet workspace, designed for decision support, has open models on e.g. burden of disease, air pollution, and contaminants in food. These will be utilized and further developed in real policy cases (see e.g. WP5). Also other cases will be implemented; tentative topics include radon and dioxins where data is already available. An overall objective is to produce a holistic burden of disease model covering all major environmental and lifestyle factors in Finland; however, only a part of this will materialize within this project.

RYMY is a national water- and foodborne outbreak notification and reporting system maintained by the National Food Safety Authority Evira and THL. Drinking water and bathing water monitoring data comprising about 120 000 data points per year will be collected via this reporting system, tentatively beginning from 2016. We will assess the quality of drinking water together with other relevant environment and health data. We will utilize surveillance and quality data for management actions and decision-making at different administrative levels. Data on environmental health exposure is made available through the YHTI database. The VAHTI database is an emissions control and monitoring database of the Finnish Environmental Administration. We will use EU-INSPIRE Directive compliant nationwide open spatial data sets, such as the Finnish Meteorological Institute’s open meteorological data, which are already being utilized by SYKE’s group participating in this project. The National Land Survey’s relevant data products, such as elevation data, are also included, as well as SYKE’s lake depth data products.

The environmental databases at SYKE contain nationwide time series of hydrological observations, including surface waters and representative sites for groundwater. Similarly, SYKE’s Hertta system contains hydrochemical and hydrobiological data. The interfaces to this data warehouse are under construction and will be completed in the beginning of 2016. Another important data source maintained by SYKE is the operational WSFS-Vemala model (Huttunen et al. 2015). It produces hydrological and water quality forecasts in an operational way. The system has been used extensively for flood warning and water resource simulations. It has also been a key tool in forecasting the fate and transport of recent accidental pollution releases from Talvivaara Mine and Nordisk Nickel Mine in Finland. The interfaces for this system already exist, and it will be connected to Unicorn as a data source. Another important data source is provided by the ENVIBASE platform.

Health impacts are like a pyramid: severe rare cases on top and milder common symptoms at the bottom; severe cases are more likely to be recorded. At THL the surveillance pyramid has (i) deaths registered by Statistics Finland, (ii) hospitalizations registered by THL in the Hilmo register, (iii) microbiologically confirmed cases registered by THL in the National Infectious Disease Register (TTR), and (iv) visits to a doctor in primary health care, registered by THL in the AvoHilmo register.

VATT models are based on detailed national input-output tables (Honkatukia 2013, Honkatukia and Simola 2011) and national and regional account system time series starting from 1975, which are provided by Statistics Finland. Official regional input-output tables have not been created since 2002, but the VATT modeling unit creates them with a gravity-based method for all the NUTS 3 level regions. The created regional input-output tables are distributed to other research organizations at the aggregated level determined by Statistics Finland. The model parameterization draws on several sources such as Finnish Longitudinal Employer-Employee Data (FLEED), Structures of Earnings, Population Structure, Population Projection, and Statistics on the Finances of Agricultural and Forestry Enterprises (MMYTT). Additionally, the model baselines draw on Ministry of Finance long-term economic predictions and regional level expert assessments. The model results are used for anticipating long-term occupational needs in order to better allocate educational resources.
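The idea behind a gravity-based regionalization can be illustrated with a minimal sketch. It is not the VATT procedure, whose details are not described here. It only assumes the textbook form of a gravity model, in which the flow of a commodity from region i to region j grows with the supply of i and the demand of j and decays with the distance between them, followed by iterative proportional fitting so that the flows add up to known regional totals. The function name, the distance exponent beta, and all numbers below are hypothetical and chosen for illustration only.

```python
def gravity_trade_flows(supply, demand, distance, beta=1.0,
                        max_iter=1000, tol=1e-10):
    """Estimate flows[i][j] of one commodity from region i to region j.

    Illustrative sketch only: a gravity prior balanced by iterative
    proportional fitting. All inputs are hypothetical toy values.
    """
    n = len(supply)
    if abs(sum(supply) - sum(demand)) > 1e-9 * sum(supply):
        raise ValueError("total supply must equal total demand")

    # Gravity prior: proportional to origin supply and destination demand,
    # decaying with distance (the diagonal holds an intra-regional distance).
    flows = [[supply[i] * demand[j] / distance[i][j] ** beta
              for j in range(n)] for i in range(n)]

    for _ in range(max_iter):
        # Scale each row so that it sums to the region's supply.
        for i in range(n):
            factor = supply[i] / sum(flows[i])
            flows[i] = [f * factor for f in flows[i]]
        # Scale each column so that it sums to the region's demand.
        for j in range(n):
            factor = demand[j] / sum(flows[i][j] for i in range(n))
            for i in range(n):
                flows[i][j] *= factor
        # Columns now match exactly; stop once the rows match as well.
        if all(abs(sum(flows[i]) - supply[i]) <= tol * supply[i]
               for i in range(n)):
            break
    return flows


# Toy example with three hypothetical regions (not real statistics).
supply = [100.0, 50.0, 30.0]
demand = [80.0, 60.0, 40.0]
distance = [[20.0, 100.0, 200.0],
            [100.0, 20.0, 150.0],
            [200.0, 150.0, 20.0]]

flows = gravity_trade_flows(supply, demand, distance)
for row in flows:
    print([round(f, 2) for f in row])
```

The balancing step is what makes the estimated flows consistent with the regional totals, which is the property an input-output table built from them would need. A real application would estimate the distance decay from data instead of fixing it.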

Scientific and technical challenges

Societal challenges

Sustainability model

Impact

In this project, we are developing the Unicorn work environment and practices as a more efficient way to process large amounts of varying data using the latest methods and tools. The aim is to utilize the current data and technology revolution to its full potential. Transparent decision making and openness are a megatrend. Data and mathematical methods should be open and re-usable by researchers and policy experts. This will be a key means for governments to discuss and construct trustworthy opinions.

The technology revolution of open data and open models will completely change the way we think about evidence-based decision making (Reichert et al. 2015). Instead of a traditional long chain of static information products such as scientific articles, reviews, expert reports, policy papers, and finally decision recommendations, we can re-think this process as collaborative information collection work. The objective of the process, such as a law about a particular topic, acts as the starting point and is published in a collaborative environment. The environment is then used as a forum for discussion, assessments, comparing modeled impacts, developing solutions, and finally, based on all this, writing the content of the law. Others can later revisit the proposed law and its rationale.

This process is much more complex than a typical collaborative writing task. Indeed, it is based on large assessments about impacts of policy options and thorough expertise on the underlying questions, both scientific and value-based. Still, we have been able to identify critical rules and practices for such a process, and to develop tools to support the practices. We have also developed tools to collect estimates from experts for computational models. Although the initial development of models may be laborious, they are designed to be re-usable within the environment in other, similar cases.

As this practice becomes more prevalent, we expect that society will start demanding better evidence to back up a decision before it is socially accepted. In other words, it reduces the survival of poor policy initiatives.

Concrete manifestations of the technology revolution and the benefit to Finland

Open data and models will change the world, and Unicorn directs this trend to paths that are the most beneficial for societal decision making by providing quick, reliable and efficient decision support. The government research institutes spend hundreds of person-years in old-fashioned data analyses and modeling efforts in decision support. Significant saving of resources will be manifested with improved data collection, analyses and modeling. Also, the quality and amount of assessments that can be done to support work in e.g. municipalities are significantly increased.

The virtual environment will be initially set up with ecological, health, and economic aspects (including complex interactions and spatial data). However, the system will be applicable to other sectors and we expect expansion of use. The virtual environment has great potential in Finland and also export value: economically, we see a market for expert-based consulting and modeling in all societies utilizing technological breakthroughs.

National databases make it possible to create an assessment and simulation tool that can be used for answering local questions, such as mercury in a local lake (see WP5). This would improve aquatic modeling and also the current general-level recommendations on fish intake, which may partly lead to unnecessary avoidance of fish in the diet or anxiety.

Human activities, institutions, and behavioral changes that are needed to exploit the tech. revolution

Unicorn partners already have practical experience in virtual environments and online modeling. There are technical challenges like user and data exchange interfaces (see WP3 for our solutions). However, our experience has been that cultural and learning challenges are clearly larger: e.g. experts often do not accept the principles of openness of data and criticism by non-experts; there are worries about merit accumulation; and open programming languages are not familiar. Constant communication, positive examples, technical support, systematic small-scale testing and incremental improvements, and political support from the employer and research funders can overcome these challenges. The policy sector also has its challenges: unfamiliarity with assessments and scientific data, and lack of resources and time in single policy processes. The previously mentioned methods apply here, but in addition, experts have to listen carefully to the information needs of decision makers. See WP2 for our solutions.

Public measures to best support the process of change in such a way that the transformation proceeds in a controlled manner and ends with Finland benefiting from the technology revolution

The consortium will work closely with the governmental stakeholders in the development and utilization of the Unicorn environment and the open developer forum. Pirkanmaa Centre for Economic Development, Transport and the Environment, a key actor in its region, will coordinate collaboration with stakeholders. There are also committed associated partners collaborating with the consortium. They include the ministries of the environment (YM), social affairs and health (STM) and agriculture and forestry (MMM), the National Land Survey of Finland and the National Supervisory Authority for Welfare and Health (Valvira), the Council of Tampere Region, Open Knowledge Finland Ry and Hahmota Oy. We are also actively collaborating with Kuntaliitto (association of municipalities) in this area. We will actively spread information and recruit new associated partners during the course of the project. We are especially interested in finding them in other geographical regions of Finland.

The Unicorn consortium will utilize the data resources generated during the project as part of the routine legal requirements of environmental monitoring and public health response. For example, local general practitioners in all municipalities of the country report citizen health information to the AvoHilmo database. In the near future, local health protection authorities or local laboratories will also submit their data to the YHTI environmental health monitoring database. The central government currently has a number of different projects under way to develop the sharing of information resources. The project will support the development of coordination and aims at the effective use of resources by various institutions. Promoting the use of models provides significant advantages in the production and utilization of information.

Science is powerful in rejecting ideas that are not consistent with observations. Therefore, science should be actively used to estimate what impacts the actions considered could or could not have. Ineffective actions are rejected; uncertain actions are tested on a small scale, and then poor actions can be rejected. Thus, the role of experts is to reject poor ideas, and the role of decision makers is to choose among the remaining good ones.

Equal adaptation of capabilities and human resources in the individual, group and institutional level for the system reform

Experts' know-how is exported as much as possible in the form of automated tools to enable more efficient use of time and resources, which can then be targeted at making reliable estimates instead of mechanical data processing. People think that they have a right to be heard and to have their opinions and concerns acknowledged. Conflicts in a society often occur because some group thinks they were not heard and cannot influence their own case; then they lose trust. Shared understanding is a systematic method that has been developed and implemented at THL. It offers a channel for citizens to be heard in this project as well.

Implementation

The project implementation is conducted in seven work packages (Fig. 1).

Figure 1. Links and interactions between the work packages

WP1. State-of-the-art review and inventory of the resources available

Responsible group leader: Prof. Pekka Neittaanmäki (Dep. of Math. Inf. Tech. JYU)

Research Group: Ph.D. Tero Tuovinen, Ph.D. Annemari Soranto

Schedule: M1-M5, M37-M41, (5+5)

Linking to: WP2

Description: In this work package we will identify the best practices of available open databases and models. Already publicly available projects, codes, interfaces and modeling frameworks that can help in the realization of Unicorn will be listed and studied. These include domestic databases and modeling environments such as THL’s Opasnet, and international codes such as the Open Modeling Interface, OpenDA data assimilation tools, OpenEarth tools and the Earth System Modeling Framework. Several large-scale domestic ICT networks and projects are under way, such as VALTORI and ENVIBASE. They will be identified, and collaboration will be established when needed.

We are looking for the latest technological advances, especially in big data and data analysis, in order to integrate a system that combines the latest knowledge of tools and data with easy-to-use user interfaces. The work begins by defining the user requirements for the system (Unicorn) and planning for the implementation phase. The required human resources will be chosen after this definition, based on the knowledge that is needed. Human interaction with the system will be a particular focus, because the idea is to spread the solution to a large number of researchers and decision makers. An extensive mapping of open data sources will also be done. It will cover all the state research institutes in order to identify the state of the art.

Tasks:

Reviews of the state of the art.
In-depth familiarization with the requirement field.
Inventory of the databases available.
Preliminary selection of the methods and the tools.

Deliverables:

A review about the methods and technologies that are openly available.
A review of integration interfaces of the databases.
A review about model-data interface standards.
Plan for the next steps.

WP2 Stakeholder contributions and dissemination

Responsible group leader: Research Prof. Tom Frisk (ELY Centre Pirkanmaa)

Research Group: Prof. Pekka Neittaanmäki/JY, M.Sc. Ämer Bilaletdin/ELY, M.Sc. Matti Saura/ELY, Ph.D. Tero Tuovinen, Ph.D. Annemari Soranto/JY, Saija Koljonen/SYKE, M.Sc. Antti Simola/VATT, Kati Valpe/JY

Schedule: M1-M72 (27+36) + (9+12)

Linking to: All workpackages and partners.

Description: Unicorn will develop a unique environment for decision knowledge support. Anyone (company, research institute, university, or parliament) can use it or set up an instance of their own for their own purposes or for distributing their own knowledge to others. This will produce global demand for expertise about the Unicorn environment and practices, and thus dissemination needs (see WP2).

In this workpackage the consortium partners will provide a synthetic view of overall achievements of their multidisciplinary research results and update it during the course of the project. The research outcomes will be used to support the decision making of the stakeholders. Our main role is to identify the benefits of involvement of the stakeholders, to identify appropriate stakeholders and the ways to work with them, to inform about the scope of the research and to share information, and to choose the best techniques for the engagement of stakeholders. The first, and perhaps the most critical, step in the stakeholder engagement process is to identify why the engagement activity is necessary, what outcomes are aimed at, and the scope and the context of the engagement.

Dissemination activities will be focused on two major aspects of the project: firstly, dissemination through events organized by the consortium, like seminars, workshops and information days, and secondly, sharing of project outcomes. Both aspects will be implemented through a variety of means during the project. One dissemination channel will be the project websites, where we will collect all the necessary material about efficient use of the Unicorn environment. The website will archive externally and internally accessible material, such as presentations, minutes from network meetings and workshops, and progress reports. Scientific and technical results will be disseminated to the wider scientific audience throughout the entire duration of the project and beyond by presenting relevant results in scientific journals and conferences. Dissemination will also take place by incorporating the main results of the project into regular university courses.

Furthermore, the aim is to organize several events that support gaining and sharing information. We will hold small workshops, to which we will invite 2-5 specialists on the topic in focus to present and discuss the latest advances and possibilities. Moreover, we will organize a yearly seminar that brings the collaborative network together and updates the status of the project overall. Finally, we will organize a final symposium, providing a common design framework that includes all the models developed in the network.

One way to disseminate during the project will be organized training events. We are interested in the user experience and usability of the environment. By organizing the training events, we will get feedback and first-hand information about the usability of our solutions. Moreover, they will increase the number of end-users and interest in the whole system.

Tasks:

To identify the benefits of involvement of the stakeholders
To identify appropriate stakeholders
To identify the ways to work with stakeholders
To share information with stakeholders
To produce learning material and training for open policy practice.
To facilitate public decision processing by e.g. organizing and synthesizing public discussions.
To produce project websites
To organize events

Deliverables:

Dissemination plan
Project websites
The seminars, workshops and final symposium
Sharing information with stakeholders.

WP3 Implementation of the Unicorn environment

Responsible group leader: Prof. Pekka Neittaanmäki (Dep. of Math. Inf. Tech. JYU)

Research Group: T. Tuovinen, A. Soranto, P. Korhonen/SYKE, M.Sc. A. Simola/VATT, J. Tuomisto/ THL

Schedule: M6-M36, M37-M72, (31+36)(36+36) (144 man-months)

Linking to: WP1, WP2, all other WPs.

Description: This work package is the practical and concrete core of the project. In this package, spanning nearly the whole project duration, we will implement and program the Unicorn environment. In WP3, we define the overall approaches for the implementation (tools, methods, targets) based on feedback from other work packages, especially WP1 and WP2. We will build up the requirements for the system based on the overall objectives and using open discussion. The key functionalities in this package are planning and design of concrete components and structures, building up the basis, programming of the routines, implementing functionalities, verifying the results and solutions, and finally optimizing the Unicorn environment for research use. We will utilize existing open source solutions when available. The implemented solution will be measured by its reliability, usability and efficiency. Recommendations for after-project development will be described and documented. All solutions that are used will be openly documented.

In this package we will implement interfaces to open databases and links between the Unicorn environment and several model codes. Moreover, we will build a test bed and challenge the environment with several of the Big Data analyzers developed. The results will be analyzed and documented for later use. During the project’s first phase, the aim is to build a demonstrator-level environment. In the second phase, we will focus on the efficiency and usability of Unicorn. Our target is not a fully commercial software package, because the production phase would be too time- and resource-consuming. However, because of our open approach, companies can easily use major parts of the implementation after the project. Moreover, specific problems and solutions will open markets for businesses.

Tasks:

To define the requirements.
To design the structure of the Unicorn environment.
To implement the required components.
To build a user interface.
To build interfaces for models and analyzers.
To integrate Unicorn with the databases.
To validate and verify the system.
To write documentation.

Deliverables:

Requirement definition.
Design document of the system.
Implementation plan.
Integrated environment.
Validation and verification report.
Documentation

WP4 Unicorn pilot case: Demonstration related to the environment

Responsible group leader: Adjunct Prof. Timo Huttula

Research Group: PhD. Janne Juntunen, Dr. Tech. Olli Malve, M.Sc. Janne Ropponen, PhD. Saija Koljonen, M.Sc. Niina Kotamäki, M.Sc. Esa Hirvonen, M.Sc. Päivi Korhonen, Ph.D. Antti Simola/VATT

Schedule: M1-M36, M37-M72, (100+80)

Linking to: WP3, WP5

Description: There is strong pressure to migrate from environmental monitoring to environmental modeling because of a need for both more efficient use of resources and more relevant, nationwide results. The goals of this work package are:

To evaluate the applicability of environmental databases for real-world aquatic applications, to test the analysis tools developed in WP1 for extracting environmental information, and to correlate that information with information extracted from other databases.
To assess the functionality, accuracy and usability of aquatic models built within the Unicorn environment compared to models built manually. We will compare the solutions provided by Unicorn with those already produced in the traditional way in the Academy of Finland funded CONPAT project (http://en.opasnet.org/w/CONPAT), which focused on Lake Pyhäjärvi and River Kokemäenjoki. This study will show how well the parameterization and all necessary input information can be created automatically for the computational model. Moreover, it will show the potential time savings gained by using the Unicorn environment compared to manual model building.
To demonstrate the usability of Unicorn in new, previously unmodelled cases:

Accidental release of a harmful substance on the Kallavesi bridges, where the risks to the raw water supply of the City of Kuopio and the economic losses due to reduced fishing capacity are studied.
Fire in a chemical plant in the City of Tampere: atmospheric deposition, risks to human health and economic effects in the region.
Effluent transport in the recipient waters of the Talvivaara mine, and its health and economic effects.
Piloting regional applications of lake-specific models to support environmental management planning and the prioritization of restoration or other measures.

We will show that by using Unicorn we are able to build a usable hydrological or water quality model, or a chain of models, with first-guess parameterizations, and that it can produce reasonable results without extensive model tuning. This enables us to study rapidly developing situations in previously unmodelled areas. The models utilized range from a box model (LLR, Kotamäki et al. 2015) and a 1-D river model (e.g. SOBEK, Ropponen and Huttula 2014) to a 3-D transport model (COHERENS, Luyten 2011). We will also evaluate the need to implement a process-based catchment model (e.g. INCA, Granlund et al. 2004; SWAT, Tattari et al. 2009) within Unicorn to solve hydrological scenarios with high spatial accuracy.

For flow and transport model input we will use bathymetry, a DEM (digital elevation model), hydrological and meteorological data, and data on the simulated state variables, all of which are already available in some form from various sources. For example, the operational hydrological system WSFS provides simulated hydrological data and forecasts (water levels and discharges) for all river and lake systems in the country, and these can be used as input for transport modeling if observational data are unavailable.
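The fallback from observed to simulated input data can be sketched as follows. This is only a minimal illustration under stated assumptions: the function name `build_input_series`, the dictionary-based data layout and all numbers are hypothetical, and no actual WSFS interface is implied. Each value keeps a label of its origin so that later analysis can tell observations from simulated substitutes.

```python
def build_input_series(dates, observed, simulated):
    """Prefer observations; fall back on simulated values where none exist.

    Returns a list of (date, value, source) tuples.
    """
    series = []
    for d in dates:
        if observed.get(d) is not None:
            series.append((d, observed[d], "observed"))
        elif d in simulated:
            series.append((d, simulated[d], "simulated"))
        else:
            raise KeyError(f"no discharge value available for {d}")
    return series


# Hypothetical daily discharges (m3/s); the middle day has no observation.
dates = ["2015-05-01", "2015-05-02", "2015-05-03"]
observed = {"2015-05-01": 41.0, "2015-05-03": 39.5}
simulated = {"2015-05-01": 40.2, "2015-05-02": 40.8, "2015-05-03": 40.1}

for row in build_input_series(dates, observed, simulated):
    print(row)
```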

Reproducing existing, manually crafted models within Unicorn does not show the true potential of the system, since prior knowledge of the challenges encountered during model development will be taken into account when implementing the model setup. Therefore we need complementary demonstrations to assess the capability of the Unicorn environment. We will use Unicorn to model a new area, the lakes in the Talvivaara region. Another pilot site will be Lake Kallavesi, which is used as the raw water source for making artificial groundwater for the City of Kuopio. We have previously shown that even a modestly calibrated lake model combined with a simple data-assimilation scheme clearly improves prediction capability compared to using the model or the data alone (Mano et al. 2015). The Talvivaara mine complex faces great challenges concerning its water balance, and accidental releases of wastewater are possible, as has happened in the past.

Furthermore, we are able to use the long-term monitoring data for both chemical and biological scenarios based on the hydrodynamic models. This will enable us to respond swiftly to diverse environmental challenges (chemical fate models), and it will also work as a tool for directing environmental measures (e.g. prioritization of restorations).

Tasks:

Demonstration and applicability tests of Unicorn in assessing the environmental state.
Transfer of environmental modeling know-how. Analyzing the existing model applications in order to transfer the modeling know-how in the form of automated tools and model applications to Unicorn.
Demonstration cases. Produce a model application in the CONPAT region and two other regions.
Evaluation of the case studies. Comparison of the results produced within Unicorn with existing models and data.
Optimal utilization of environmental data. To find the optimal way for the models to utilize environmental data provided by Unicorn from different sources. To identify and solve issues arising from differing spatial and temporal resolutions of data and models.
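
One common issue behind the last task, differing temporal resolutions of data and models, can be illustrated with a small sketch. It assumes a model that runs on a daily time step and observations taken at irregular times; the function name `daily_means` and the readings are hypothetical, and averaging is only one of several possible aggregation choices.

```python
from collections import defaultdict
from datetime import datetime


def daily_means(observations):
    """Average (timestamp, value) pairs into one value per calendar day."""
    buckets = defaultdict(list)
    for timestamp, value in observations:
        buckets[timestamp.date()].append(value)
    return {day: sum(vals) / len(vals) for day, vals in sorted(buckets.items())}


# Hypothetical water temperature readings (deg C) at irregular times.
readings = [
    (datetime(2015, 6, 1, 6, 0), 14.0),
    (datetime(2015, 6, 1, 18, 0), 16.0),
    (datetime(2015, 6, 2, 12, 0), 15.5),
]

for day, mean in daily_means(readings).items():
    print(day, mean)
```

Days without any reading simply do not appear in the result, so a model setup would still need an explicit rule for such gaps.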

Deliverables:

Model realizations. Using Unicorn to build the case studies.
Simulation results.
Scientific publications related to manual and automatic modeling results.

WP5 Unicorn pilot case: Demonstration related to human health

Responsible group leader: Adjunct Prof. Jouni Tuomisto (THL)

Research Group: Jukka Jokinen/THL, Arja Asikainen/THL, Tarja Pitkänen/THL, Sari Ung-Lanki/THL, Mikko Virtanen/THL, Saija Koljonen/SYKE.

Schedule: M1-M36, M37-M72, (91+44 person-months)

Linking to: WP3, WP4, WP6

Description: The aim of this work package is to demonstrate and evaluate the functionality, accuracy and usability of models built within the Unicorn environment in pilot cases related to environment and health. We will also develop and implement practical end-user interfaces for citizens and municipalities. We have chosen three cases, in which we will combine environmental exposure data and citizen health data into a total burden of disease model that can be used as a basis for further assessments and tools.

Methylmercury is a persistent environmental pollutant and neurotoxin that originates from both natural and industrial sources and contaminates fish in lakes. Spatial differences are large, and therefore customized recommendations are valuable for health protection authorities and for people eating fish. In Unicorn, we will open the large methylmercury dataset measured by SYKE during the last decades and develop an open online model for fishing recommendations. Economic impacts will also be explored, as the results may affect the reputation of some summer cottage lakes. Thus, this work will be done in close collaboration between THL, SYKE, VATT, local authorities, and stakeholder groups.

Indoor environment quality is an important health issue, as a large fraction of Finnish people suffer from indoor problems. This is also a major economic issue, as exemplified by the 30-50 billion euro "renovation debt" in the Finnish housing stock due to moisture damage. In schools the renovation debt is 3.7 billion euro, and this calls for action in municipalities. THL will produce an online indoor air questionnaire for schools and day-care centres, including questions about students’ health and indoor environment quality. Guidance for interpreting the results will be produced based on existing reference material. Data will be collected nationally with the online questionnaire into the Opasnet database. An automated reporting system for individual schools and municipalities will be developed. The health protection authorities regularly monitor indoor environment quality in schools. Unicorn will produce an online data collection interface that can be used to combine monitoring and questionnaire data, and possibly the air pollution data measured by the municipality, for school-level decision support. This requires collaboration between THL and municipalities, and with VATT for economic evaluation.

Drinking water causes 4-5 waterborne outbreaks annually (according to the national outbreak notification and reporting system RYMY), 20-30 cases of drinking water quality deterioration, and sporadic waterborne illnesses caused by Campylobacter, Giardia, and Legionella. Municipal waterworks make more than 100 000 chemical and microbiological analyses of drinking water annually, but the data are underused both in municipalities and nationally. The management of these data is being actively developed in the national YHTI database (containing municipality data on health protection). Unicorn will take those data as a part of its modelling system using ReplicaX (see WP7) and analyse them against health data available at THL. The aim is to increase awareness of, and capabilities in, statistical analysis and to offer decision support to municipalities on preventive management. This is a close collaboration between THL, Valvira (National Supervisory Authority for Welfare and Health), municipalities, JYU and VATT (for economic impacts in WP6).

Tasks:

To open the large methylmercury dataset measured by SYKE
To develop an open online model for fishing recommendations
To produce an online indoor air questionnaire
To increase awareness and capabilities of statistical analysis possibilities and offer decision support to municipalities about preventive management

Deliverables:

Policy assessment and open model about methylmercury (M12)
Online indoor air questionnaire and database for schools (M30)
An open platform for making online health assessments (M36)
Drinking water database and open model for waterworks (M40)
An open total burden of disease model based on the open platform (M60)

WP6 Unicorn pilot case: Demonstration related to national and regional economy

Responsible group leader: Research Director Juha Honkatukia (VATT)

Research Group: M.Sc. Antti Simola, N.N.

Schedule: M1-M36, M37-M72

Linking to: WP3, WP4, WP5

Description: We will demonstrate the applicability of Unicorn in assessing the linkages between the environmental state and the regional economy in the CONPAT project region and two other regions. We will use existing VATT models and possibly yet-to-be-specified open source statistical and CGE models in the Unicorn environment. VATT models rely on a detailed database that is unfortunately not open. An open but less detailed version of the data will be applied in open source models, when suitable, in order to make CGE techniques more available and open to decision makers.

As background to these studies, the development of the VATT models started in the 1990s and has aimed at wide applicability in decision-making. In particular, the growing demand for quantitative policy analysis has ensured that the VATT models have fulfilled this aim. Policy issues that affect several sectors or have opposing impacts are very often analytically intractable, leaving computational analysis as the principal way to analyze them.

The VATT models are computable general equilibrium (CGE) models of the Finnish economy. The single-country model VATTAGE (Honkatukia 2013) is based on the MONASH model (Dixon and Rimmer 2001), and the regional model VERM also draws on the TERM (Horridge et al. 2005) and MMRF (Adams et al. 2003) models. These reference models were developed by the Centre of Policy Studies at Victoria University, Australia, and are widely applied internationally. Model development at VATT has concentrated especially on special characteristics of the Finnish economy, such as public sector functions and the population age structure, which are depicted in detail.

One attractive feature of CGE models is that they conform to the national accounts system. Thus the model results are interpretable in that context, and the effects can be expressed as changes in economic indicators. Furthermore, the underlying input-output structure provides a consistent way to extend the economic analysis to material flow accounting. Consequently, one of the main application areas has been interdisciplinary research with various collaborators. For instance, VERM applications include extreme weather events (Virta et al. 2011), regional wood supply (Honkatukia and Simola 2011), regionalization (Honkatukia 2013) and energy efficiency improvements (Airaksinen et al. 2015). In the CONPAT consortium, VERM is applied to the socio-economic analysis of water-related contaminants and pathogens.
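
How an input-output structure links economic results to material flows can be illustrated with a two-sector Leontief quantity model. This is a textbook sketch and not the VATT model structure: the coefficients, the final demand and the material intensities are invented, and the function names are hypothetical. Total outputs x solve x = A x + d, and material use follows by multiplying each sector's output by its material intensity.

```python
def solve_outputs(a, demand):
    """Solve x = A x + d for a two-sector economy (Leontief quantity model)."""
    (a11, a12), (a21, a22) = a
    d1, d2 = demand
    det = (1 - a11) * (1 - a22) - a12 * a21
    if det == 0:
        raise ValueError("input-output system is singular")
    x1 = ((1 - a22) * d1 + a12 * d2) / det
    x2 = (a21 * d1 + (1 - a11) * d2) / det
    return x1, x2


def material_use(outputs, intensities):
    """Total material use implied by sector outputs and per-unit intensities."""
    return sum(x * m for x, m in zip(outputs, intensities))


# Hypothetical coefficients: a[i][j] is the input from sector i
# needed per unit of output of sector j.
a = ((0.2, 0.3), (0.1, 0.4))
final_demand = (100.0, 50.0)
water_intensity = (2.0, 0.5)  # assumed material use per unit of output

outputs = solve_outputs(a, final_demand)
print([round(x, 2) for x in outputs])
print(round(material_use(outputs, water_intensity), 2))
```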

This experience in interdisciplinary research is a good starting point for the more general approach of an automatically generated modeling tool. Aside from equilibrium modeling, the VATT researchers also have experience in the econometric methods that are frequently used in model parameterization.

Private sector use of open data is already extensive. The public sector lags behind, mainly because it faces more complex problems: the required information does not concern the market segments of a single commodity but the whole mix of industries in the economy, its demography, long-term investments in infrastructure and land use planning, and policies countering externalities and distributional issues. National accounts systems were created in order to convey consistent information on national economies. They serve as a natural starting point for CGE models and for organizing open data that would benefit public decision-making.

Drinking water management.

We examine the management of waterborne outbreaks and its links to the regional economy in the CONPAT study area, in collaboration with SYKE (WP4) and THL (WP5). The economic effects include direct effects on labor productivity and the regional trade balance, and indirect effects on consumer behavior. The former derive straightforwardly from production theory, and the latter from prospect theory. CGE modeling is a consistent way of evaluating both direct and indirect effects simultaneously.

Regional economic consequences of Talvivaara mining operations.

For this task we construct a simple regional CGE model that can feasibly be solved a multitude of times in a Monte Carlo manner in order to account for uncertainty in economic and environmental outcomes. We use VERM or an open source alternative. With this ex post analysis we can demonstrate how an equivalent ex ante analysis would contribute to decision making by providing more balanced information on risks and unwanted consequences. The theoretical focus is on the political economy of regional development. This is a close collaboration with SYKE (WP4) and local authorities.
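
The Monte Carlo idea can be sketched with a deliberately trivial stand-in for the model solve. Nothing below is a CGE model: `toy_regional_model`, the input distributions, the assumed accident probability and all monetary figures are hypothetical and serve only to show the loop of drawing uncertain inputs, solving, and summarizing the distribution of outcomes.

```python
import random
import statistics


def toy_regional_model(demand_shock, damage_cost):
    """Hypothetical stand-in for one model solve: change in regional output."""
    baseline_gain = 100.0  # illustrative gain from mining activity
    return baseline_gain * (1.0 + demand_shock) - damage_cost


def monte_carlo(n_draws, seed=1):
    """Solve the toy model repeatedly with randomly drawn uncertain inputs."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_draws):
        demand_shock = rng.gauss(0.0, 0.2)   # uncertain economic outcome
        accident = rng.random() < 0.1        # assumed 10 % accident probability
        damage_cost = rng.uniform(50.0, 300.0) if accident else 0.0
        results.append(toy_regional_model(demand_shock, damage_cost))
    return results


outcomes = sorted(monte_carlo(10_000))
print("mean:", round(statistics.mean(outcomes), 1))
print("5th percentile:", round(outcomes[len(outcomes) // 20], 1))
print("share of negative outcomes:", sum(o < 0 for o in outcomes) / len(outcomes))
```

Reporting a lower percentile and the share of negative outcomes alongside the mean is what makes the risk side visible to a decision maker; a single deterministic run would show only one number.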

School renovations.

We use VATTAGE to assess the short- and long-term consequences of neglecting the renovation of schools with indoor problems. With correct timing, renovation investments could serve as a stimulus. They also have potential long-term productivity effects that are not optimized by markets alone. Thus our approach yields valuable information for public decision makers. In the short-run analysis we assess the stimulatory effects of renovations. In the long-run analysis we apply the recent demographic extension of VATTAGE to evaluate the long-term economic costs of shortsighted decision making. This is a close collaboration with THL (WP5).

Tasks:

Evaluation of open source CGE models
Management of waterborne outbreaks in the CONPAT region
Talvivaara case study – regional economic interests vs. environmental risks
Long and short term effects of renovation debt – what is good policy?

Deliverables:

Unicorn model realizations in case studies
Simulation results
Scientific articles

WP7 Statistical methods, models and big data

Responsible group leader: Prof. Juha Karvanen (Dep. of Math. and Stat., Univ. of Jyväskylä)

Research Group: Jouni Helske, N.N.

Schedule: M1-M72 (124 person-months)

Linking to: WP4, WP5, WP6

Description: Governmental institutes should publish their data as open data whenever this is not prohibited by confidentiality requirements. In practice, many datasets collected e.g. by THL contain sensitive information such as individual-level health data. Naive anonymization, i.e. removal of names, addresses and personal identity numbers, is not sufficient to make the data publishable, because it is often possible to deduce identity from multivariate data using e.g. age, place of residence, profession, language or medical history.
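
The re-identification risk described above can be made concrete by counting how many records share each combination of indirectly identifying variables. The sketch below uses invented records and hypothetical function names. The smallest group size it reports corresponds to the k of the k-anonymity concept, a term the proposal itself does not use; a value of 1 means at least one person is unique on these variables alone.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


def unique_records(records, quasi_identifiers):
    """Records whose quasi-identifier combination occurs only once."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return [r for r in records
            if groups[tuple(r[q] for q in quasi_identifiers)] == 1]


# Hypothetical records with names and identity numbers already removed.
records = [
    {"age": 34, "municipality": "Kuopio", "profession": "nurse"},
    {"age": 34, "municipality": "Kuopio", "profession": "nurse"},
    {"age": 61, "municipality": "Tampere", "profession": "pilot"},
]
keys = ["age", "municipality", "profession"]

print(smallest_group_size(records, keys))
print(unique_records(records, keys))
```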

Synthetic data is offered as a solution to this dilemma between openness and confidentiality. Synthetic data, or a data replica, is created by simulation so that the statistical properties of the replica closely resemble those of the original data. The individuals in the original data cannot be identified from the replica, and therefore the replica can be published as open data.
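
A minimal sketch of the replica idea follows. It does not represent how ReplicaX works, which is not described here; it assumes the simplest possible case of two numeric variables modelled as bivariate normal, with hypothetical function names and invented data. The model is fitted to the original data, and the replica is then simulated from the fitted model, so that no replica row corresponds to a real individual while means, spreads and the correlation are approximately preserved.

```python
import math
import random
import statistics


def fit_bivariate_normal(pairs):
    """Estimate means, standard deviations and correlation from (x, y) pairs."""
    xs, ys = zip(*pairs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / (len(pairs) - 1)
    rho = max(-1.0, min(1.0, cov / (sd_x * sd_y)))  # guard against rounding
    return mean_x, mean_y, sd_x, sd_y, rho


def simulate_replica(params, n, seed=1):
    """Draw n synthetic (x, y) pairs from the fitted bivariate normal model."""
    mean_x, mean_y, sd_x, sd_y, rho = params
    rng = random.Random(seed)
    replica = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mean_x + sd_x * z1
        y = mean_y + sd_y * (rho * z1 + math.sqrt(1 - rho ** 2) * z2)
        replica.append((x, y))
    return replica


# Hypothetical "original" data: (age, systolic blood pressure).
original = [(35, 118), (42, 125), (51, 131), (58, 140), (63, 138), (70, 149)]
params = fit_bivariate_normal(original)
replica = simulate_replica(params, n=1000)

print("original correlation:", round(params[4], 2))
print("replica correlation:", round(fit_bivariate_normal(replica)[4], 2))
```

Real datasets contain categorical variables, many more columns and non-normal distributions, so a production method needs a richer model and an explicit check of disclosure risk.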

Synthetic data offers new possibilities for citizen science. Program code developed for the replica can be applied to the original data without any changes. The publisher of the data can therefore easily verify the analysis results with the original data. This enables an operating model where some parts of the data analysis are carried out by enthusiastic citizens (e.g. university students) while the employees of the governmental institute coordinate the work.

The concept has already been piloted: ReplicaX, an R implementation by Juha Karvanen, won the challenge “Utilization of health data” in the Apps4Finland 2013 competition. As a part of the project, ReplicaX will be developed further, tested extensively with real data and put into full-scale production use.

In order to combine multiple databases and models efficiently, state-of-the-art statistical methods are needed. As databases contain data with varying levels of uncertainty (stemming from data collection strategies, modeling choices and sampling variation, among others), different sources of information must be weighted accordingly. For assessing these uncertainties, a Bayesian framework can be used to combine expert opinions, multiple data sources and models in a way that gives easily interpretable results in the form of probability distributions. This enables decision makers to make sophisticated forecasts under alternative scenarios.
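
How a Bayesian framework weights sources by their uncertainty can be shown with a small grid approximation. The sketch assumes a single unknown quantity, an expert opinion expressed as a normal prior, and two data sources with normally distributed errors of different size; the function names and all numbers are hypothetical. The result is a probability distribution over the grid, from which a mean or an interval can be read.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of the normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def grid_posterior(grid, prior, measurements):
    """Combine a prior with measurements of differing accuracy on a grid.

    measurements: list of (value, sd) pairs, one per data source.
    Returns posterior probabilities that sum to one.
    """
    weights = []
    for theta, p in zip(grid, prior):
        likelihood = 1.0
        for value, sd in measurements:
            likelihood *= normal_pdf(value, theta, sd)
        weights.append(p * likelihood)
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical example: a nutrient concentration on a coarse grid.
grid = [20 + 0.5 * i for i in range(41)]           # 20.0 ... 40.0
prior = [normal_pdf(t, 30.0, 5.0) for t in grid]   # expert opinion
sources = [(27.0, 1.0), (33.0, 3.0)]               # precise and noisy source
posterior = grid_posterior(grid, prior, sources)

mean = sum(t * p for t, p in zip(grid, posterior))
print(round(mean, 1))
```

The posterior mean lands much closer to the precise source than to the noisy one, which is the weighting behaviour the paragraph above calls for.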

A high proportion of the big data stored today is inherently time series data. When building generic models, taking the time dependency in the data into account is crucial in order to make proper inferences from the results. For example, in (Helske 2013), yearly nutrient fluxes of four Finnish rivers were estimated using a state space modeling approach that efficiently modeled both the cross-sectional and the time dependencies of the data.
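
The simplest state space model, the local level model, illustrates how time dependency and irregular sampling are handled together. The sketch below is a generic Kalman filter written for illustration, not the model of Helske (2013); the variances and the series are invented, and the function name is hypothetical. Missing observations are marked with `None`: the filter then skips the update step and only lets the uncertainty grow.

```python
def local_level_filter(observations, obs_var, level_var,
                       init_mean=0.0, init_var=1e6):
    """Kalman filter for the local level model; None marks a missing value.

    y_t       = level_t + noise (variance obs_var)
    level_t+1 = level_t + noise (variance level_var)
    Returns the filtered level estimates and their variances.
    """
    mean, var = init_mean, init_var
    means, variances = [], []
    for y in observations:
        if y is not None:
            gain = var / (var + obs_var)
            mean = mean + gain * (y - mean)
            var = (1.0 - gain) * var
        means.append(mean)
        variances.append(var)
        var = var + level_var  # prediction step: uncertainty grows over time
    return means, variances


# Hypothetical yearly series with two missing years.
series = [10.2, 10.8, None, None, 12.1, 11.7]
levels, variances = local_level_filter(series, obs_var=0.5, level_var=0.1)

for y, m, v in zip(series, levels, variances):
    print(y, round(m, 2), round(v, 2))
```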

Analyzing data with complex time and cross-sectional dependencies and varying sampling frequencies requires flexible models that are robust yet still give meaningful results in realistic computational time. General purpose Bayesian modeling software such as OpenBUGS often requires considerable tuning of the estimation procedures, which are not well suited to time series data because of the autocorrelation structures of the simulations used in model estimation. Therefore it is important to build reliable and easy-to-use tools for analyzing various types of data from open databases. Similar but more restricted methods for forecasting univariate time series in the frequentist framework were presented in (Hyndman 2008) and (Hyndman 2015), and in broader scope in (Durbin and Koopman 2012) and (Helske 2015).

Without proper software, analysts are forced to use their old methods whether or not they are suitable for the problem. The aim is to build an efficient and robust Bayesian modeling framework for the open source software R that can be used to model multivariate time series data with complex patterns and varying sampling frequencies, taking into account multiple sources of information and the uncertainties related to data and model structure.

Tasks:

Development of ReplicaX
Extensive testing of ReplicaX in real use cases
Deployment of ReplicaX for production usage and integration with Unicorn
Development of BayesTime
Testing BayesTime in pilot cases and with synthetic data from ReplicaX
Integration of BayesTime with Unicorn
Collaboration with work packages 4-6 by providing statistical support for pilot cases

Deliverables:

Open source program code of ReplicaX
Scientific articles on ReplicaX and its performance with real datasets
Integrated production version of ReplicaX
Open source code for BayesTime
Scientific articles on theory and usage of BayesTime with datasets related to Unicorn

Budget

The annual costs are presented in Table 1. At this point the costs are not presented on an institutional basis.

Table 1. Estimated UNICORN-project budget based on Finnish unit costs.

Cost item	Year 1	Year 2	Year 3	Year 4	Year 5	Year 6	Year 7	Total costs (euro)
Working time (person-months; total in euro)	39	144	137	136	110	102	49	6976590
Travel (euro)	20000	38000	34000	38000	25000	35000	14000	204000
Material (euro)	2000	4000	3500	500	500	500	0	11000
Machines (euro)	25000	22000	20000	20000	17000	17000	17000	138000
Services (euro)	40000	37000	47000	54000	42000	32000	31000	283000
Other costs (euro)	10500	63000	15000	15500	13000	15000	13000	145000
Total (euro; yearly values exclude working time)	97500	164000	119500	128000	97500	99500	75000	7757590
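
The arithmetic of Table 1 can be checked with a short script. The figures are copied from the table; the variable names are illustrative. The check shows that each yearly value in the Total row is the sum of the five non-salary rows for that year, and that the grand total is that sum plus the working-time total.

```python
# Non-salary cost rows from Table 1 (euro per project year).
costs = {
    "Travel": [20000, 38000, 34000, 38000, 25000, 35000, 14000],
    "Material": [2000, 4000, 3500, 500, 500, 500, 0],
    "Machines": [25000, 22000, 20000, 20000, 17000, 17000, 17000],
    "Services": [40000, 37000, 47000, 54000, 42000, 32000, 31000],
    "Other costs": [10500, 63000, 15000, 15500, 13000, 15000, 13000],
}
working_time_total = 6976590  # euro, "Total costs" column of the working-time row
person_months = [39, 144, 137, 136, 110, 102, 49]

# Sum the cost rows column by column to get one total per project year.
yearly_totals = [sum(year) for year in zip(*costs.values())]
non_salary_total = sum(yearly_totals)
grand_total = non_salary_total + working_time_total

print(yearly_totals)
print(non_salary_total, grand_total)
print(sum(person_months))
```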

Timeline

Main actions during the project years

The main actions of the project during the project years are as follows. A more detailed time chart on the task level is presented in Fig. 2.

Figure 2. Timeline of the tasks in the UNICORN project.

Year 1: Identification of different groups of stakeholders; Inquiries and questionnaires; Negotiations within the consortium; Invitations to the kick-off seminar; The large kick-off seminar in Tampere, December.

Year 2: Meetings concerning specific themes of the project; Inquiries and questionnaires; Clarifying the role of the different stakeholder groups; Information about the first results of the project to stakeholders; Annual seminar; Circular to the stakeholders including the main contents of the annual progress report.

Year 3: Meetings concerning specific themes of the project; Clarifying the role of the different stakeholder groups; Information about the results of the project to stakeholders; Annual seminar; Practical demonstrations; Circular to the stakeholders including the main contents of the annual progress report.

Year 4: Meetings concerning specific themes of the project; Information about the results of the project to stakeholders; Annual seminar; Practical demonstrations; Circular to the stakeholders including the main contents of the interim report.

Year 5: Revising the role of the different stakeholders; Meetings concerning specific themes of the project; Information about the results of the project to stakeholders; Annual seminar; Practical demonstrations.

Year 6: Meetings concerning specific themes of the project; Information about the final results of the project to stakeholders; Practical demonstrations.

Year 7: Final seminar; Publications.

Letters of commitment

Acknowledgements

Funding program

Project

Call

The proposal was submitted in 2015 to the Strategic Funds of the Academy of Finland. It was rejected as too ambitious and as having low commercial potential. We strongly believe that the proposed Unicorn environment and the growing community of its developers can have abundant commercial success. The authors are open to any further funding suggestions and to forming new consortia.

Hosting institution

University of Jyväskylä

Ethics and security

Author contributions

Conflicts of interest

References

Adams P, Horridge M, Wittwer G (2003) MMRF-Green: a dynamic multi-regional applied general equilibrium model of the Australian economy, based on the MMR and MONASH models. Centre of Policy Studies Working Papers 140.

Airaksinen M, Honkatukia J, Simola A, Vainio T (2015) Economic impacts of the Energy Efficiency Directive - regional CGE approach. 18th Annual Conference on Global Economic Analysis.

Dixon PB, Rimmer MT (2001) Dynamic General Equilibrium Modelling for Forecasting and Policy: A Practical Guide and Documentation of MONASH. Contributions to Economic Analysis. DOI: 10.1108/s0573-8555(2001)256

Durbin J, Koopman SJ (2012) Time Series Analysis by State Space Methods. Oxford University Press. DOI: 10.1093/acprof:oso/9780199641178.001.0001

Granlund K, Rankinen K, Lepistö A (2004) Testing the INCA model in a small agricultural catchment in southern Finland. Hydrology and Earth System Sciences 8 (4): 717-728. DOI: 10.5194/hess-8-717-2004

Helske J (2013) Estimating aggregated nutrient fluxes in four Finnish rivers via Gaussian state space models. Environmetrics 24 (4): 237-247. DOI: 10.1002/env.2204

Helske J (2015) Kalman Filter and Smoother for Exponential Family State Space Models. R package version 1.1.1. URL: http://cran.r-project.org/web/packages/KFAS/index.html

Honkatukia J (2013) The VATTAGE Regional Model VERM - A Dynamic, Regional, Applied General Equilibrium Model of the Finnish Economy. VATT Research Reports 171: 1-277. DOI: 10.2139/ssrn.2284665

Honkatukia J, Simola A (2011) Selvitys Suomen nykyisestä ja tulevasta puunkäytöstä [A report on current and future wood use in Finland]. VATT Research Reports.

Horridge M, Madden J, Wittwer G (2005) The impact of the 2002–2003 drought on Australia. Journal of Policy Modeling 27 (3): 285-308. DOI: 10.1016/j.jpolmod.2005.01.008

Huttunen I, Huttunen M, Piirainen V, Korppoo M, Lepistö A, Räike A, Tattari S, Vehviläinen B (2015) A National-Scale Nutrient Loading Model for Finnish Watersheds – VEMALA. Environmental Modeling & Assessment 21 (1): 83-109. DOI: 10.1007/s10666-015-9470-6

Hyndman RK (2008) Forecasting with exponential smoothing. The state space approach. Springer.

Hyndman R (2015) forecast: Forecasting functions for time series and linear models. R package version 5.9. URL: http://CRAN.R-project.org/package=forecast

Kotamäki N, Pätynen A, Taskinen A, Huttula T, Malve O (2015) Statistical Dimensioning of Nutrient Loading Reduction: LLR Assessment Tool for Lake Managers. Environmental Management 56 (2): 480-491. DOI: 10.1007/s00267-015-0514-0

Luyten P (2011) COHERENS – A coupled Hydrodynamical-Ecological Model for Regional and Shelf Seas: User Documentation. Version 2.0. Royal Belgian Institute of Natural Sciences.

Mano A, Malve O, Koponen S, Kallio K, Taskinen A, Ropponen J, Juntunen J, Liukko N (2015) Assimilation of satellite data to 3D hydrodynamic model of Lake Säkylän Pyhäjärvi. Water Science & Technology 71 (7): 1033. DOI: 10.2166/wst.2015.042

Pohjola MP (2011) Pragmatic knowledge services. Journal of Universal Computer Science 17: 472-497.

Reichert P, Langhans S, Lienert J, Schuwirth N (2015) The conceptual foundation of environmental decision support. Journal of Environmental Management 154: 316-332. DOI: 10.1016/j.jenvman.2015.01.053

Ropponen J, Huttula T (2014) Transport model application in River Vuoksi. In Suito H. (ed.): Project report for cooperative program between Japan and Finland; Assessment Tools for solving Aquatic Problems. Project Report Okayama University: 58-68. URL: http://ousar.lib.okayma-u.ac.jp/metadata/52687

Tattari S, Koskiaho J, Bärlund I (2009) Testing a river basin model with sensitivity analysis and autocalibration for an agricultural catchment in SW Finland. Agricultural and Food Science 18 (3-4).

Virta H, Rosqvist T, Simola A, Perrels A, Molarius R, Luomaranta A (2011) Ilmastonmuutoksen ääri-ilmiöihin liittyvän riskienhallinnan kustannus-hyötyanalyysi osana julkista päätöksentekoa - IRTORISKI-hankkeen loppuraportti [Cost-benefit analysis of risk management related to extreme climate change phenomena as part of public decision-making - final report of the IRTORISKI project]. Ilmatieteen laitoksen raportteja 3: 1-97.
