Research Ideas and Outcomes : Grant Proposal
Corresponding author: Timo Huttula (timo.huttula@ymparisto.fi)
Received: 16 May 2016 | Published: 16 May 2016
© 2016 Pekka Neittaanmäki, Timo Huttula, Juha Karvanen, Tom Frisk, Jouni Tuomisto, Antti Simola, Tero Tuovinen, Janne Ropponen.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation: Neittaanmäki P, Huttula T, Karvanen J, Frisk T, Tuomisto J, Simola A, Tuovinen T, Ropponen J (2016) Unicorn–Open science for assessing environmental state, human health and regional economy. Research Ideas and Outcomes 2: e9232. doi: 10.3897/rio.2.e9232
Open data and models are becoming increasingly available, but there are not yet good methods and platforms to turn those into systematic evidence-based decision support. Unicorn will produce such an environment based on existing theoretical and practical knowledge about decision support and models. This consortium possesses the necessary models, data, and skills to set up an environment and demonstrate its functionality and usefulness with several case studies related to the environmental issues, human health, and economy. The Unicorn environment will be built in a generic and systematic way so that it could even become an international standard for evidence-based decision support.
Developing a technical environment or standard is not enough. Using the Unicorn environment is a large cultural change for both researchers and decision makers, as the current decision support practices do not reflect the principles of openness, criticism, or reuse. Therefore, this cultural change must be promoted by training to use the environment, by informing the society about its possibilities, and solving a number of practical and technical problems related to current practices in research institutes, ministries, and municipalities. We acknowledge these problems and offer solutions to them with an extensive interaction plan.
Open data, open models, environment, human health, regional economy
Consortium leader (PI):
Group leaders:
Associated partners:
This research project will be a game-changer in political and governmental decision making by enabling the use of scientific information, which is accessed and analyzed in a unique framework containing multimodal data from different sources and most novel data analysis tools to interpret it. The aim of this project is to provide efficient tools that will work for both decision-making processes (
A large amount of good-quality scientific data is now openly available in national and international databases (avoindata.fi, www.data.gov, thegovlab.org). There is also a clear societal drive to provide and use open-source models, to practice evidence-based decision making, and to support the open use and development of both.
A major problem is that there is no single physical or virtual place where all this data can be combined, preprocessed, analyzed and visualized easily by researchers for decision makers. The data preprocessing routines are laborious, and it is typical that every researcher and software provider develops their own unique routines for this to facilitate their own work. However, the most important societal needs require collaborative multidisciplinary attempts to solve relevant problems or questions. The processes must be reliable, reproducible, and transparent to support these studies effectively and efficiently. Shared practices, tools, data, working environments and concerted actions are the way forward to improve science and decision support. This is true in all areas, but in this project we will start from environment, human health, and the regional economy, as they are complex and challenging enough to offer a good test bed for general development.
It is not enough that experts push data to politicians. There must be practices for mutual communication: experts must answer policy questions in a defendable and useful way; decision makers must more clearly explain their views using evidence; and there must be ICT tools to support this exchange. The focus is on end-users. This consortium has already developed and tested prototypes of such practices and tools in several projects, and is now ready to apply them in the society on a large scale.
The partners of Unicorn are highly competent in their respective fields in handling data at every stage of a decision process and in producing useful, timely, and accurate information for the decision makers. The results of such work have to be understandable and traceable back to original data: no black box solutions. Unicorn can provide that and also critical comments and recommendations during the work.
The consortium relies on the latest methods and tools from statistics, data processing and simulation. Some researchers will focus on the methodology of the Unicorn environment. Mathematical and statistical methods will be tailored and tested inside the environment. Methods for processing information and big data will be considered from theoretical and practical points of view. Skilled people will construct the interfaces between e.g. databases, preprocessing tools, mathematical models, and post-processing tools. The utility of the environment will be demonstrated with case studies. The environment will not be a single product in the traditional sense, but a systematic and coherent set of tools and practices, useful for everyone in different ways.
In this project, we have chosen open policy practice as the basis for the work. Open policy practice is an existing method for decision support (
The Unicorn environment builds upon existing projects, models and case studies at the national and international level, complementing and improving on the knowledge generated by them. Most of these projects involve several members of this consortium, which will make it easier to establish interactions between projects. For example, Opasnet, the web workspace for performing open assessments and facilitating open policy practice, will be used. The present project also has a close connection with the ongoing CONPAT project funded by the Academy of Finland AKVA research program, in which investigations of the health effects and economic implications of microbial and chemical contaminants in the Kokemäenjoki River watershed downstream from the City of Tampere are already underway. Other existing tools will also be utilized in this project, such as www.jarviwiki.fi and www.vesinetti.fi.
In summary, the proposed project produces a new, open virtual work and modeling environment that combines open information from multiple databases and builds up tools for efficient policy studies. This will improve decision-making processes in Finland and other countries and greatly streamline the workflow in multidisciplinary projects. Previously, similar projects have been done in more closed and less reusable settings. Georeferenced data in particular will be utilized more efficiently in several case studies.
Open data and models are a megatrend and will change the world. Unicorn directs this trend to paths that are the most beneficial for societal decision making by providing quick, reliable and efficient decision support. This improves the Finnish political and hence economic infrastructure. Improved data collection, analyses and modeling will save significant resources. The quality and number of assessments that can be done to support work in e.g. municipalities will also increase significantly.
The objectives of the project are:
To modernize the working practices of researchers and decision makers e.g. by improving their use of existing data and models.
To test, implement, and demonstrate large, multidisciplinary studies utilizing novel, efficient practices and environments.
We hypothesize that the major challenges related to evidence-based decision making are actually about changing the practices of researchers and decision makers. However, the change is slow and discussion feeble until there are practical tools (such as the Unicorn environment) to support and demonstrate the new practices.
Scientific breakthroughs and progress seen
Unicorn has top modelers and researchers from several different fields of research. In addition, they already possess unique expertise and functionalities that can be directly used for developing the Unicorn environment. In this very consortium, there is a high potential for synergism by combining different data sources, models and platforms, and decision support practices.
The environment will be adaptable to other disciplines, methods will be flexible, and assessments can be checked and re-used anywhere in the world, thus increasing reliability and penetration of knowledge. It has a potential of becoming a standard for decision support expert work. The government research institutes such as SYKE, THL and VATT have a specific interest in long-term maintenance of an environment that supports their main objective. For publication plan, see the Interaction plan.
Exporting of knowledge of best practices
Finland is famous for producing many crucial information-related products that are now used worldwide as de facto industry standards: Linux operating system, SSH encryption protocol, MySQL database, SMS text message. All these breakthroughs were developed in Finland and distributed freely for anyone to use. These products have changed the global market of their respective fields. They have been game changers that forced everyone to rethink what an operating system or database means. They have also created a rich ecosystem of service providers on top of the key innovation, rather than producing direct sales income.
Our aim is similar. The revenue comes from selling expertise or knowledge services enabled by the Unicorn environment or its derivatives. Free distribution of the main innovation is actually crucial in spreading the idea to worldwide use and in creating global demand for the product. If the Finnish private sector is ready, it will get the first shares of the new market. However, it should be remembered that the societal benefits of implementing open knowledge practices are larger than the incoming cash flow stimulated by them and materialize as quicker and more justifiable societal actions.
Introduction
We will develop the Unicorn environment using flexible exploratory methodology and design science. During the 6 years of the project, we need at least two development stages. The architecture must be modular to provide flexibility with interface generations. The first stage focuses on reusability of components, while the second focuses on efficiency.
An open map service will be a key functionality for using location-based information and showing results on a map. The project will complement existing data with focused experiments. A geographical pilot will be the waters downstream of Tampere, where we focus on water quality, accidental releases, their health effects and risk management, and the regional economy and air quality. The background studies have been made in the CONPAT project funded by the Academy of Finland.
Other pilots are in the Kuopio region in Northern Savo, where we look at the exploitation of natural resources and its unforeseen risks, and in the Kainuu region near the Talvivaara site, where SYKE and JU have existing aquatic projects (e.g. MINEVIEW). VATT has produced long-run structural economic projections for each NUTS 3 level region of Finland. (The Nomenclature of Territorial Units for Statistics, NUTS, is the EU standard for referencing the subdivisions of countries for statistical purposes; there are currently 19 NUTS 3 level regions in Finland, the maakunnat.) The work is implemented with regional computable general equilibrium (CGE) modeling techniques, and it uses dialogue with regional experts to exploit their tacit knowledge, e.g. about working population trends.
Research materials and their management plan
Material management during and after the project is a core issue of the project. The Unicorn environment has to provide robust and efficient queries across many databases, while respecting the restrictions and rules that apply to the available data. At the same time, all studies should be reproducible. Moreover, even though the aim is an open approach, privacy settings and authentication are needed during the research phase. These difficulties will be tackled when the requirement specification is implemented.
We will utilize several existing databases and platforms. Opasnet workspace, designed for decision support, has open models on e.g. burden of disease, air pollution, and contaminants in food. These will be utilized and further developed in real policy cases (see e.g. WP5). Also other cases will be implemented; tentative topics include radon and dioxins where data is already available. An overall objective is to produce a holistic burden of disease model covering all major environmental and lifestyle factors in Finland; however, only a part of this will materialize within this project.
RYMY is a national water- and foodborne outbreak notification and reporting system maintained by National Food Safety Authority Evira and THL. Drinking water and bathing water monitoring data comprising about 120 000 data points per year will be collected via this reporting system tentatively beginning from 2016. We will assess quality of drinking water together with other relevant environment and health data. We will utilize surveillance and quality data for management actions and decision-making at different administrative levels. Data on environmental health exposure is made available through YHTI database. The VAHTI database is an emissions control and monitoring database of the Finnish Environmental Administration. We will use EU-INSPIRE Directive compliant nationwide open spatial data sets, such as Finnish Meteorological Institute’s open meteorological data, which are already being utilized by SYKE’s group participating in this project. Also National Land Survey’s relevant data products, like elevation data, are included, as well as SYKE’s lake depth data products.
The environmental databases at SYKE contain nationwide time series of hydrological observations, including surface waters and representative sites for groundwater. Similarly, SYKE’s Hertta system contains hydrochemical and hydrobiological data. The interfaces to this data warehouse are under construction and will be completed at the beginning of 2016. Another important data source maintained by SYKE is the operational WSFS-Vemala model (
Health impacts are like a pyramid: severe rare cases at the top and milder common symptoms at the bottom; severe cases are more likely to be recorded. At THL the surveillance pyramid has (i) deaths registered by Statistics Finland, (ii) hospitalizations registered by THL in the Hilmo register, (iii) microbiologically confirmed cases registered by THL in the National Infectious Disease Register (TTR), and (iv) visits to a doctor in primary health care, registered by THL in the AvoHilmo register.
VATT models are based on detailed national input-output tables (
In this project, we are developing the Unicorn work environment and practices as a more efficient way to process large amounts of varying data using the latest methods and tools. The aim is to utilize the current data and technology revolution to its full potential. Transparent decision-making processes and openness are a megatrend. Data and mathematical methods should be open and reusable by researchers and policy experts. This will be a key means for governments to discuss and construct trustworthy opinions.
The technology revolution of open data and open models will completely change the way we think about evidence-based decision making (
This process is much more complex than a typical collaborative writing task. Indeed, it is based on large assessments of the impacts of policy options and thorough expertise on the underlying questions, both scientific and value-based. Still, we have been able to identify critical rules and practices for such a process, and to develop tools to support the practices. We have also developed tools to collect estimates from experts for computational models. Although the initial development of models may be laborious, they are designed to be reusable within the environment in other, similar cases. As this practice becomes more prevalent, we expect that society will start demanding better evidence to back up a decision before it is socially accepted. In other words, it reduces the survival of poor policy initiatives.
Open data and models will change the world and Unicorn directs this trend to paths that are the most beneficial for societal decision making by providing quick, reliable and efficient decision support. The government research institutes spend hundreds of person-years in old-fashioned data analyses and modeling efforts in decision support. Significant saving of resources will be manifested with improved data collection, analyses and modeling. Also, the quality and amount of assessments that can be done to support work in e.g. municipalities is significantly increased.
The virtual environment will be initially set up with ecological, health, and economic aspects (including complex interactions and spatial data). However, the system will be applicable to other sectors and we expect expansion of use. The virtual environment has great potential in Finland and also export value: economically, we see a market for expert-based consulting and modeling in all societies utilizing technological breakthroughs. National databases make it possible to create an assessment and simulation tool that can be used for answering local questions, such as mercury in a local lake (see WP5). This would improve aquatic modeling and also the current general-level recommendations on fish intake, which may partly lead to unnecessary avoidance of fish in the diet or to anxiety.
Unicorn partners already have practical experience in virtual environments and online modeling. There are technical challenges, such as user and data exchange interfaces (see WP3 for our solutions). However, our experience has been that cultural and learning challenges are clearly larger: e.g. experts often do not accept the principles of open data and criticism by non-experts; there are worries about merit accumulation; and open programming languages are not familiar. Constant communication, positive examples, technical support, systematic small-scale testing and incremental improvements, and political support from the employer and research funders can overcome these challenges. The policy sector also has its challenges: unfamiliarity with assessments and scientific data, and lack of resources and time in single policy processes. The previously mentioned methods apply here, but in addition, experts have to listen carefully to the information needs of decision makers. See WP2 for our solutions. Public measures should support the process of change so that the transformation proceeds in a controlled manner and Finland ends up benefiting from the technology revolution.
The consortium will work closely with the governmental stakeholders in the development and utilization of the Unicorn environment and the open developer forum. The Pirkanmaa Centre for Economic Development, Transport and the Environment, a key actor in its region, will coordinate collaboration with stakeholders. There are also committed associate partners collaborating with the consortium. They include the ministries of environment (YM), social affairs and health (STM) and agriculture and forestry (MMM), the National Land Survey of Finland, the National Supervisory Authority for Welfare and Health (Valvira), the Council of Tampere Region, Open Knowledge Finland Ry and Hahmota Oy. We are also actively collaborating with Kuntaliitto (the association of municipalities) in this area. We will actively spread information and recruit new associated partners during the course of the project. We are especially interested in finding them from other geographical regions in Finland.
The Unicorn consortium will utilize the data resources generated during the project as part of the routine legal requirements of environmental monitoring and public health response. For example, local general practitioners in all municipalities of the country report citizen health information to the AvoHilmo database. In the near future, local health protection authorities or local laboratories will also submit their data to the YHTI environmental health monitoring database. The central government currently has a number of ongoing projects to develop the sharing of information resources. The project will support the coordination of this development and aims at effective use of resources across institutions. Promoting the use of models provides significant advantages in the production and utilization of information.
Science is powerful in rejecting ideas that are not consistent with observations. Therefore, science should be actively used to estimate what impacts the actions considered could or could not have. Ineffective actions are rejected; uncertain actions are tested in small scale, and then poor actions can be rejected. Thus, the role of experts is to reject poor ideas, and the role of decision makers is to choose among the remaining good ones.
Experts' know-how is exported as much as possible in the form of automated tools to enable more efficient use of time and resources which can be better targeted to make reliable estimates instead of mechanical data processing. People think that they have a right to be heard and their opinions and concerns to be acknowledged. Often conflicts in a society occur because some group thinks they were not heard and they cannot influence their own case; then they lose trust. Shared understanding is a systematic method that has been developed and implemented in THL. It offers a channel for citizens to be heard also in this project.
The project implementation is conducted in seven work packages (
WP1. State of art review and inventory of the resources available
Responsible group leader: Prof. Pekka Neittaanmäki (Dep. of Math. Inf. Tech. JYU)
Research Group: Ph.D. Tero Tuovinen, Ph.D. Annemari Soranto
Schedule: M1-M5, M37-M41, (5+5) Linking to: WP2
Description: In this work package we will identify the best practices of available open databases and models. Publicly available projects, codes, interfaces and modeling frameworks that can help in the realization of Unicorn will be listed and studied. These include domestic databases and modeling environments such as THL’s Opasnet, and international codes such as the Open Modeling Interface, OpenDA data assimilation tools, OpenEarth tools and the Earth System Modeling Framework. Several large-scale domestic ICT networks and projects, such as VALTORI and ENVIBASE, are ongoing. They will be identified and collaboration will be established when needed.
We are looking for the latest technological advances, especially in big data and data analysis, and will integrate them into a system that combines the latest knowledge of tools and data with easy-to-use user interfaces. The work begins by defining the user requirements for the system (Unicorn) and planning the implementation phase. The required human resources will be chosen once the requirements are defined, based on the knowledge that is needed. Particular attention will be paid to human interaction with the system, because the idea is to spread the solution to a large number of researchers and decision makers. An extensive mapping of open data sources will also be done. It will cover all the state research institutes in order to identify the state of the art.
Tasks:
Deliverables:
WP2 Stakeholder contributions and dissemination
Responsible group leader: Research Prof. Tom Frisk (ELY Centre Pirkanmaa)
Research Group: Prof. Pekka Neittaanmäki/JY, M.Sc. Ämer Bilaletdin/ELY, MSc Matti Saura/ELY, Ph.D. Tero Tuovinen, Ph.D. Annemari Soranto/JY, Saija Koljonen/SYKE, M.Sc. Antti Simola/VATT, Kati Valpe/JY
Schedule: M1-M72 (27+36) + (9+12)
Linking to: All workpackages and partners.
Description: Unicorn will develop a unique environment for decision knowledge support. Anyone (company, research institute, university, or parliament) can use it or set up an instance of their own for their own purposes or for distributing their own knowledge to others. This will produce global demand for expertise about the Unicorn environment and practices, and thus dissemination needs (see WP2).
In this workpackage the consortium partners will provide a synthetic view of overall achievements of their multidisciplinary research results and update it during the course of the project. The research outcomes will be used to support the decision making of the stakeholders. Our main role is to identify the benefits of involvement of the stakeholders, to identify appropriate stakeholders and the ways to work with them, to inform about the scope of the research and to share information, and to choose the best techniques for the engagement of stakeholders. The first, and perhaps the most critical, step in the stakeholder engagement process is to identify why the engagement activity is necessary, what outcomes are aimed at, and the scope and the context of the engagement.
Dissemination activities will focus on two major aspects of the project: first, events organized by the consortium, such as seminars, workshops and information days, and second, sharing of project outcomes. Both aspects will be implemented through a variety of means during the project. One dissemination channel will be the project website, where we will collect all the necessary material about efficient use of the Unicorn environment. The website will archive externally and internally accessible material, such as presentations, minutes from network meetings and workshops, and progress reports. Scientific and technical results will be disseminated to the wider scientific audience throughout the entire duration of the project and beyond by presenting relevant results in scientific journals and conferences. The main results of the project will also be disseminated by incorporating them into regular university courses.
Furthermore, the aim is to organize several events that support gaining and sharing information. We will hold small workshops, to which we will invite 2-5 specialists on the topic in focus to present and discuss the latest advances and possibilities. Moreover, we will organize a yearly seminar that brings the collaborative network together and updates the overall status of the project. Finally, we will organize a final symposium and provide a common design framework that includes all the models developed in the network.
Training events will be another means of dissemination during the project. We are interested in the user experience and usability of the environment. By organizing training events, we will get feedback and first-hand information about the usability of our solutions. Moreover, the events will increase the number of end-users and the interest in the system.
Tasks:
Deliverables:
WP3 Implementation of the Unicorn environment
Responsible group leader: Prof. Pekka Neittaanmäki (Dep. of Math. Inf. Tech. JYU)
Research Group: T. Tuovinen, A. Soranto, P. Korhonen/SYKE, M.Sc. A. Simola/VATT, J. Tuomisto/ THL
Schedule: M6-M36, M37-M72, (31+36)(36+36) (144 mmonth) Linking to: WP1, WP2, all other WPs.
Description: This work package is the practical and concrete core of the project. In this package, spanning nearly the whole project duration, we will implement and program the Unicorn environment. In WP3, we define the overall approaches for the implementation (tools, methods, targets) based on feedback from other work packages, especially WP1 and WP2. We will build up the requirements for the system based on the overall objectives and using open discussion. The key functionalities in this package are planning and design of concrete components and structures, building up the basis, programming the routines, implementing functionalities, verifying the results and solutions, and finally optimizing the Unicorn environment for research use. We will utilize existing open source solutions when available. The implemented solution will be measured by its reliability, usability and efficiency. Recommendations for after-project development will be described and documented. All solutions that are used will be openly documented.
In this package we will implement interfaces for open databases and links between the Unicorn environment and several model codes. Moreover, we will build up a test bed and challenge the environment with several developed Big Data analyzers. The results will be analyzed and documented for later use. During the project’s first phase, the aim is to build up a demonstrator-level environment. In the second phase, we focus on the efficiency and usability of Unicorn. Our target is not a fully commercial software package, because the production phase would consume too much time and resources. However, companies can easily utilize major parts of the implementation after the project because of our open approach. Moreover, specific problems and solutions will open markets for business.
Tasks:
Deliverables:
WP4 Unicorn pilot case: demonstration related to the environment
Responsible group leader: Adjunct Prof. Timo Huttula
Research Group: PhD. Janne Juntunen, Dr. Tech. Olli Malve, M.Sc. Janne Ropponen, PhD. Saija Koljonen, M.Sc. Niina Kotamäki, M.Sc. Esa Hirvonen, M.Sc. Päivi Korhonen, Ph.D. Antti Simola/VATT
Schedule: M1-M36, M37-M72, (100+80)
Linking to: WP3, WP5
Description: There is a strong pressure to migrate from environmental monitoring to environmental modeling because of a need for both more efficient use of resources and more relevant, nationwide results. The goals of this work package are:
We will show that by using Unicorn we are able to build a usable hydrological or water quality model or chain of models with first guess parameterizations, and can produce reasonable results without extensive model tuning. This enables us to study rapidly developing situations in previously unmodelled areas. The models utilized range from a box model (LLR,
For flow and transport model input we will use bathymetry, DEM (digital elevation model), hydrological and meteorological data as well as data on the simulation state variables that are already available in some form at various sources. For example, the operational hydrological system WSFS provides simulated hydrological data and forecasts (water levels and discharges) from all river and lake systems in the country and can be used as input for transport modeling if observational data is unavailable.
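As a rough illustration of the simplest end of the model range described above, the following sketch integrates a generic one-box (fully mixed) lake mass balance driven by discharge and load time series, such as those a hydrological system could supply. It is not the LLR model or any existing SYKE code. The function names, the first-order loss term and all parameter values are assumptions made only for this example.

```python
def simulate_box_model(c0, volume_m3, discharge_m3_per_day, load_g_per_day,
                       loss_rate_per_day, dt_days=1.0):
    """Integrate dC/dt = L/V - (Q/V + k) * C with the explicit Euler method.

    c0                    initial concentration (g/m3)
    volume_m3             lake volume V, assumed constant (hypothetical simplification)
    discharge_m3_per_day  sequence of outflow values Q, one per time step
    load_g_per_day        sequence of external load values L, one per time step
    loss_rate_per_day     first-order loss rate k (e.g. settling), an assumed parameter
    Returns the list of concentrations, starting with c0.
    """
    if len(discharge_m3_per_day) != len(load_g_per_day):
        raise ValueError("discharge and load series must have equal length")
    concentrations = [c0]
    for q, load in zip(discharge_m3_per_day, load_g_per_day):
        c = concentrations[-1]
        dc_dt = load / volume_m3 - (q / volume_m3 + loss_rate_per_day) * c
        concentrations.append(c + dt_days * dc_dt)
    return concentrations


def steady_state_concentration(volume_m3, discharge_m3_per_day, load_g_per_day,
                               loss_rate_per_day):
    """Concentration at which dC/dt = 0 for constant inputs: C = L / (Q + k * V)."""
    return load_g_per_day / (discharge_m3_per_day + loss_rate_per_day * volume_m3)
```

With constant inputs the simulated series should approach the steady-state value, which gives a quick sanity check for a first-guess parameterization. The explicit Euler step is only reliable when dt_days * (Q/V + k) is small.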
Reproducing existing, manually crafted models within Unicorn does not show the true potential of the system, since prior knowledge of the challenges encountered during model development will be taken into account when implementing the model setup. Therefore we need complementary demonstrations to assess the capability of the Unicorn environment. We will use Unicorn to model a new area, the lakes in the Talvivaara region. Another pilot site will be Lake Kallavesi, which is used as the raw water source for making artificial ground water for the City of Kuopio. We have previously shown that even a modestly calibrated lake model combined with a simple data-assimilation scheme clearly improves the prediction capability of the model compared to using the model or data alone (
Furthermore we are able to use the long term monitoring data for both chemical and biological scenarios based on the hydrodynamic models. This will enable us to respond swiftly to diverse environmental challenges (chemical fate models) and it will also work as a tool for directing environmental measures (e.g. prioritization of restorations).
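The idea of combining a lake model with observations through data assimilation can be sketched with a generic scalar update. The text does not specify which scheme the authors used, so the example below is an assumption: a textbook Kalman-type analysis step that weights a model forecast and an observation by their error variances. All names and numbers are hypothetical.

```python
def assimilate(forecast, forecast_var, observation, observation_var):
    """Blend a model forecast with an observation (scalar Kalman-type update).

    The gain is the share of the forecast-observation difference that is
    trusted: it tends to 1 when the model is uncertain and to 0 when the
    observation is uncertain. Returns (analysis, analysis_variance).
    """
    if forecast_var < 0 or observation_var < 0:
        raise ValueError("variances must be non-negative")
    if forecast_var + observation_var == 0:
        raise ValueError("at least one variance must be positive")
    gain = forecast_var / (forecast_var + observation_var)
    analysis = forecast + gain * (observation - forecast)
    analysis_var = (1.0 - gain) * forecast_var
    return analysis, analysis_var


# Hypothetical example: the model predicts 10.0 units and a measurement gives
# 12.0 units, both with error variance 4.0.
analysis, analysis_var = assimilate(10.0, 4.0, 12.0, 4.0)
```

The analysis variance is never larger than either input variance, which is one way to read the claim that model and data together predict better than either alone.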
Tasks:
Deliverables:
WP5: Unicorn pilot case: Demonstration related to human health
Responsible group leader: Adjunct Prof. Jouni Tuomisto (THL)
Research Group: Jukka Jokinen/THL, Arja Asikainen/THL, Tarja Pitkänen/THL, Sari Ung-Lanki/THL, Mikko Virtanen/THL, Saija Koljonen/SYKE.
Schedule: M1-M36, M37-M72 (91+44 mmonth)
Linking to: WP3, WP4, WP6
Description: The aim of this work package is to demonstrate and evaluate the functionality, accuracy and usability of models built within the Unicorn environment in pilot cases related to environment and health. We will also develop and implement practical end-user interfaces for citizens and municipalities. We have chosen three cases, within which we will combine environmental exposure data and citizen health data into a total burden of disease model that can be used as a basis for further assessments and tools.
Methylmercury is a persistent environmental pollutant and neurotoxin originating from both natural and industrial sources and contaminating fish in lakes. Spatial differences are large, and therefore customized recommendations are valuable for health protection authorities and people eating fish. In Unicorn, we will open the large methylmercury dataset measured by SYKE during the last decades and develop an open online model for fishing recommendations. Economic impacts will also be explored, as the results may affect the reputation of some summer cottage lakes. Thus, this work will be done in close collaboration between THL, SYKE, VATT, local authorities, and stakeholder groups.
Indoor environment quality is an important health issue, as a large fraction of Finnish people suffer from indoor problems. This is also a major economic issue, as exemplified by the 30-50 billion euro "renovation debt" in the Finnish housing stock due to moisture damage. In schools the renovation debt is 3.7 billion euro, and this calls for action in municipalities. THL will produce an online indoor air questionnaire for schools and day cares, including questions about students’ health and indoor environment quality. Interpretation guidance for these results will be produced based on existing reference material. Data collection will be conducted with the online questionnaire, and data will be collected nationally into the Opasnet database. An automated reporting system for individual schools and municipalities will be developed. Monitoring of indoor environment quality is regularly done in schools by the health protection authorities. Unicorn will produce an online data collection interface that can be used to combine monitoring and questionnaire data, and possibly the air pollution data measured by the municipality, for school level decision support. This requires collaboration with THL and municipalities, and with VATT for economic evaluation.
Drinking water causes 4-5 waterborne outbreaks annually (according to the national outbreak notification and reporting system RYMY), 20-30 cases of drinking water quality deterioration, and sporadic waterborne illnesses caused by Campylobacter, Giardia, and Legionella. Municipal waterworks make more than 100 000 chemical and microbiological analyses of drinking water annually, but the data is underused both in municipalities and nationally. The national YHTI database (containing municipality data on health protection) is actively developing the management of these data. Unicorn will take that data as a part of its modelling system using ReplicaX (see WP7) and analyse it against health data available in THL. The aim is to increase awareness of the possibilities of statistical analysis, build capability in using it, and offer decision support to municipalities about preventive management. This is a close collaboration between THL, Valvira (National Supervisory Authority on Health and Welfare), municipalities, JyU and VATT (for economic impacts in WP6).
Tasks:
Deliverables:
WP6 Unicorn pilot case: Demonstration related to national and regional economy
Responsible group leader: Research Director Juha Honkatukia (VATT)
Research Group: M.Sc. Antti Simola, N.N.
Schedule: M1-M36, M37-M72
Linking to: WP3, WP4, WP5
Description: We will demonstrate the applicability of Unicorn in assessing the linkages between environmental state and regional economy in the Conpat project region and two other regions. We will use existing VATT models and possibly yet-to-be-specified open source statistical and CGE models in the Unicorn environment. VATT models rely on a detailed database that is unfortunately not open. An open but less detailed version of the data will be applied in open source models when suitable, in order to make CGE techniques more available and open to decision makers.
As background to these studies, the development of VATT models started in the 1990s and has aimed at wide applicability in decision-making. In particular, the growing demand for quantitative policy analysis has ensured that the VATT models have fulfilled this aim. Policy issues that affect several sectors or have opposing impacts are very often analytically intractable, leaving computational analysis as the main way to analyse them.
The VATT models are computable general equilibrium (CGE) models of the Finnish economy. The single-country model VATTAGE (
One attractive feature of CGE models is that they conform to the national account systems. Thus the model results are interpretable in that context and the effects can be expressed as changes in economic indicators. Furthermore, the underlying input-output structure allows a consistent way to extend the economic analysis to material flow accounting. Consequently one of the main application areas has been interdisciplinary research with various collaborators. For instance, VERM applications include extreme weather events (
The aforementioned experience in interdisciplinary research is a good starting point for the more general approach of an automatically generated modeling tool. Aside from equilibrium modeling, the VATT researchers also have experience in econometric methods that are frequently used in model parameterization.
Private sector use of open data is already extensive. The public sector lags behind mainly because it faces more complex problems: the required information does not concern market segments of a single commodity but a whole mix of industries in the economy, its demography, long-term investments in infrastructure, planning of land use, and policies countering externalities and distributional issues. National account systems were created in order to convey consistent information on national economies. They serve as a natural starting point for CGE models and for organizing open data that would benefit public decision-making.
Drinking water management.
We examine the Conpat study area in relation to the management of waterborne outbreaks and the regional economy, in collaboration with SYKE (WP4) and THL (WP5). The economic effects include direct effects on labor productivity and regional trade balance, and indirect effects on consumer behavior. The former derive straightforwardly from production theory, and the latter from prospect theory. CGE modeling is a consistent way of evaluating both direct and indirect effects simultaneously.
Regional economic consequences of Talvivaara mining operations.
For this task we construct a simple regional CGE model, which can feasibly be solved a multitude of times in a Monte Carlo manner in order to account for uncertainty in economic and environmental outcomes. We use VERM or an open source alternative. With this ex post analysis we can demonstrate how an equivalent ex ante analysis would contribute to decision making by providing more balanced information on risks and unwanted consequences. The theoretical focus is on the political economy of regional development. This is a close collaboration with SYKE (WP4) and local authorities.
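The Monte Carlo procedure described above can be sketched as follows. The function `toy_regional_outcome` is a hypothetical stand-in for a single model solve and is not a CGE model; the input distributions, units and parameter names are likewise assumptions chosen only to show the repeated-solve pattern and how the resulting outcome distribution can be summarized.

```python
import random
import statistics


def toy_regional_outcome(ore_price, cleanup_cost, employment_multiplier):
    """Hypothetical stand-in for one model solve. NOT a CGE model."""
    mining_income = 100.0 * ore_price
    indirect_income = mining_income * (employment_multiplier - 1.0)
    return mining_income + indirect_income - cleanup_cost


def monte_carlo(n_draws, seed=0):
    """Solve the toy model repeatedly with uncertain inputs drawn at random."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_draws):
        ore_price = rng.lognormvariate(0.0, 0.3)         # economic uncertainty
        cleanup_cost = rng.triangular(0.0, 200.0, 50.0)  # environmental uncertainty
        multiplier = rng.uniform(1.2, 1.8)               # regional spillover uncertainty
        outcomes.append(toy_regional_outcome(ore_price, cleanup_cost, multiplier))
    return outcomes


outcomes = monte_carlo(10_000)

# 19 cut points at 5 % steps: index 0 is the 5th percentile, 9 the median, 18 the 95th.
cuts = statistics.quantiles(outcomes, n=20)
p05, p50, p95 = cuts[0], cuts[9], cuts[18]
prob_loss = sum(o < 0 for o in outcomes) / len(outcomes)

print(f"5 %: {p05:.1f}, median: {p50:.1f}, 95 %: {p95:.1f}")
print(f"Share of draws with a net loss: {prob_loss:.3f}")
```

Reporting an interval and the probability of an unwanted outcome, rather than a single point estimate, is what gives decision makers the more balanced view of risks that the task aims for.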
School renovations.
We use VATTAGE to assess the short- and long-term consequences of neglecting the renovation of schools with indoor problems. With correct timing, the renovation investments could serve as a stimulus. They also have potential long-term productivity effects that are not optimized by markets alone. Thus our approach yields valuable information for public decision makers. In the short-run analysis we assess the stimulatory effects of renovations. In the long-run analysis we apply a recent demographic extension of VATTAGE for evaluating the long-term economic costs of shortsighted decision making. This is a close collaboration with THL (WP5).
Tasks:
Deliverables:
WP7 Statistical methods, models and big data
Responsible group leader: Prof. Juha Karvanen (Dep. of Math. and Stat., Univ. of Jyväskylä)
Research Group: Jouni Helske, N.N.
Schedule: M1-M72 (124 mmonth)
Linking to: WP4, WP5, WP6
Description: Governmental institutes should publish their data as open data whenever not prohibited by confidentiality requirements. In practice, many datasets collected e.g. by THL contain sensitive information such as personal level health data. Naive anonymization, i.e. removal of names, addresses and personal identity numbers, is not sufficient to make the data publishable because it is often possible to deduce identity from multivariate data using e.g. age, place of residence, profession, language or medical history.
Synthetic data is offered as a solution for this openness – confidentiality dilemma. Synthetic data, or a data replica, is created by means of simulation so that the statistical properties of the replica closely resemble those of the original data. The individuals in the original data cannot be identified from the replica, and therefore the replica can be published as open data.
Synthetic data offers new possibilities for citizen science. The program code developed for the replica can be applied to the original data without any changes. The publisher of the data can therefore easily verify the analysis results with the original data. This enables an operating model where some parts of the data analysis are carried out by enthusiastic citizens (e.g. university students) and the employees of the governmental institute coordinate the work.
The concept has already been piloted: the R code implementation ReplicaX by Juha Karvanen won the challenge “Utilization of health data” in the Apps4Finland 2013 competition. As a part of the project, ReplicaX will be developed further, tested extensively with real data and put into full-scale production use.
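The replica idea can be illustrated with a deliberately simple sketch. This is not ReplicaX and makes no claim about disclosure risk: it assumes a two-variable dataset, fits a bivariate normal distribution to it, and simulates new records from the fitted distribution. The "original" data here is itself made up. The point of the example is that the same analysis function runs unchanged on both the original and the replica.

```python
import math
import random
import statistics


def summarize(rows):
    """Analysis code: means, standard deviations and correlation of (x, y) rows."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in rows) / (len(rows) - 1)
    return mean_x, mean_y, sd_x, sd_y, cov / (sd_x * sd_y)


def simulate_replica(params, n, seed=0):
    """Draw n synthetic rows from a bivariate normal with the given summary."""
    mean_x, mean_y, sd_x, sd_y, rho = params
    rng = random.Random(seed)
    residual_scale = math.sqrt(max(0.0, 1.0 - rho ** 2))
    replica = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = mean_x + sd_x * z1
        y = mean_y + sd_y * (rho * z1 + residual_scale * z2)
        replica.append((x, y))
    return replica


# Made-up "confidential" data: (age, systolic blood pressure) for 500 people.
rng = random.Random(42)
original = []
for _ in range(500):
    age = rng.uniform(20.0, 80.0)
    original.append((age, 100.0 + 0.5 * age + rng.gauss(0.0, 8.0)))

original_summary = summarize(original)
replica = simulate_replica(original_summary, n=2000)

# The same analysis code is applied to the replica without any changes.
replica_summary = summarize(replica)
print("original:", [round(v, 2) for v in original_summary])
print("replica: ", [round(v, 2) for v in replica_summary])
```

A real replica generator would need to handle many variables, non-normal and categorical data, and a formal assessment of identification risk, none of which this sketch attempts.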
In order to efficiently combine multiple databases and models, state-of-the-art statistical methods are needed. As databases contain data with varying levels of uncertainty (stemming from data collection strategies, modeling choices and sampling variation, among others), different sources of information must be weighted accordingly. For assessing these uncertainties, a Bayesian framework can be used to combine expert opinions, multiple data sources and models in a way that gives easily interpretable results in the form of probability distributions. This enables decision makers to make sophisticated forecasts under alternative scenarios.
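As a minimal sketch of weighting sources by their uncertainty, consider the simplest Bayesian case: several independent estimates of the same quantity, each summarized as a normal distribution, combined under a flat prior. The posterior is then normal, with a mean that weights each source by its precision (the inverse of its variance). The source labels and numbers below are hypothetical, and the independence and normality assumptions are simplifications.

```python
import math


def combine_normal_estimates(estimates):
    """Combine independent normal estimates given as (mean, sd) pairs.

    Assumes a flat prior and independent sources, so the posterior is normal
    with precision equal to the sum of the individual precisions.
    """
    precisions = [1.0 / sd ** 2 for _, sd in estimates]
    total_precision = sum(precisions)
    mean = sum(p * m for p, (m, _) in zip(precisions, estimates)) / total_precision
    return mean, math.sqrt(1.0 / total_precision)


# Hypothetical estimates of one quantity from three kinds of sources.
sources = {
    "expert opinion": (12.0, 3.0),
    "monitoring data": (10.0, 1.0),
    "model output": (14.0, 2.0),
}

posterior_mean, posterior_sd = combine_normal_estimates(list(sources.values()))
print(f"posterior mean {posterior_mean:.2f}, posterior sd {posterior_sd:.2f}")
```

The combined estimate lies closest to the most precise source, and its standard deviation is smaller than that of any single source, so the result can be reported directly as a probability distribution.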
A large proportion of the big data stored today is inherently time series data. When building generic models, taking into account the time dependency in the data is crucial in order to make proper inferences from the results. E.g. in (
Analyzing data with complex time and cross-sectional dependencies with varying sampling frequencies requires flexible models, which are robust enough yet still give meaningful results in realistic computational time. General purpose Bayesian modeling software such as OpenBUGS often requires considerable tuning of the estimation procedures, which are not well suited for time series data due to autocorrelation structures of the simulations relating to model estimation. Therefore it is important to build reliable and easy-to-use tools for analyzing various types of data from open databases. Similar but more restricted methods for forecasting univariate time series in a frequentist framework were presented in (
Without proper software, analysts are forced to use their old methods whether or not they are suitable for the problem. The aim is to build an efficient and robust Bayesian modeling framework for the open source software R, which can be used to model multivariate time series data with complex patterns and varying sampling frequencies, taking into account multiple sources of information and the uncertainties related to data and model structure.
Tasks:
Deliverables:
The annual costs are presented in
Estimated UNICORN-project budget based on Finnish unit costs.
| | Year 1 | Year 2 | Year 3 | Year 4 | Year 5 | Year 6 | Year 7 | Total costs (euro) |
| Working time (m/m) | 39 | 144 | 137 | 136 | 110 | 102 | 49 | 6976590 |
| Travel (euro) | 20000 | 38000 | 34000 | 38000 | 25000 | 35000 | 14000 | 204000 |
| Material (euro) | 2000 | 4000 | 3500 | 500 | 500 | 500 | 0 | 11000 |
| Machines (euro) | 25000 | 22000 | 20000 | 20000 | 17000 | 17000 | 17000 | 138000 |
| Services (euro) | 40000 | 37000 | 47000 | 54000 | 42000 | 32000 | 31000 | 283000 |
| Other costs (euro) | 10500 | 63000 | 15000 | 15500 | 13000 | 15000 | 13000 | 145000 |
| Total (euro) | 97500 | 164000 | 119500 | 128000 | 97500 | 99500 | 75000 | 7757590 |
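The arithmetic of the budget table can be checked with a short script. The figures are copied from the table above. The reading that the yearly totals cover only the euro-denominated rows, while the grand total also includes the working-time cost, is inferred from the numbers and is not stated in the text.

```python
# Euro-denominated cost rows: (yearly values for years 1-7, stated row total).
costs = {
    "Travel": ([20000, 38000, 34000, 38000, 25000, 35000, 14000], 204000),
    "Material": ([2000, 4000, 3500, 500, 500, 500, 0], 11000),
    "Machines": ([25000, 22000, 20000, 20000, 17000, 17000, 17000], 138000),
    "Services": ([40000, 37000, 47000, 54000, 42000, 32000, 31000], 283000),
    "Other costs": ([10500, 63000, 15000, 15500, 13000, 15000, 13000], 145000),
}
stated_yearly_totals = [97500, 164000, 119500, 128000, 97500, 99500, 75000]
working_time_cost = 6976590  # total cost of working time in euro
stated_grand_total = 7757590

# Each row's yearly values should add up to the stated row total.
rows_ok = {name: sum(years) == total for name, (years, total) in costs.items()}

# Each year's column sum over the euro rows should match the stated yearly total.
column_sums = [sum(years[i] for years, _ in costs.values()) for i in range(7)]
columns_ok = column_sums == stated_yearly_totals

# The grand total should equal working-time cost plus all euro row totals.
grand_ok = (
    working_time_cost + sum(total for _, total in costs.values())
    == stated_grand_total
)

print(rows_ok, columns_ok, grand_ok)
```

All three checks hold for the figures as printed, which supports the reading that the yearly totals exclude personnel costs.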
The main actions of the project during the project years are as follows. A more detailed time chart on the task level is presented in
Year 1: Identification of different groups of stakeholders; Inquiries and questionnaires; Negotiations within the consortium; Invitations to the kick-off seminar; The large kick-off seminar in Tampere, December.
Year 2: Meetings concerning specific themes of the project; Inquiries and questionnaires; Clarifying the role of the different stakeholder groups; Information about the first results of the project to stakeholders; Annual seminar; Circular to the stakeholders including the main contents of the annual progress report.
Year 3: Meetings concerning specific themes of the project; Clarifying the role of the different stakeholder groups; Information about the results of the project to stakeholders; Annual seminar; Practical demonstrations; Circular to the stakeholders including the main contents of the annual progress report.
Year 4: Meetings concerning specific themes of the project; Information about the results of the project to stakeholders; Annual seminar; Practical demonstrations; Circular to the stakeholders including the main contents of the interim report.
Year 5: Revising the role of the different stakeholders; Meetings concerning specific themes of the project; Information about the results of the project to stakeholders; Annual seminar; Practical demonstrations.
Year 6: Meetings concerning specific themes of the project; Information about the final results of the project to stakeholders; Practical demonstrations.
Year 7: Final seminar; Publications.
The proposal was submitted in 2015 to the Strategic Funds of the Academy of Finland. It was rejected as too ambitious and as having low commercial potential. We strongly believe that the proposed Unicorn environment and its growing community of developers can achieve abundant commercial success. The authors are open to any further funding suggestions and to forming new consortia.
University of Jyväskylä