Laura Uusitalo defends her PhD thesis on Bayesian network modelling of complex systems with sparse data

On Friday the 11th of November 2022, PhD, M.Sc. Laura Uusitalo defends her PhD thesis, Bayesian network modelling of complex systems with sparse data: Ecological case studies. The thesis is related to research done in the Department of Computer Science and in the Spatiotemporal Data Analysis group.

PhD, M.Sc. Laura Uusitalo defends her doctoral thesis Bayesian network modelling of complex systems with sparse data: Ecological case studies on Friday the 11th of November 2022 at 13:00 in the University of Helsinki Main building, Room U3032 (Unioninkatu 34, 3rd floor). Her opponent is Associate Professor Simo Särkkä (Aalto University) and the custos is Associate Professor Laura Ruotsalainen (University of Helsinki). The defence will be held in English.

Laura Uusitalo's thesis is part of the research done in the Department of Computer Science and in the Spatiotemporal Data Analysis group at the University of Helsinki. Her supervisors have been Associate Professor Laura Ruotsalainen (University of Helsinki) and Reader Allan Tucker (Brunel University London, United Kingdom).

Bayesian network modelling of complex systems with sparse data: Ecological case studies

This thesis discusses how Bayesian networks can be used to improve data analytics in the field of environmental assessment and management. The data-analytic challenge is that ecosystems are complex and potentially changing, while the available data are relatively sparse both in terms of the number of observations and in which ecosystem components they cover. This thesis takes steps towards better analysis of these sparse data through combining pre-existing, uncertain information such as modelling results and expert knowledge with modern, probabilistic data analysis.

The first theme of the thesis is how variable discretization in Bayesian network classifiers, particularly tree-augmented Naïve Bayes, can help understand the relationships between environmental factors at different levels. This work explores different discretizations of the class variable, discusses the implications of the differences between the resulting models, and contributes to the still relatively quiet discussion about discretization schemes in Bayesian networks.
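To illustrate why the choice of discretization matters, the sketch below compares two common schemes, equal-width and equal-frequency binning, on a skewed variable. It is only an illustration and not the method used in the thesis: the function names, the variable name and the measurements are all made up for this example.

```python
def equal_width_bins(values, k):
    """Assign each value to one of k equally wide intervals (labels 0..k-1)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        # All values are identical, so everything falls into one bin.
        return [0 for _ in values]
    # The maximum value would land in bin k, so clamp it to the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]


def equal_frequency_bins(values, k):
    """Assign each value to one of k bins holding roughly equal numbers of observations."""
    # Rank the observations; tied values may end up in neighbouring bins.
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * k // len(values)
    return labels


# Hypothetical, made-up measurements of a skewed class variable.
chlorophyll = [1.0, 1.2, 1.5, 2.0, 2.2, 3.0, 9.0, 10.0]

width_labels = equal_width_bins(chlorophyll, 2)
freq_labels = equal_frequency_bins(chlorophyll, 2)

print("equal width:    ", width_labels)
print("equal frequency:", freq_labels)
```

With equal-width bins the two extreme values form a class of their own, whereas equal-frequency bins split the observations in half. A classifier trained on each labelling would therefore be answering a different question.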

The second theme is detecting change in ecological processes from sparse, often noisy data. This thesis builds dynamic Bayesian network models to detect change in the Central Baltic Sea ecosystem interactions, and explores the effect of different model structures on detecting ecosystem change. It is shown that the hidden variables of the models can identify ecosystem change, and that this result does not depend on the exact model structure or hidden variable set-up.
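The idea of a hidden variable signalling change can be sketched with the simplest dynamic Bayesian network, a hidden Markov model with one two-state hidden variable. The example below is a minimal illustration under stated assumptions, not the models built in the thesis: the state and observation meanings, the probabilities and the observation series are all invented, and a missing observation is represented by None.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(hidden state at t | observations up to t) for every time step.

    prior[i]         : probability of hidden state i at the first time step
    transition[i][j] : probability of moving from state i to state j
    emission[j][o]   : probability of observing symbol o in state j
    observations     : list of observed symbols; None marks a missing value
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for t, obs in enumerate(observations):
        if t > 0:
            # Prediction step: propagate the belief through the transition model.
            belief = [sum(belief[i] * transition[i][j] for i in range(n))
                      for j in range(n)]
        if obs is not None:
            # Update step: weight each state by how well it explains the data.
            belief = [belief[j] * emission[j][obs] for j in range(n)]
        total = sum(belief)
        belief = [b / total for b in belief]
        history.append(belief)
    return history


# Hypothetical set-up: hidden state 0 = "old regime", 1 = "new regime";
# observation 0 = "low", 1 = "high". All numbers are made up.
prior = [0.9, 0.1]
transition = [[0.95, 0.05], [0.05, 0.95]]
emission = [[0.8, 0.2], [0.2, 0.8]]
observations = [0, 0, None, 1, 1, 1]

beliefs = forward_filter(prior, transition, emission, observations)
for t, b in enumerate(beliefs):
    print(t, [round(p, 3) for p in b])
```

In this toy series the probability of the "new regime" state rises once the observations switch from low to high, and the missing value is handled by carrying the prediction forward without an update.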

The third theme is decision support models that aim to integrate information regarding all interlinked aspects of the decision problem, such as different parts of the ecosystems as well as economic and societal considerations. To be useful, decision support models need to be able to provide estimates of the uncertainty of the different assessments and projections. This thesis reviews and evaluates various uncertainty assessment methods. Further, this thesis builds a large probabilistic meta-model to demonstrate how a Bayesian network based decision support model can be used to summarise a large body of research and model projections about potential management alternatives and climate scenarios.

Bayesian networks are showing their strength for different tasks of environmental data analytics. Elegant handling of missing data, explicit and rigorous handling of uncertainty, and the possibility to use prior scientific knowledge and data together in a transparent way are strong advantages of Bayesian analysis for environmental data, which are often scarce and contain missing values. In addition to being flexible and, thus, able to integrate different types of information and data, Bayesian networks are transparent, allowing critical assessment and discussion of the models. This is important as environmental data analytics are often used to support decision making on the use of ecosystems, affecting the lives of current and future generations.

Availability of the dissertation

An electronic version of the doctoral dissertation is available on the e-thesis site of the University of Helsinki at http://urn.fi/URN:ISBN:978-951-51-8633-1.

Printed copies will be available on request from Laura Uusitalo: laura.z.uusitalo@helsinki.fi.