Research

The MUPI group carries out research on statistical machine learning and artificial intelligence.

Our main goals are to

  1. Develop computational techniques for learning under uncertainty, based on approximate Bayesian inference and probabilistic programming
  2. Solve interesting data-driven applications, focusing on problems that are short on data and hence need advanced modeling techniques based on data integration, semi-supervised learning, etc.

Most of our current activities fall under the first three topics (Probabilistic inference, AI for ultrasonics, and Virtual Laboratories), which have active ongoing projects, but we also still work on the other topics listed here.

Probabilistic inference

Statistical machine learning provides tools for understanding complex data collections, using Bayesian inference to cope with the uncertainty about model parameters that arises from learning from finite data. We develop computationally efficient and maximally automatic approximate algorithms for Bayesian inference, in the context of probabilistic programming and machine learning. Our goal is to let the user focus on model specification without needing to worry about the specifics of inference.
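As a minimal illustration of how Bayesian inference represents the uncertainty that comes from finite data, the sketch below uses a textbook conjugate Beta-binomial model. It is not one of the group's methods; the function name `beta_posterior` and the example counts are hypothetical and chosen only for illustration.

```python
import math


def beta_posterior(successes, failures, prior_a=1.0, prior_b=1.0):
    """Return the posterior mean and standard deviation of a success probability.

    Assumes a Beta(prior_a, prior_b) prior and binomial observations, so the
    posterior is Beta(prior_a + successes, prior_b + failures).
    """
    a = prior_a + successes
    b = prior_b + failures
    mean = a / (a + b)
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(variance)


# Same observed success rate (70%), but ten times more data in the second case.
small_mean, small_sd = beta_posterior(successes=7, failures=3)
large_mean, large_sd = beta_posterior(successes=70, failures=30)

print(f"10 observations:  mean={small_mean:.3f}, sd={small_sd:.3f}")
print(f"100 observations: mean={large_mean:.3f}, sd={large_sd:.3f}")
```

With the same observed success rate, the posterior standard deviation is smaller for the larger data set, which is the sense in which the posterior quantifies uncertainty due to limited data. Realistic models rarely have such closed-form posteriors, which is why approximate inference is needed.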

Highlights:

  1. Efficient Riemannian Monte Carlo methods based on a metric with a computationally efficient inverse (Hartmann et al., AISTATS 2022; Yu et al., TMLR 2023).
  2. Riemannian Laplace approximation (Yu et al., AISTATS 2024).
  3. Integration of approximate posterior inference and decision-making, by calibrating variational approximations to better account for the eventual decision task (Kuśmierczyk et al., NeurIPS'19) and by post-hoc calibration with flexible decision-making modules (Kuśmierczyk et al., AAAI 2020).
  4. Review of the current state of prior elicitation with recommendations for future directions (Mikkola et al., Bayesian Analysis 2024).
  5. Using the prior predictive distribution for expert knowledge elicitation (Hartmann et al., UAI 2020) and for fast hyperparameter optimization of hierarchical Bayesian models (da Silva et al., JMLR 2023).
  6. Bayesian inference under combinatorial constraints (Klami and Jitta, UAI 2016).

Current projects:

  1. Efficient Riemannian Inference (ERI, 2022-2024). Development of efficient geometric MCMC algorithms for Bayesian inference. Funded by .
  2. Computationally efficient inference on Riemann embedding manifolds (CORE, 2022-2025). Research Council of Finland postdoctoral researcher project of Marcelo Hartmann.
  3. , Agile probabilistic AI research theme.

Past projects:

  1. Scalable probabilistic analytics (SPA, 2016-2018). Development of computationally efficient variational inference algorithms for probabilistic programs. Funded by , , and . Collaboration with and .
  2. Reliable Automatic Bayesian Machine Learning (RAB-ML, 2018-2019). Development of reliable and efficient solutions for Bayesian machine learning and probabilistic programming. Funded by . Consortium with  and .

AI for ultrasonics

We build machine learning and artificial intelligence tools for modeling ultrasound propagation in complex environments. We develop methods for, e.g., inverse problems (detecting fouling or deformations), focusing ultrasound for cleaning, and acoustic levitation. The work is done together with the group of  and  working on ultrasound physics, and we also collaborate with , which provides practical ultrasonic cleaning solutions.

Our main activities currently aim at sustainable and safe cleaning of industrial production equipment, reducing the environmental and economic harm of fouling that accumulates over time. We develop AI-enhanced sensing technologies for detecting and quantifying the fouling and for controlling the cleaning process so that the risk of damage is minimized.

Highlights:

  1. First example of an AI model for detection and monitoring of fouling during ultrasonic sensing, based on monotonicity (Rajani et al., MLSP'18).
  2. Detection and fouling quantification with Gaussian processes using line integral observations (Sillanpää et al., IEEE International Ultrasonics Symposium (IUS), 2019; Longi et al., Uncertainty in Artificial Intelligence (UAI), 2020).
  3. Neural networks for localization of internal structure with chaotic cavity (Sillanpää et al., AIP Advances, 2021).

Current projects:

  1. Sustainable industrial ultrasonic cleaning (SIUC, 2023-2025). Ensuring that the sensing and cleaning technologies can be used in a sustainable and safe manner. The project has received funding from the European Union ( instrument) and is funded by the Research Council of Finland. Consortium with .

Past projects:

  1. Machine learning for ultrasonic cleaning (ML-UC, 2019-2023). Development of machine learning and artificial intelligence solutions for modeling ultrasound propagation in fouled structures. Funded by .
  2. Ultrasonic AI-powered Industrial Sensing Platform (UA-ISP, 2021-2023). Development of an AI-driven distributed ultrasonic sensing platform for low-cost holistic monitoring of complex environments. Funded by  ICT 2023 program. Consortium with .

Virtual Laboratories

Virtual laboratories are a new perspective on scientific knowledge generation. Many of the elements in scientific discovery are general across scientific domains, and by isolating them from domain-specific elements (models, simulations, theories) we can develop AI techniques for assisting scientific discovery as well as industrial R&D more efficiently. Any research environment, for instance a natural science laboratory, can leverage these techniques by framing its operations as a virtual laboratory.

Highlights:

  1. Klami et al. provide a high-level vision of the concept.
  2. , a proof-of-concept open source tool for setting up virtual laboratories

Current projects:

  1. Virtual Laboratories for pharmaceutical R&D (Business Finland Co-Research project, 2023-2025). Research towards establishing Virtual Laboratories in pharmaceutical research, in collaboration with research partners (Aalto), (Aalto) and (Univ. of Helsinki) and industrial partners , , , and .

Human behavior

Today it is easy to collect information about individuals by monitoring their activities, either with explicit sensors or through log data collected by the computers they interact with. We develop the models required for inferring interesting and useful information from such data, in order to describe, understand and enhance our daily life.

Highlights:

  1. Modelling risk behavior of individuals from observed data in the context of computer games (Tanskanen et al., ACML'21).
  2. Models for learning the personalised effect of interactions (personalised treatment effect, or uplift) from highly unbalanced data collections (Nyberg et al., ACML'21).
  3. Modeling keyboard usage in programming education and other educational contexts (Leinonen et al., ACM TSCSE'16).
  4. Human-computer interaction in touch-based information retrieval (Andolina et al., IUI'15).
  5. Understanding intentions based on brain signal analysis, in particular MEG recordings during natural tasks (Kauppi et al., NeuroImage 2015).

Past projects:

  1. Traces of Information: Intelligence from Fragmented Sources (ToI, 2013-2019). Academy Research Fellow project of Arto Klami.
  2. Machine Insight for Behavioral Analytics (MINERAL, 2019-2022). Business Finland Co-Innovation project. Consortium with (Aalto University) and companies.

Data integration

Machine learning research is often carried out in the context of elegant but simplified setups: it is assumed that all relevant data is provided in the form of a simple matrix or tensor. In most practical applications this is not the case; instead, we need to combine information scattered across multiple data sources of heterogeneous nature. We provide fundamental modeling solutions for combining such data sources.

Highlights:

  1. Bayesian canonical correlation analysis and inter-battery factor analysis (IBFA), and their extension, group factor analysis (GFA), for discovering relationships between more than two parallel data sources (Klami, Virtanen and Kaski, JMLR, 2013; Klami et al., IEEE Transactions on Neural Networks and Learning Systems, 2015).
  2. Matrix factorization tools for discovering relationships between more complex setups and heterogeneous data (Klami, Bouchard and Tripathi, ICLR 2014; Klami, ACML 2015).
  3. Cross-domain object matching for discovering relationships between data sources with no known pairing between objects (Jitta and Klami, Advanced Methodologies for Bayesian Networks, 2017; Klami, Machine Learning, 2013).

Past projects:

  1. Traces of Information: Intelligence from Fragmented Sources (ToI, 2013-2019). Academy Research Fellow project of Arto Klami.
  2. Improved Learning by Combining Information Sources (ILCIS, 2013-2015). Funded by Xerox Research Foundation, collaboration with Abhishek Tripathi () and .

Hyperspectral imaging

Hyperspectral (HS) cameras capture the full spectrum of light, instead of just the three channels of red, green and blue that mimic the limited vision of humans. Having access to this richer information makes most computer vision problems easier. The existing HS cameras are, however, expensive and large. We develop a low-cost alternative that uses AI to process images captured with a passive add-on device that can be attached to any camera, bringing HS imaging to smartphones and DSLRs. We also work on hyperspectral image analysis.

Highlights:

  1. Deep learning algorithms for hyperspectral image acquisition (Toivonen et al., Machine Vision and Applications, 2020).
  2. Methods and applications for hyperspectral image interpretation (Luotamo et al., IEEE Transactions on Geoscience and Remote Sensing, 2020; Toivonen et al., Annales Botanici Fennici, 2020).

Past projects:

  1. Mobile hyperspectral imaging and computer vision platform (2019-2021). Development of methods and algorithms for acquisition of hyperspectral images with mobile devices, their use for computer vision, and preparation for commercialization of the result. Funded by under the instrument.

Data-efficient modeling

Many modern machine learning models are complex and require large training data sets, which makes learning difficult in applications where labeling examples is costly or difficult. We study data-efficient techniques for learning complex models from limited supervision, by utilizing related learning tasks (multi-task learning, transfer learning) and unlabeled observations or external constraints (semi-supervised learning). We also develop solutions for changing environments based on domain adaptation.

Highlights:

  1. Structured pseudo-labels for semi-supervised learning based on output-space constraints or smoothness assumptions (Longi, Pulkkinen and Klami, ACML'17).
  2. Transfer learning in computer science education (Lagus et al., ACM Transactions on Computing Education, 18(4), 2018).
  3. Segmentation of large multispectral satellite images based on coarse annotations in limited-memory computational architecture (Luotamo et al., IEEE Transactions on Geoscience and Remote Sensing, 2020).

Past projects:

  1. Scalable probabilistic analytics (SPA, 2016-2018). Development of computationally efficient variational inference algorithms for probabilistic programs. Funded by , , and . Collaboration with and .