Research

The Machine and Human Intelligence group focuses on probabilistic machine and human learning.

We are interested in smart probabilistic algorithms, as implemented by brains and machines, that are robust and sample-efficient — ready to be used "in the wild". We leverage meta-learning and resource constraints as two key elements to build intelligent systems.

  1. Primarily, our research focuses on developing new "smart" probabilistic machine learning methods for approximate Bayesian inference and decision making, with an emphasis on surrogate-based and amortized methods that combine modern generative techniques, foundation models, and old-school Bayesian nonparametrics.
  2. We also contribute to studying probabilistic inference and decision making in humans and other animals via our collaborations — both as a scientific domain of application of our methods and as a source of inspiration to understand intelligent behavior.

This page may be slightly out of date, as we keep expanding our research in new directions. We recommend visiting our Publications page for more detailed and up-to-date information about past and ongoing projects.

Amortized Inference and Deep Learning

Our main focus is developing methods for amortized inference, where the goal is to meta-learn reusable computational solutions that can be quickly applied to new instances of inference and decision-making problems in science, medicine, and engineering. This approach combines the power of deep learning with Bayesian inference, allowing us to tackle complex inference tasks that would be computationally prohibitive with traditional methods. We aim to build frameworks that are efficient and scalable while maintaining the principled nature of probabilistic inference.
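
To make the idea concrete, below is a minimal sketch of amortized inference on a toy Gaussian model: a network is trained on many simulated (parameter, dataset) pairs so that, once trained, posterior inference on a new dataset is a single forward pass. The architecture, summary statistics, and training setup are illustrative assumptions, not our ACE framework.

```python
# Toy amortized (neural) posterior estimation: an illustrative sketch,
# not the group's ACE architecture.
import torch
import torch.nn as nn

def simulate(n_tasks: int, n_obs: int = 10):
    """Toy generative model: theta ~ N(0, 1), x_i ~ N(theta, 1)."""
    theta = torch.randn(n_tasks, 1)
    x = theta + torch.randn(n_tasks, n_obs)
    return theta, x

# The inference network maps a dataset summary to Gaussian posterior parameters.
net = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 2))

def posterior_params(x):
    # (mean, std) of the data is a convenient summary for this toy model.
    summary = torch.stack([x.mean(dim=1), x.std(dim=1)], dim=1)
    out = net(summary)
    mu, log_sigma = out[:, :1], out[:, 1:]
    return mu, log_sigma.exp()

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(2000):
    theta, x = simulate(n_tasks=128)
    mu, sigma = posterior_params(x)
    # Maximize log q(theta | x), amortizing over many simulated tasks at once.
    loss = -torch.distributions.Normal(mu, sigma).log_prob(theta).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# At test time, inference for a new dataset costs one forward pass.
theta_new, x_new = simulate(n_tasks=1)
mu, sigma = posterior_params(x_new)
print(f"approximate posterior: N({mu.item():.2f}, {sigma.item():.2f}^2)")
```

For this conjugate toy model the exact posterior is known (mean n*mean(x)/(n+1), variance 1/(n+1)), so the trained network can be checked against ground truth; real amortized-inference problems replace the hand-crafted summary with learned set or sequence encoders.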

Highlights:

  1. Our Amortized Conditioning Engine (ACE) is a novel transformer-based meta-learning framework that enables flexible conditioning on both observed data and interpretable latent variables (Chang, Loka, Huang, Remes, Kaski & Acerbi, 2025; AISTATS). ACE can incorporate user-provided priors at runtime and generates predictive distributions for both discrete and continuous data. It shows how multiple tasks, such as image completion, Bayesian optimization, and simulation-based inference, all fall within the unifying framework of probabilistic conditioning and prediction. Much of the ongoing work in our lab extends this foundational paradigm.
  2. Relatedly, we investigate amortized experimental design and decision-making in works such as the Transformer Neural Decision Process (TNDP), which simultaneously suggests experimental designs and infers final decisions (Huang, Guo, Acerbi & Kaski, 2024; NeurIPS).
  3. Our work builds on the large body of research on neural processes. We contributed to this field with our Relational Conditional Neural Processes, which use relative rather than absolute information to efficiently implement translational and other equivariances (Huang, Haussmann, Remes, John, Clarté, Luck, Kaski & Acerbi, 2023; NeurIPS).
  4. Other areas of research in deep learning applied to inference include methods for robust inference under model misspecification (Huang, Bharti, Souza, Acerbi & Kaski, 2023; NeurIPS) and approaches to elicit high-dimensional probability distributions using preferential normalizing flows (Mikkola, Acerbi & Klami, 2024; NeurIPS).
  5. Finally, we have also collaborated on several projects related to inference for deep learning (as opposed to deep learning for inference). This includes improving neural network robustness through variational inference with node-based Bayesian neural networks (Trinh, Heinonen, Acerbi & Kaski, 2022; ICML); increasing ensemble diversity by promoting variation in the space of input gradients, encouraging networks to learn different features and improving overall robustness (Trinh, Heinonen, Acerbi & Kaski, 2024; ICLR); and techniques for data augmentation via multiplicative perturbations (Trinh, Heinonen, Acerbi & Kaski, 2024; NeurIPS).

Sample-Efficient Probabilistic Machine Learning

We develop probabilistic machine learning methods for optimization and approximate Bayesian inference with complex scientific models, with applications largely in (but not limited to) computational and cognitive neuroscience. To work with real models and data, our algorithms are robust to noise (e.g., due to Monte Carlo approximations or simulations) and sample-efficient, in that they require relatively few function evaluations compared to traditional methods. Our algorithms are released as well-documented toolboxes (see our Resources page).

Highlights:

  1. Robust and sample-efficient Bayesian inference for models with and without a tractable likelihood, via Variational Bayesian Monte Carlo (VBMC). VBMC is an approach to Bayesian inference that obtains good approximations of the posterior and model evidence with a small number of likelihood evaluations (Acerbi, 2018; NeurIPS). The framework has been extended in various directions (Acerbi, 2019; PMLR), notably with added support for noisy log-likelihood evaluations, such as those estimated via simulation (Acerbi, 2020; NeurIPS). We have also applied surrogate modeling and active learning (as in VBMC) to the "embarrassingly parallel" setting (de Souza, Mesquita, Kaski & Acerbi, 2022; AISTATS), and we are working on expanding the scalability of the method (Li, Clarté & Acerbi, 2023; arXiv).
  2. Fast hybrid Bayesian optimization for model fitting via Bayesian Adaptive Direct Search (BADS). BADS is a fast optimization algorithm that combines a model-free approach (mesh-adaptive direct search) with a strong model-based algorithm (Bayesian optimization), achieving the best of both worlds: sample efficiency and robustness to noise (Acerbi & Ma, 2017; NeurIPS). BADS is currently used by dozens of computational labs across the world.
  3. Estimation of log-likelihoods for simulator-based models via inverse binomial sampling (IBS). IBS is an efficient statistical technique to estimate the log-likelihood when it cannot be computed directly, but simulated data can be generated from the model. Unlike other "likelihood-free" methods, IBS does not rely on summary statistics, but computes the log-likelihood of the full data set (van Opheusden*, Acerbi* & Ma, 2020; PLoS Computational Biology). Moreover, IBS combines very well with BADS and VBMC; a minimal sketch of the estimator appears after this list.
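
The core of IBS is easy to state: to estimate log p(x) for a single discrete observation x, draw from the simulator until the first sample matching x occurs at draw K; then -(1 + 1/2 + ... + 1/(K-1)) is an unbiased estimate of log p(x), and per-trial estimates are summed over the dataset. Below is a minimal sketch for one observation; the coin-flip simulator is an illustrative assumption, and the released toolbox adds refinements such as averaging over repeated runs to reduce variance.

```python
# Minimal sketch of the IBS estimator for a single discrete observation.
import random

def ibs_log_likelihood(simulator, observed, rng):
    """Unbiased estimate of log p(observed): sample until the first match
    occurs at draw K, then return -(1 + 1/2 + ... + 1/(K-1))."""
    k = 1
    while simulator(rng) != observed:
        k += 1
    return -sum(1.0 / j for j in range(1, k))

def simulate_coin(rng):
    return rng.random() < 0.3  # toy simulator: biased coin, p(heads) = 0.3

rng = random.Random(0)
estimates = [ibs_log_likelihood(simulate_coin, True, rng) for _ in range(20000)]
print(sum(estimates) / len(estimates))  # converges to log(0.3) ~ -1.204
```
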
Human Probabilistic Inference

We investigate whether and how people's perception and decision making follow principles of probabilistic inference (the Bayesian brain hypothesis). In a nutshell, a Bayesian observer builds beliefs about states of the world based on observations and assumptions (priors) about the statistical structure of the world. For example, a consequence of Bayesian behavior, empirically observed in many experiments, is that different pieces of sensory evidence are integrated according to their respective reliability. In our work, we "stress test" the Bayesian brain hypothesis until it breaks — to uncover details of approximate Bayesian inference in the brain. We also use the Bayesian observer framework as an x-ray machine that allows us to infer internal representations (e.g., prior beliefs) in the brain. We explore these questions with mathematical modelling and computational analysis of behavioral experiments.
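
As a concrete instance of reliability-weighted integration: for two conditionally independent Gaussian cues x_1 and x_2 with noise variances sigma_1^2 and sigma_2^2, the Bayes-optimal estimate (under a flat prior) is the precision-weighted average below. This is the textbook result; the notation is generic rather than taken from any specific paper.

```latex
% Standard reliability-weighted cue combination for two Gaussian cues.
\hat{s} = w_1 x_1 + w_2 x_2, \qquad
w_i = \frac{1/\sigma_i^2}{1/\sigma_1^2 + 1/\sigma_2^2}, \qquad
\sigma_{\mathrm{comb}}^2 = \left( \frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2} \right)^{-1}.
```

The combined variance is never larger than that of the more reliable cue alone, which is the empirical signature of near-optimal integration tested in many cue-combination experiments.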

Highlights:

  1. Approximate inference and deviations from Bayes-optimal behavior. We found deviations from Bayesian inference consistent with "noisy" representations of posterior distributions (Acerbi, Vijayakumar & Wolpert, 2014; PLoS Computational Biology). We also investigated potential deviations from probabilistic inference in multisensory perception, in the paradigm known as perceptual causal inference, with mixed results (Acerbi*, Dokka*, Angelaki & Ma, 2018; PLoS Computational Biology). Both studies are characterized by a thorough Bayesian factorial model comparison, necessary to compare multiple alternative hypotheses with subtle differences. In a purely theoretical study, we examined the flexibility of Bayesian models and our ability as researchers to uniquely identify different model components (Acerbi, Ma & Vijayakumar, 2014; NeurIPS).
  2. Internal representations of priors and probability. We studied how people update the probability of events that change over time (Norton, Acerbi, Ma & Landy, 2019; PLoS Computational Biology), and the shape of internal representations of distributions of temporal intervals (Acerbi, Wolpert & Vijayakumar, 2012; PLoS Computational Biology).
  3. We look at the role of uncertainty, a basic signature of Bayesian inference, in different perceptual domains, such as in elementary perceptual organization (Zhou*, Acerbi* & Ma, 2020; PLoS Computational Biology) and in visual working memory (Yoo, Acerbi & Ma, 2021; Journal of Vision).

Intelligence Under Resource Constraints

Time, memory, and computational power are typical resource constraints shared by artificial and biological systems, though they differ in quantity and quality due to the implementation details of brains and machines. We are interested in studying the effects of such constraints within established paradigms such as reinforcement learning. Our goal is both to develop novel algorithmic solutions and to gain insight into the functioning of the brain.

Highlight:

  1. While computers can easily store numbers with high precision, biological systems are limited in their capacity to process and store information. As a starting point of our broader research agenda, we explored how agents could dynamically allocate limited memory resources within a reinforcement learning paradigm (Patel, Acerbi & Pouget, 2020; NeurIPS); a toy illustration follows.
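
To give a flavor of the problem, here is an illustrative toy (not the model of Patel, Acerbi & Pouget, 2020): stored action values are recalled with noise whose magnitude depends on how much of a fixed memory budget is allocated to each state, and a simple heuristic spends more precision where recall noise is most likely to flip the greedy choice, that is, where the gap between the two best actions is small. All names and numbers below are assumptions.

```python
# Toy illustration of splitting a limited memory budget in RL; an
# assumption-laden sketch, not the algorithm of the paper above.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 3
Q = rng.normal(size=(n_states, n_actions))  # "true" learned action values

# Action gap per state: a small gap means noise is more likely to cause errors.
sorted_q = np.sort(Q, axis=1)
gap = np.maximum(sorted_q[:, -1] - sorted_q[:, -2], 1e-3)

# Allocate a fixed precision budget across states, inversely to the gap.
budget = 10.0
precision = budget * (1.0 / gap) / np.sum(1.0 / gap)
sigma = 1.0 / np.sqrt(precision)  # recall noise per state

def act(state: int) -> int:
    """Greedy action based on a noisy recall of the stored Q-values."""
    noisy_q = Q[state] + sigma[state] * rng.normal(size=n_actions)
    return int(np.argmax(noisy_q))

error_rate = np.mean([act(s) != np.argmax(Q[s])
                      for s in range(n_states) for _ in range(1000)])
print(f"greedy-choice error rate under noisy recall: {error_rate:.3f}")
```

A uniform allocation of the same budget typically yields a higher error rate in this toy, which is the kind of intuition that a normative treatment of memory allocation makes precise.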