Publications

Publications from the Machine and Human Intelligence group.

Our research is regularly published in top-tier machine learning conferences, such as NeurIPS, AISTATS, ICML, and in renowned computational journals, such as PLoS Computational Biology. See below for a detailed list of our publications.

For an up-to-date list, see Luigi Acerbi’s Google Scholar page here.
Our papers are available in LLM-friendly Markdown format here.

Preprints

This list of preprints may be incomplete — for a full list, see Google Scholar.

Li C, Clarté G, Jørgensen M, Acerbi L
Fast post-process Bayesian inference with Variational Sparse Bayesian Quadrature
arXiv preprint
arXiv | Code (GitHub)
This work introduces Variational Sparse Bayesian Quadrature (VSBQ) as a technique to perform what we call "post-process" Bayesian inference for posterior and model inference. The idea is that you may have obtained a large number of likelihood or posterior evaluations from preliminary runs of other algorithms (e.g., optimization runs for the purpose of maximum-a-posteriori estimation). VSBQ builds an approximate Bayesian posterior reusing all these evaluations, regardless of their source, yielding often a good approximation.
Findling C, Hubert F, International Brain Laboratory, Acerbi L, Benson B, Benson J, Birman D, Bonacchi N, Carandini M, Catarino JA, Chapuis GA, Churchland AK, Dan Y, DeWitt EE, Engel TA, Fabbri M, Faulkner M, Fiete IR, Freitas-Silva L, Gercek B, Harris KD, Hausser M, Hofer SB, Hu F, Huntenburg JM, Khanal A, Krasniak C, Langdon C, Latham PE, Lau PY, Meijer GT, Miska NJ, Mrsic-Flogel TD, Noel J-P, Nylund K, Paninski L, Pan-Vazquez A, Pillow J, Rossant C, Roth N, Schaeffer R, Schartner M, Shi Y, Socha KZ, Steinmetz NA, Svoboda K, Tessereau C, Urai AE, Wells MJ, West SJ, Whiteway MR, Winter O, Witten IB, Zador T, Dayan P, Pouget A
Brain-wide representations of prior information in mouse decision-making
bioRxiv preprint
bioRxiv
This large collaboration from the International Brain Laboratory (IBL) investigates how the brain represents prior information about the world. IBL experimental labs recorded the neural activity of mice as they performed a visual task. Mice were able to estimate the probability of certain visual stimuli appearing based on prior experience, and this estimation improved their decision-making accuracy. Surprisingly, this subjective prior information was encoded in various brain regions, spanning from sensory areas to motor and high-level cortical regions, suggesting a widespread neural network involved in Bayesian inference. This study provides new evidence of how prior information is represented in the brain, highlighting the significance of large-scale recordings for a deeper understanding of neural processes.

2025

Li C, Huggins B, Mikkola P, Acerbi L
Normalizing Flow Regression for Bayesian Inference with Offline Likelihood Evaluations
In Proc. 7th Symposium on Advances in Approximate Bayesian Inference (AABI '25 proceedings track)
arXiv | Code (GitHub) | Webpage
Normalizing Flow Regression (NFR) is a method for Bayesian inference with expensive likelihoods that learns the posterior distribution directly by regressing on existing log-density evaluations, avoiding extra sampling steps. We introduce tailored training techniques and show that NFR is effective on synthetic and real-world problems, making it a promising approach when computations are prohibitive and past model evaluations can be reused.
Chang PE*, Loka S*, Huang D*, Remes U, Kaski S, Acerbi L
Amortized Probabilistic Conditioning for Optimization, Simulation and Inference
To appear in The 28th International Conference on Artificial Intelligence and Statistics (AISTATS 2025)
arXiv | Code (GitHub) | Webpage
We introduce the Amortized Conditioning Engine (ACE), a transformer-based meta-learning model that affords flexible conditioning on both observed data and interpretable latent variables, incorporating user-provided priors at runtime. ACE generates predictive distributions for discrete and continuous data and latents, demonstrating versatility across multiple tasks like image completion, Bayesian optimization, and simulation-based inference. Our approach unifies probabilistic conditioning and prediction into a single amortization framework, addressing challenges that previously required bespoke solutions.
Noel JP, Balzani E, Acerbi L, Benson J, The International Brain Laboratory, Savin C, Angelaki DE
A common computational and neural anomaly across mouse models of autism
To appear in Nature Neuroscience
Link | Code (GitHub)
This study shows that mouse models of autism with different genetic mutations share a common computational anomaly: inflexible updating of prior expectations during decision-making. This inflexibility is linked to altered cortical encoding, with a shift from sensory to frontal areas and disrupted neural responses to unexpected stimuli.

2024

Mikkola P, Acerbi L*, Klami A*
Preferential Normalizing Flows
In Proc. Advances in Neural Information Processing Systems 37 (NeurIPS '24), Vancouver, Canada. (* equal contribution)
arXiv
We introduce a method to elicit high-dimensional probability distributions from experts using preferential questions, leveraging normalizing flows. To address the challenges in flow estimation, we derive a novel functional prior for the flow in preferential settings. The method is demonstrated on simulated experts, including the extraction of prior beliefs from a large language model.
Huang D, Guo Y, Acerbi L, Kaski S
Amortized Bayesian Experimental Design for Decision-Making
In Proc. Advances in Neural Information Processing Systems 37 (NeurIPS '24), Vancouver, Canada.
arXiv
We propose an amortized decision-aware Bayesian experimental design (BED) framework that optimizes experimental designs for downstream decision-making. The novel Transformer Neural Decision Process (TNDP) architecture simultaneously suggests designs for experiments and infers final decisions, improving both design quality and decision accuracy in various tasks.
Trinh T, Heinonen M, Acerbi L, Kaski S
Improving robustness to corruptions with multiplicative weight perturbations
In Proc. Advances in Neural Information Processing Systems 37 (NeurIPS '24), Vancouver, Canada as a spotlight.
arXiv
We introduce Data Augmentation via Multiplicative Perturbation (DAMP), a method that improves the robustness of deep neural networks to data corruptions without sacrificing accuracy on clean data. DAMP simulates input perturbations through random multiplicative weight perturbations during training, outperforming traditional data augmentation techniques across different datasets, architectures, and tasks, from image classification to large language model finetuning.
Schmitt M, Li C, Vehtari A, Acerbi L, Bürkner PC, Radev ST
Amortized Bayesian Workflow (Extended Abstract)
NeurIPS 2024 Workshop on Bayesian Decision-making and Uncertainty.
arXiv
We propose an adaptive Bayesian workflow that combines the speed of amortized inference with the accuracy of MCMC, dynamically switching from faster to more expensive methods based on tailored diagnostics to ensure high-quality posterior samples. By re-using computations from earlier steps, our workflow achieves efficient and accurate inference across many datasets while significantly reducing computational costs.
Huang D, Guo Y, Acerbi L, Kaski S
Amortized Decision-Aware Bayesian Experimental Design
NeurIPS 2024 Workshop on Bayesian Decision-making and Uncertainty.
Link
Workshop version of Amortized Bayesian Experimental Design for Decision-Making.
Singh GS, Acerbi L
PyBADS: Fast and robust black-box optimization in Python
Journal of Open Source Software, 9(94), 5694. DOI: https://doi.org/10.21105/joss.05694
Link | Code (GitHub) | Website
PyBADS is a well-documented, open-source Python implementation of our popular Bayesian Adaptive Direct Search (BADS) for black-box optimization with applications in computational model fitting.
Trinh T, Heinonen M, Acerbi L, Kaski S
Input gradient diversity for neural network ensembles
In12th International Conference on Learning Representations (ICLR 2024), spotlight (top 5% papers).
arXiv
Deep ensembles are typically created by training different neural networks from different weight initializations. However, this approach often leads to redundancy and inefficiencies. Instead, we introduce a new ensemble learning method that promotes diversity in the space of input gradients, which uniquely characterize a function and are smaller in dimension than the weights. Essentially, this encourages each network in the ensemble to learn different features, improving robustness.

2023

Huang D, Haussmann M, Remes U, John ST, Clarté G, Luck KS, Kaski S, Acerbi L
Practical Equivariances via Relational Conditional Neural Processes
In Proc. Advances in Neural Information Processing Systems 36 (NeurIPS '23), New Orleans, USA.
arXiv | Code (GitHub)
We propose an extension to Conditional Neural Processes to effectively handle equivariances, such as translational equivariance ("stationarity") and equivariance to rigid transformations ("isotropy"). By using relative rather than absolute information, our method offers a practical solution that increases efficiency and performance when the task naturally includes such equivariances, while also scaling easily to higher dimensional inputs (D > 2), unlike existing approaches. This work is a joint collaboration of the Amortized inference FCAI team.
Huang D, Bharti A, Souza A, Acerbi L, Kaski S
Learning Robust Statistics for Simulation-based Inference under Model Misspecification
In Proc. Advances in Neural Information Processing Systems 36 (NeurIPS '23), New Orleans, USA.
arXiv | Code (GitHub)
This paper presents a novel approach to handling model misspecification in simulation-based inference. The issue is that the data generating process usually doesn't perfectly match the selected model, resulting in a "misspecified" model that could yield inaccurate or misleading results. We propose a general solution that involves penalizing statistics that increase the mismatch between the data and the model, thereby ensuring more robust inference in misspecified scenarios.
Aushev A, Putkonen A, Clarté G, Chandramouli S, Acerbi L, Kaski S, Howes A
Online simulator-based experimental design for cognitive model selection
Computational Brain & Behavior. DOI: https://doi.org/10.1007/s42113-023-00180-7
Link | arXiv | Code (GitHub)
An attempt towards building methods for online optimal experimental design that performs both parameter and model inference, with applications to cognitive science.
Huggins B*, Li C*, Tobaben M*, Aarnos MJ, Acerbi L
PyVBMC: Efficient Bayesian inference in Python
Journal of Open Source Software, 8(86), 5428. DOI: https://doi.org/10.21105/joss.05428 (* equal contribution)
Link | Code (GitHub) | Website
PyVBMC is a well-documented, open-source Python implementation of the Variational Bayesian Monte Carlo (VBMC) algorithm for posterior and model inference for black-box computational models.

2022

Trinh T, Heinonen M, Acerbi L, Kaski S (2022)
Tackling covariate shift with node-based Bayesian neural networks
In 39th International Conference on Machine Learning (ICML 2022) as long presentation (oral; top 2% papers).
Link | arXiv | Code (GitHub) | Website
Node-based Bayesian neural networks (node-BNNs) have demonstrated good generalization under input corruptions. In this work, we provide insight into the robustness of node-BNNs by connecting input corruptions, data augmentation and posterior entropy. Building on top of this intuition, we propose a simple entropy-based method to further improve the robustness of node-BNNs.
de Souza DA, Mesquita D, Kaski S, Acerbi L (2022)
Parallel MCMC Without Embarrassing Failures
In Proceedings of The 25th International Conference on Artificial Intelligence and Statistics (AISTATS 2022), PMLR 151:1786-1804.
Link | arXiv | Code (GitHub)
In this work, we elucidate the hidden failure modes of popular "embarrassingly parallel" MCMC methods that practitioners should be aware of. We propose active learning as a new solution to address these issues, via our Parallel Active Inference (PAI) algorithm. Finally, we demonstrate that PAI addresses these issues with several proof-of-concept examples and a more realistic scenario from computational neuroscience.

2021

Yoo AH, Acerbi L, Ma WJ (2021)
Uncertainty is Maintained and Used in Working Memory
Journal of Vision 21(8):13. DOI: https://doi.org/10.1167/jov.21.8.13.
Link | Code (GitHub)
What are the contents of working memory? In both behavioral and neural computational models, the working memory representation of a stimulus is typically described by a single number, namely a point estimate of that stimulus. Here, we asked if people also maintain the uncertainty associated with a memory, and if people use this uncertainty in subsequent decisions.

2020

van Opheusden B*, Acerbi L*, Ma WJ (2020)
Unbiased and Efficient Log-Likelihood Estimation with Inverse Binomial Sampling
PLoS Computational Biology 16(12): e1008483. DOI: 10.1371/journal.pcbi.1008483 (* equal contribution)
Link | arXiv | Code (GitHub) | Tweeprint
Many models (not limited to psychology and neuroscience) do not have a log-likelihood in closed form, but we can easily sample observations from the model (i.e., via simulation). Inverse binomial sampling (IBS) is a technique to estimate log-likelihood via sampling in an efficient and unbiased way that, unlike similar methods, does not use summary statistics but the full data. IBS enables likelihood-based inference for models without accessible likelihoods!
Acerbi L (2020)
Variational Bayesian Monte Carlo with Noisy Likelihoods
In Proc. Advances in Neural Information Processing Systems 33 (NeurIPS '20), Montréal, Canada.
Link | arXiv | Code (GitHub) | Tweeprint
We extend Variational Bayesian Monte Carlo (VBMC) to perform sample-efficient Bayesian posterior and model inference also with noisy likelihood evaluations, such as those obtained via simulation (i.e., sampling). We tested VBMC with many models and real data from computational and cognitive neuroscience, up to D = 9 parameters. The new versions of VBMC vastly outperform previous methods (including older versions of VBMC), and inference is still quite fast thanks to the combo of variational inference + Bayesian quadrature.
Patel N, Acerbi L, Pouget A (2020)
Dynamic allocation of limited memory resources in reinforcement learning
In Proc. Advances in Neural Information Processing Systems 33 (NeurIPS '20), Montréal, Canada.
Link | arXiv | Code (GitHub)
In this work we propose a dynamical framework to maximize expected reward under constraints of limited memory, such as those experienced by biological brains. We derive from first principles an algorithm, Dynamic Resource Allocator, which we apply to standard tasks in reinforcement learning and a model-based planning task, and find that it allocates more resources to items in memory that have a higher impact on cumulative rewards. This work provides a normative solution to the problem of online learning of how to allocate costly resources to a collection of uncertain memories.
Zhou Y*, Acerbi L*, Ma WJ (2020)
The Role of Sensory Uncertainty in Simple Contour Integration
PLoS Computational Biology 16(11): e1006308. DOI: 10.1371/journal.pcbi.1006308 (*equal contribution)
Link | Data and code (GitHub)
Our percept of the world is governed not only by the sensory information we have access to, but also by the way we interpret this information as part of a whole ("perceptual organization"). This study examines whether and how people incorporate uncertainty into perceptual organization, by varying sensory uncertainty from trial to trial in a contour integration task, an elementary form of perceptual organization. We found that people indeed take into account sensory uncertainty, however in a way that subtly deviates from optimal behavior.

2019

Norton EH, Acerbi L, Ma WJ, Landy MS (2019)
Human online adaptation to changes in prior probability
PLoS Computational Biology 15(7): e1006681. DOI: 10.1371/journal.pcbi.1006681
Link | Data and code (GitHub)

How do people learn and adapt to changes in the probability of events? We addressed this question with two psychophysical tasks that involved categorization of visual stimuli, where the probability of the categories jumped over the course of the experiment. Using Bayesian model comparison and a handful of different observer models, we found that human data are explained best by a model that estimates category probability based on recently observed exemplars, with a bias towards equal probability. Interestingly, one of the tasks is virtually the same as the mouse decision-making task used by the International Brain Laboratory.
Acerbi L (2019)
An Exploration of Acquisition and Mean Functions in Variational Bayesian Monte Carlo
In Proc. Machine Learning Research 96: 1-10. 1st Symposium on Advances in Approximate Bayesian Inference, Montréal, Canada
Link
In this paper, we tried to find better acquisition functions or mean functions for VBMC, but we did not manage to find anything that would work significantly better than what is in the 2018 VBMC paper. In a positive light, this work shows that our original choices for VBMC were pretty good.

2018

Acerbi L (2018)
Variational Bayesian Monte Carlo
In Proc. Advances in Neural Information Processing Systems 31 (NeurIPS '18), Montréal, Canada
Link | arXiv | Code (GitHub) | Tweeprint
This paper introduced VBMC, a novel machine learning method to perform Bayesian posterior and model inference when the model likelihood is moderately expensive to evaluate. Take Bayesian optimization, but instead of computing only a point estimate, VBMC returns a full approximate posterior and a lower bound on the log model evidence (the ELBO), useful for model comparison. VBMC combines variational inference and active-sampling Bayesian quadrature, and vastly improves over previous seminal Bayesian quadrature methods. We also released VBMC as an user-friendly MATLAB toolbox.
Acerbi L*, Dokka K*, Angelaki DE, Ma WJ (2018)
Bayesian comparison of explicit and implicit causal inference strategies in multisensory heading perception
PLoS Computational Biology 14(7): e1006110. DOI: 10.1371/journal.pcbi.1006110 (*equal contribution)
Link | Data and code (GitHub)
How do people combine information from vision and from their vestibular sense to know in which direction they are moving? We originally thought that psychophysical data collected by our collaborators at Baylor College of Medicine would be able to strongly distinguish competing models of multisensory perception. However, behavioral models have many plausible tweaks (e.g., observer assumptions, heteroskedasticity, decision noise), and when we allowed for those, the models became less distinguishable. Thus, the story became more methodological: how to comprehensively compare models of multisensory perception in a robust, principled way.

2017

Acerbi L, Ma WJ (2017)
Practical Bayesian Optimization for Model Fitting with Bayesian Adaptive Direct Search
In Proc. Advances in Neural Information Processing Systems 30 (NeurIPS '17), Long Beach, USA
Link | arXiv (article + supplement) | Code (GitHub)
This paper introduced BADS, a novel optimization method to peform fast and robust hybrid Bayesian optimization. BADS works with both noiseless and noisy objective functions, and it outperforms many optimizers on real model fitting problems. BADS is currently used in dozens of computational labs across the world and is available as an user-friendly MATLAB toolbox.
Acerbi L, Vijayakumar S, Wolpert DM (2017)
Target Uncertainty Mediates Sensorimotor Error Correction
PLoS ONE 12(1): e0170466
Link | Data
The ability to correct for errors that arise from unreliable perceptions and motor commands is essential to human dexterity. In this paper, we examined how participants correct for movement errors in a naturalistic task. Even though participants had ample time to compensate for experimentally-induced perturbations, their amount of correction was affected by uncertainty about the target location. In fact, our analyses suggest that participants were optimally lazy, limiting their effort to just as much as needed so as not to significantly affect their overall performance in the task, consistent with theories of stochastic optimal control.

2014

Acerbi L, Ma WJ, Vijayakumar S (2014)
A Framework for Testing Identifiability of Bayesian Models of Perception
In Proc. Advances in Neural Information Processing Systems (NeurIPS '14), Montreal, Canada
Link | Paper (PDF) | Appendix (PDF)
Bayesian observer models are very effective at describing human performance in perceptual tasks, so much so that they are trusted to faithfully recover hidden mental representations from the data. However, the intrinsic degeneracy of the Bayesian framework, as multiple combinations of elements can yield empirically indistinguishable results, prompts the question of model identifiability. In this work, we proposed a novel framework for a systematic testing of the identifiability of a significant class of Bayesian observer models, with practical applications for improving experimental design.
Acerbi L, Vijayakumar S, Wolpert DM (2014)
On the Origins of Suboptimality in Human Probabilistic Inference
PLoS Computational Biology 10(6): e1003661
Link | Data
The process of decision making involves combining sensory information with statistics collected from prior experience. In this study, we used a novel experimental setup to examine the role of complexity of prior experience on suboptimal decision making. Participants' performance in our task, which did not require subjects to remember past events, was mostly unaffected by the complexity of the prior distributions, suggesting that remembering the patterns of past events constitutes more of a challenge to decision making than manipulating the complex probabilistic information. We introduced a mathematical description that captures the pattern of human responses in our task better than previous accounts.

Before 2014

Acerbi L, Dennunzio A, Formenti E (2013)
Surjective multidimensional cellular automata are non-wandering: A combinatorial proof
Information Processing Letters 113(5-6): 156-159
Link
Acerbi L, Wolpert DM, Vijayakumar S (2012)
Internal Representations of Temporal Statistics and Feedback Calibrate Motor-Sensory Interval Timing
PLoS Computational Biology 8(11): e1002771
Link | Data
Acerbi L, Dennunzio A, Formenti E (2009)
Conservation of Some Dynamical Properties of Operations on Cellular Automata
Theoretical Computer Science, 410(38-40): 3685-3693
Link
Acerbi L, Dennunzio A, Formenti E (2007)
Shifting and Lifting of Cellular Automata
In Proc. Third Conference on Computability in Europe, CiE 2007: 1-10; June 2007
Link

Thesis

Acerbi L (2015)
Complex internal representations in sensorimotor decision making: a Bayesian investigation
Doctoral dissertation, The University of Edinburgh. Advisors: Prof. Sethu Vijayakumar, Prof. Daniel M. Wolpert
Link | PDF

Disclaimer: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.