Publications

This page lists publications by members of the Exploratory Data Analysis Group since 2018.

More detailed lists, including earlier and most recent works, can be found via the People page.

2024

SLIPMAP: Fast and Robust Manifold Visualisation for Explainable AI.
Anton Björklund, Lauri Seppäläinen, Kai Puolamäki.
To appear in: Advances in Intelligent Data Analysis XXII. IDA 2024. Lecture Notes in Computer Science.

Using SLISEMAP to interpret physical data.
Lauri Seppäläinen, Anton Björklund, Vitus Besel, Kai Puolamäki.
PLOS ONE 19 (1), e0297714
[https://doi.org/10.1371/journal.pone.0297714][code]

2023

A novel probabilistic source apportionment approach: Bayesian Auto-correlated Matrix Factorization.
Anton Rusanen, Anton Björklund, Manousos Manousakas, Jianhui Jiang, Markku T Kulmala, Kai Puolamäki, Kaspar R Daellenbach.
Atmospheric Measurement Techniques Discussions 2023, 1-28
[https://doi.org/10.5194/amt-2023-70]

χiplot: web-first visualisation platform for multidimensional data.
Akihiro Tanaka, Juniper Tyree, Anton Björklund, Jarmo Mäkelä, Kai Puolamäki.
Machine Learning and Knowledge Discovery in Databases: Applied Data Science and Demo Track, Lecture Notes in Computer Science, pp. 335-339.
[https://doi.org/10.1007/978-3-031-43430-3_26][code][web application]

Explaining any black box model using real data.
Anton Björklund, Andreas Henelius, Emilia Oikarinen, Kimmo Kallonen, Kai Puolamäki.
Frontiers in Computer Science 5:1143904.
[https://doi.org/10.3389/fcomp.2023.1143904][code]

Model selection with bootstrap validation.
Rafael Savvides, Jarmo Mäkelä, Kai Puolamäki.
Statistical Analysis and Data Mining. 16, 2, p. 162-186 25 p.
[https://doi.org/10.1002/sam.11606][code]

SLISEMAP: supervised dimensionality reduction through local explanations.
Anton Björklund, Jarmo Mäkelä, Kai Puolamäki.
Machine Learning 112, 1–43.
[https://doi.org/10.1007/s10994-022-06261-1][code][demo paper]

2022

Visual Data Exploration as a Statistical Testing Procedure: Within-view and Between-view Multiple Comparisons.
Rafael Savvides, Andreas Henelius, Emilia Oikarinen, Kai Puolamäki.
IEEE Transactions on Visualization and Computer Graphics.
[https://doi.org/10.1109/TVCG.2022.3175532][code]

Finding statistically significant high accident counts in exploration of occupational accident data.
Tuula Räsänen, Arto Reiman, Kai Puolamäki, Rafael Savvides, Emilia Oikarinen, Eero Lantto.
Journal of Safety Research.
[https://doi.org/10.1016/j.jsr.2022.04.003]

Incorporating expert domain knowledge into causal structure discovery workflows.
Jarmo Mäkelä, Laila Melkas, Ivan Mammarella, Tuomo Nieminen, Suyog Chandramouli, Rafael Savvides, Kai Puolamäki.
Biogeosciences 19 (8), 2095-2099
[https://doi.org/10.5194/bg-19-2095-2022]

Robust regression via error tolerance.
Anton Björklund, Andreas Henelius, Emilia Oikarinen, Kimmo Kallonen, Kai Puolamäki.
Data Mining and Knowledge Discovery 36 (2), 781-810
[https://doi.org/10.1007/s10618-022-00819-2][code]

2021

Machine-learning models to replicate large-eddy simulations of air pollutant concentrations along boulevard-type streets.
Moritz Lange, Henri Suominen, Mona Kurppa, Leena Järvi, Emilia Oikarinen, Rafael Savvides, and Kai Puolamäki.
Geoscientific Model Development 14 (12), 7411-7424
[https://doi.org/10.5194/gmd-14-7411-2021]

Interactive Causal Structure Discovery in Earth System Sciences.
Laila Melkas, Rafael Savvides, Suyog H Chandramouli, Jarmo Mäkelä, Tuomo Nieminen, Ivan Mammarella, Kai Puolamäki.
The KDD'21 Workshop on Causal Discovery, 3-25
[https://proceedings.mlr.press/v150/melkas21a.html][code]

Guided Visual Exploration of Relations in Data Sets.
Kai Puolamäki, Emilia Oikarinen, and Andreas Henelius.
Journal of Machine Learning Research. 22, 96, p. 1-32
[https://jmlr.org/papers/v22/19-364.html][code]

Low-Cost Outdoor Air Quality Monitoring and Sensor Calibration: A Survey and Critical Analysis.
Francesco Concas, Julien Mineraud, Eemil Lagerspetz, Samu Varjonen, Xiaoli Liu, Kai Puolamäki, Petteri Nurmi, and Sasu Tarkoma.
ACM Transactions on Sensor Networks. 17, 2, p. 1-44 20.
[doi:10.1145/3446005]

Detecting virtual concept drift of regressors without ground truth values.
Emilia Oikarinen, Henri Elias Tiittanen, Andreas Henelius, and Kai Puolamäki.
Data Mining and Knowledge Discovery. 35, p. 726–747 22 p.
[doi:10.1007/s10618-021-00739-7]

2020

A Constrained Randomization Approach to Interactive Visual Data Exploration with Subjective Feedback.
Bo Kang, Kai Puolamäki, Jefrey Lijffijt, and Tijl De Bie.
IEEE Transactions on Knowledge and Data Engineering. 32, 9, p. 1666-1679 14 p.
[doi:10.1109/TKDE.2019.2907082]

Machine learning models to replicate large-eddy simulations of air pollutant concentrations along boulevard-type streets.
Moritz Lange, Henri Suominen, Mona Kurppa, Leena Järvi, Emilia Oikarinen, Rafael Savvides, and Kai Puolamäki.
Geoscientific Model Development.
[preprint doi:10.5194/gmd-2020-200]

Interactive visual data exploration with subjective feedback: an information-theoretic approach.
Kai Puolamäki, Emilia Oikarinen, Bo Kang, Jefrey Lijffijt, and Tijl De Bie.
Data Mining and Knowledge Discovery. 34, 1, p. 21–49 29 p.
[doi:10.1007/s10618-019-00655-x]

2019

Sparse Robust Regression for Explaining Classifiers.
Anton Björklund, Andreas Henelius, Emilia Oikarinen, Kimmo Kallonen, and Kai Puolamäki.
Proceedings of the 22nd International Conference on Discovery Science (DS 2019 Best Student Paper Award).
[doi:10.1007/978-3-030-33778-0_27] [pdf] [slides] [code]

Significance of Patterns in Data Visualisations.
Rafael Savvides, Andreas Henelius, Emilia Oikarinen, and Kai Puolamäki.
Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD '19).
[doi:10.1145/3292500.3330994] [video] [code]

Estimating regression errors without ground truth values.
Henri Tiittanen, Emilia Oikarinen, Andreas Henelius, and Kai Puolamäki.
[preprint arXiv:1910.04069] [code]

A Constrained Randomization Approach to Interactive Visual Data Exploration with Subjective Feedback.
Bo Kang, Kai Puolamäki, Jefrey Lijffijt, and Tijl De Bie.
IEEE Transactions on Knowledge and Data Engineering.
[doi:10.1109/TKDE.2019.2907082]

Experimental study of cognitive limitations in a data-based judgement task.
Virpi Kalakoski, Andreas Henelius, Emilia Oikarinen, Antti Ukkonen, and Kai Puolamäki.
Behaviour & Information Technology.
[doi:10.1080/0144929X.2019.1657181]

Cognitive ergonomics for data analysis.
Virpi Kalakoski, Andreas Henelius, Emilia Oikarinen, Antti Ukkonen and Kai Puolamäki.
Proceedings of the 31st European Conference on Cognitive Ergonomics (ECCE 2019).
[doi:10.1145/3335082.3335112]

Randomization algorithms for large sparse networks.
Kai Puolamäki, Andreas Henelius, Antti Ukkonen.
Physical Review E 99(5): 053311, 2019.
[doi:10.1103/PhysRevE.99.053311] [code]

Supervised Human-guided Data Exploration.
Emilia Oikarinen, Kai Puolamäki, Samaneh Khoshrou, and Mykola Pechenizkiy.
Proceedings of the ECML PKDD Workshop on Automating Data Science, 2019.
[pdf] [code]

Interactive Visual Data Exploration with Subjective Feedback: An Information-Theoretic Approach.
Kai Puolamäki, Emilia Oikarinen, Bo Kang, Jefrey Lijffijt, and Tijl De Bie.
Data Mining and Knowledge Discovery.
[doi:10.1007/s10618-019-00655-x]

Guided Visual Exploration of Relations in Data Sets.
Kai Puolamäki, Emilia Oikarinen, and Andreas Henelius.
[preprint arXiv:1905.02515] [code]

Effects of live music therapy on heart rate variability and self-reported stress and anxiety among hospitalized pregnant women: A randomized controlled trial.
Pia Teckenberg-Jansson, Siiri Turunen, Tarja Pölkki, Minna-Johanna Lauri-Haikala, Jari Lipsanen, Andreas Henelius, and Minna Huotilainen.
Nordic Journal of Music Therapy, 28.
[doi:10.1080/08098131.2018.1546223]

2018

Tiler: Software for Human-Guided Data Exploration.
Andreas Henelius, Emilia Oikarinen, and Kai Puolamäki.
Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases.
[doi:10.1007/978-3-030-10997-4_49] [pdf] [video] [code]

Interactive Visual Data Exploration with Subjective Feedback: An Information-Theoretic Approach.
Kai Puolamäki, Emilia Oikarinen, Bo Kang, Jefrey Lijffijt, and Tijl De Bie.
Proceedings of the 34th IEEE International Conference on Data Engineering (ICDE 2018), pages 1208-1211.
[doi:10.1109/ICDE.2018.00112] [extended version arXiv:1710.08167] [code]

Subjectively Interesting Subgroup Discovery on Real-valued Targets.
Jefrey Lijffijt, Bo Kang, Wouter Duivesteijn, Kai Puolamäki, Emilia Oikarinen, and Tijl De Bie.
Proceedings of the 34th IEEE International Conference on Data Engineering (ICDE 2018), pages 1352-1355.
[doi:10.1109/ICDE.2018.00148] [extended version arXiv:1710.04521]

MIDAS: Open-source framework for distributed online analysis of data streams.
Andreas Henelius and Jari Torniainen.
SoftwareX 7 (2018): 156-161.
[doi:10.1016/j.softx.2018.04.004]

Biosignals reflect pair-dynamics in collaborative work: EDA and ECG study of pair-programming in a classroom environment.
Lauri Ahonen, Benjamin Ultan Cowley, Arto Hellas, and Kai Puolamäki.
Scientific Reports 8(1): 3138, 2018.
[doi:10.1038/s41598-018-21518-3]