In recent years, the availability of abundant complex data, increases in parallel computing power, and new machine learning approaches such as deep learning have led to breakthroughs in many fields and have already begun to change biomedicine as a whole.

Our team focuses on developing new integrative and interpretable machine learning methods for multilayered data in biomedicine, and translating our results into clinical practice.

Collaborative science

Our team works closely with domain experts such as clinicians and geneticists. Many of our current projects concern blood and cancers: for instance, we are interested in learning the determinants of hematological phenotypes from high-throughput genotyping, sequencing and imaging data. We participate both in the international Pan-Cancer Analysis of Whole Genomes (PCAWG) project and in projects at the Academic Medical Center Helsinki campus through the Applied Tumor Genomics (ATG) research program. Two major research projects that provide synergy to our efforts are the iCAN Digital Precision Cancer Medicine Flagship and FinnGen. Both aim to integrate genetic and medical data to improve the treatment of diseases and our understanding of the phenomena underlying them.


Our collaboration with EMBL Heidelberg and DKFZ aims to model somatic mutagenesis in cancers with deep learning. Using data generated by international cancer sequencing projects (ICGC, TCGA), Applied Tumor Genomics (ATG) and individual research groups, we build integrative models of multilayered data to shed light on the causes and consequences of somatic mutations.

Together with the Finnish Red Cross Blood Service, we strive to understand the phenotypic variation in blood cells and its determinants, focusing on the variability of NK cell phenotypes. We combine high-throughput single-cell imaging data with genotyping and longitudinal register data on blood donors.

With the Hematological Genetics research group, we identify germline determinants of hematological malignancies by modeling high-throughput multiomics data. A particular focus is on DNA repair deficiencies that contribute to the mutational burden in malignancies such as acute leukemias.

The AdaGe consortium, funded by the Academy of Finland Molecular Regulatory Networks of Life (R'Life) program and composed of research groups at VTT and University of Helsinki, aims to delineate the interplay of genome structure and metabolic network in cellular adaptive evolution. Here our team will create machine learning models to analyze multiomics data from Saccharomyces cerevisiae strains engineered with CRISPR-Cas9.

In the iCAN project, we will explore integrative machine learning in a precision cancer medicine setting: novel methods and models will be deployed in the secure iCAN data lake environment, with results accessible to clinicians.

Funding and support