Antti Klemetti defends his PhD thesis on Practical Approaches to Cost-Efficient Deep Learning

On Friday the 13th of February 2026, M.Sc. Antti Klemetti defends his PhD thesis on Practical Approaches to Cost-Efficient Deep Learning: Taxonomy, Experiments, and Industry Insights. The thesis is based on research conducted in the Department of Computer Science and in the Empirical Software Engineering group.

M.Sc. Antti Klemetti defends his PhD thesis "Practical Approaches to Cost-Efficient Deep Learning: Taxonomy, Experiments, and Industry Insights" on Friday the 13th of February 2026 at 13:00 in the University of Helsinki Athena building, Auditorium 107 (Siltavuorenpenger 3 A, 1st floor). His opponent is Associate Professor Alexander Jung (Aalto University) and the custos is Professor Jukka K. Nurminen (University of Helsinki). The defence will be held in English.

The thesis of Antti Klemetti is part of the research conducted in the Department of Computer Science and in the Empirical Software Engineering group at the University of Helsinki. His supervisors have been Professor Jukka K. Nurminen and University Researcher Mikko Raatikainen (University of Helsinki) as well as Professor Tommi Mikkonen (University of Jyväskylä).

Practical Approaches to Cost-Efficient Deep Learning: Taxonomy, Experiments, and Industry Insights

Artificial intelligence (AI) has become a strategic priority across industries and governments. The recent surge is driven by large language models (LLMs) built with deep neural networks (DNNs). DNN execution is among the most resource-intensive workloads in cloud data centers: large models contain billions of trainable parameters, rely on massively parallel matrix operations, and typically require specialized accelerators, such as graphics processing units (GPUs). Accelerator-equipped virtual machines (VMs) can cost an order of magnitude more than CPU-only VMs, and operational costs are growing accordingly.

This dissertation investigates approaches to make deep learning (DL) more cost-efficient. Our aims are to (i) catalog existing methods for researchers and practitioners and (ii) evaluate selected approaches in practical settings.

We designed and integrated AI cost–focused questions into the 2025 Finnish Software Industry Survey to gauge industry perceptions and expectations. We conducted a systematic literature review (SLR) to identify techniques for reducing DL cost and organized them into a practical taxonomy.

From the SLR, two complementary approaches emerge for lowering DL cost: reducing model size to make inference more efficient, and running training workloads on cost-efficient hardware. Accordingly, we (i) conducted an experimental study on structured pruning of tabular-data DNNs, comparing two ways to shrink models: removing neurons and retraining from scratch versus post-training pruning followed by fine-tuning, and (ii) performed an industrial case study using cost-efficient revocable cloud spot instances, orchestrated with Ray, for fault-tolerant distributed training.
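To illustrate the structured-pruning idea described above, the pure-Python sketch below removes hidden neurons with the smallest incoming-weight L2 norm from a toy two-layer network, shrinking both weight matrices. The layer sizes, the norm-based selection criterion, and all function names are illustrative assumptions, not code from the thesis.

```python
import math
import random

random.seed(0)

def init_layer(n_in, n_out):
    # weight[i][j] connects input i to hidden neuron j
    return [[random.uniform(-1, 1) for _ in range(n_out)] for _ in range(n_in)]

def neuron_norms(w):
    # L2 norm of each hidden neuron's incoming weights (one column of w)
    n_out = len(w[0])
    return [math.sqrt(sum(w[i][j] ** 2 for i in range(len(w)))) for j in range(n_out)]

def prune_neurons(w1, w2, keep):
    """Structured pruning: drop hidden neurons with the smallest
    incoming-weight norm by removing the matching columns of w1
    and rows of w2, so the pruned network stays dense."""
    norms = neuron_norms(w1)
    keep_idx = sorted(range(len(norms)), key=lambda j: -norms[j])[:keep]
    keep_idx.sort()
    w1_p = [[row[j] for j in keep_idx] for row in w1]
    w2_p = [w2[j] for j in keep_idx]
    return w1_p, w2_p

# toy 2-layer MLP: 8 inputs -> 16 hidden -> 4 outputs, pruned to 8 hidden
w1 = init_layer(8, 16)
w2 = init_layer(16, 4)
w1_p, w2_p = prune_neurons(w1, w2, keep=8)
print(len(w1_p[0]), len(w2_p))  # 8 8: half the hidden neurons remain
```

Unlike unstructured (per-weight) pruning, removing whole neurons yields smaller dense matrices, so inference speeds up on standard hardware without sparse-matrix support.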

The survey responses indicate that AI's financial costs are already a material concern for Finnish software companies that develop or fine-tune AI models, and that AI's share of cloud or hardware spending is expected to rise significantly over the next three years. The SLR identifies multiple software-level methods (e.g., pruning, quantization, teacher–student) and system-level options (e.g., edge offloading, hardware acceleration), each with trade-offs.
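As a concrete example of one software-level method mentioned above, uniform symmetric int8 quantization replaces each 32-bit float weight with one byte plus a shared scale factor, roughly quartering memory use at the cost of a small rounding error. This sketch is a generic illustration of the technique, not code from the thesis.

```python
def quantize_int8(weights):
    """Uniform symmetric quantization: map floats in [-max, max]
    to integers in [-127, 127] via a single shared scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Reconstruct approximate float weights from the int8 codes
    return [v * scale for v in q]

w = [0.52, -1.27, 0.03, 0.89]
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
# Per-weight reconstruction error is bounded by half a quantization step
print(max(abs(a - b) for a, b in zip(w, w_hat)) <= s / 2)
```

The trade-off named in the text shows up directly: storage and bandwidth shrink, while the quantization step bounds the accuracy loss.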

In our experiments, compact architectures created by removing neurons and retraining from scratch matched or exceeded baseline accuracy while reducing inference latency, and they generally outperformed pruning followed by fine-tuning. However, the achievable cost savings are likely modest. In the industrial study, careful choice of data center and GPU type, together with fault-tolerant orchestration, cut training costs by nearly an order of magnitude, though at the expense of occasionally long delays in starting training jobs.
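The fault-tolerance ingredient behind the spot-instance result can be illustrated with a small simulation: on revocable capacity, a long training job only completes reliably if progress is checkpointed periodically and resumed after each preemption. The step counts, preemption probability, and function name below are invented for illustration; the thesis's actual orchestration uses Ray.

```python
import random

random.seed(42)

def train_with_checkpoints(total_steps, checkpoint_every, preempt_prob):
    """Simulate fault-tolerant training on revocable spot instances:
    checkpoint every `checkpoint_every` steps; after a preemption,
    resume from the last checkpoint instead of restarting from zero."""
    checkpoint = 0   # last persisted step
    preemptions = 0
    while checkpoint < total_steps:
        step = checkpoint            # new instance resumes from checkpoint
        while step < total_steps:
            if random.random() < preempt_prob:
                preemptions += 1     # instance revoked; uncheckpointed work lost
                break
            step += 1
            if step % checkpoint_every == 0:
                checkpoint = step    # persist progress
        else:
            checkpoint = step        # finished without preemption
    return checkpoint, preemptions

done, n_preempt = train_with_checkpoints(1000, 50, 0.01)
print(done)  # 1000: training completes despite interruptions
```

The checkpoint interval is the key knob: shorter intervals waste less work per revocation but add checkpointing overhead, which mirrors the delay-versus-cost trade-off reported in the study.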

The cost of DL is a practical industrial concern. There is no single "silver bullet". Instead, effective cost control combines model-level simplification, utilization improvements, and infrastructure choices (e.g., spot instances) aligned with tolerance for delay and interruptions. These findings motivate treating cost as a first-class objective alongside accuracy and latency and point to further work on training-time cost reduction and modern DNN architectures (e.g., Transformers) where compute and memory demands are dominant.

Availability of the dissertation

An electronic version of the doctoral dissertation will be available in the University of Helsinki open repository Helda at .

Printed copies will be available on request from Antti Klemetti: .