DASFAA 2021 Tutorial

Abstract

A critical issue in big data management is handling data variety.

Data comes from disparate sources and may be presented in various models: structured, semi-structured, or unstructured. The increasing availability of multi-model data has triggered the development of Multi-Model DataBase (MMDB) systems. We have found 77 MMDBs among 334 DBMSs in the DB-Engines Ranking. These MMDBs typically integrate multiple data stores to accommodate data in the formats that fit the sources best, e.g., key/value pairs, relational tables, graphs, or XML/JSON documents. They also provide unified query languages, which allow users to retrieve data of different models in a single query, i.e., a cross-model query. This tutorial offers a comprehensive investigation of these query languages and a comparative study of their processing paradigms.
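To make the idea of a cross-model query concrete, the following is a minimal Python sketch, not the syntax or engine of any particular MMDB. It assumes a toy dataset in which customers are stored as relational rows, a "knows" relationship as a graph, and orders as JSON-like documents in a key/value store. All names and values (customers, knows, orders, friends_products) are hypothetical and chosen only for illustration.

```python
# Hypothetical toy data held in three different models (illustrative only).

# Relational model: a table of customers.
customers = [
    {"id": 1, "name": "Mary", "credit_limit": 5000},
    {"id": 2, "name": "John", "credit_limit": 3000},
    {"id": 3, "name": "Anne", "credit_limit": 2000},
]

# Graph model: adjacency list of "knows" edges between customer ids.
knows = {1: [2, 3], 2: [3], 3: []}

# Key/value model with document values: customer id -> order document.
orders = {
    1: {"order_no": "o1", "lines": [{"product": "toy", "price": 66}]},
    2: {"order_no": "o2", "lines": [{"product": "book", "price": 40},
                                    {"product": "pen", "price": 2}]},
    3: {"order_no": "o3", "lines": [{"product": "computer", "price": 34}]},
}


def friends_products(min_credit):
    """Return products ordered by friends of customers whose
    credit limit exceeds min_credit, combining all three models."""
    products = set()
    for customer in customers:                    # relational selection
        if customer["credit_limit"] > min_credit:
            for friend_id in knows[customer["id"]]:  # graph traversal
                document = orders.get(friend_id)     # key/value lookup
                if document is not None:
                    for line in document["lines"]:   # document navigation
                        products.add(line["product"])
    return sorted(products)


print(friends_products(4000))
```

In an MMDB, the same logic would be written as one declarative query in the system's unified language rather than as hand-written loops; the sketch only shows which models a single cross-model query may touch.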

Outline

The tutorial is divided into 6 parts:

Part I: Introduction to multi-model data management (5 minutes)

Part II: Database models and multi-model data query languages (10 minutes)

Part III: Cross-model query processing (25 minutes)

Part IV: Semantics and cross-model query optimization (15 minutes)

Part V: Open problems and challenges (5 minutes)

Part VI: Demonstration of cross-model query processing (30 minutes)

Presenters

Qingsong Guo is a Postdoctoral Researcher at the University of Helsinki. His research interests include multi-model data management and automatic management of big data with deep learning algorithms.

Jiaheng Lu is a Professor at the University of Helsinki. His main research interests lie in Big Data management and database systems. He has published more than one hundred journal and conference papers.

Chao Zhang is a senior Ph.D. candidate at the University of Helsinki. His research focuses on multi-model database benchmarking.