Summer internship projects 2019

1. Query optimization on multi-model databases

 

Background:

As more businesses realize that data, in all forms and sizes, is critical to making the best possible decisions, we see continued growth of systems that support massive volumes of non-relational or unstructured data. The research focus of this project is to develop new principles and algorithms for a novel unified database management system that manages both well-structured data and NoSQL data.

Qualification:

This project requires Java programming and experience using relational and NoSQL databases.

 

2. Job optimization and parameter tuning on the Spark and Hive big data platforms

 

Background:

A cloud-hosted application is expected to support millions of end users with terabytes of data. To accommodate such large-scale workloads, it is common to deploy thousands of servers in one data center. Meanwhile, existing big data platforms (e.g., Spark and Hive) employ naive job optimization algorithms that consider neither the heterogeneity of resources nor the differences between jobs. This motivates more advanced job optimization and parameter tuning in big data environments.

Qualification:

This project requires Java/Python/Scala programming and experience using the Spark and Hadoop platforms.