Internship Title
Yahoo! Cloud Serving Benchmark Extended (YCSB+e)
Areas of Specialization
Software Engineering
Internship Location
DEI-FCTUC
Background
Big data has emerged as both a necessity and a problem in modern computing, and NoSQL databases have become a prevalent solution for the growing amount of data that companies have to deal with. Enterprise solutions such as MongoDB, Cassandra and Couchbase target different use cases and purposes and therefore differ in how well they serve particular software quality attributes (e.g. write performance, security, durability). The Yahoo! Cloud Serving Benchmark (YCSB) was proposed in 2010 as a benchmarking solution for NoSQL systems and remains the most widely used benchmark of its kind. However, YCSB focuses mostly on performance and scalability, and some of its shortcomings have recently drawn criticism. There is thus a need for an improved benchmarking solution that allows the assessment of more quality attributes (e.g. security, availability) and addresses the criticisms of YCSB regarding its ease of use and its simplistic experimental model.
Objective
The goal of this work is to develop an extension of YCSB called YCSB+e (YCSB Extended). This extension should allow YCSB to evaluate more quality attributes, such as security, availability or consistency, and should be easier to use and more automated.
In practice, the expected outcomes of this internship are:
- YCSB+e should be a drop-in replacement for YCSB, making the setup of benchmarks easier, facilitating tasks needed in every benchmark (e.g. network throughput tests and the automated generation of graphical plots) and allowing the evaluation of a wide range of quality attributes.
- The YCSB+e application itself, together with a series of tests run with the proposed benchmark that show its potential and its contribution to Big Data research.
- A research paper, to be submitted and presented at a top international conference, describing the approach and main results obtained from the experiments.
Work Plan - Semester 1
[Some tasks might overlap; M=Month]
T1 (M1 – M3): Knowledge transfer and state-of-the-art literature review on NoSQL benchmarking with YCSB.
T2 (M3): Research on quality attributes and methods for their evaluation in NoSQL scenarios, using the information gathered in task T1 as a basis. Selection of the key quality attributes to implement in YCSB+e.
T3 (M3 – M4): Implementation of a proof-of-concept prototype.
T4 (M5): Writing the intermediate report.
Work Plan - Semester 2
[Some tasks might overlap; M=Month]
T5 (M6): Integration of the intermediate defense comments and completion of the quality attributes and methods.
T6 (M6): Implementation of YCSB+e functionality related to experiment preconditions (e.g. network and disk throughput performance testing).
T7 (M6 – M7): Implementation of YCSB+e functionality related to new quality attributes (including modifications to the YCSB Java code for an appropriately chosen set of popular NoSQL databases).
T8 (M8): Implementation of YCSB+e functionality related to ease of use and automated result generation.
T9 (M8): Execution of experiments and analysis of results.
T10 (M9): Writing a research paper and submitting it to a top international conference in the Security and Data Warehousing areas (IEEE Big Data Congress, IEEE International Conference on Data Engineering – ICDE, International Conference on Very Large Data Bases - VLDB, etc.).
T11 (M10): Writing the thesis.
Conditions
The work will be carried out at the facilities of the Department of Informatics Engineering at the University of Coimbra (CISUC - Software and Systems Engineering Group), where a workspace and the necessary computing resources will be provided.
Remarks
A scholarship may be available (value to be defined) for at least part of the duration of the internship.
Supervisors
Jorge Bernardino, Bruno Cabral
jorge@isec.pt