Internship Title
Data Cubes Anonymity in Real-time OLAP systems
Areas of Specialization
Engenharia de Software
Internship Location
SSE - CISUC - DEI - FCTUC
Background
The General Data Protection Regulation (GDPR), Regulation (EU) 2016/679, is a regulation in EU law on data protection and privacy for all individuals within the European Union (EU). The GDPR aims primarily to give citizens and residents control over their personal data and to simplify the regulatory environment for international business by unifying regulation within the EU.
Information sharing with appropriate privacy protection is therefore one of the most critical challenges of our time. Aggregate data often need to be analyzed and processed from sources that contain confidential data. This is the case, for example, when data are collected under a confidentiality agreement or when the data are, by their nature, protected under personal data legislation. Data anonymization through data cubes is one approach that helps overcome this obstacle, allowing an interested party to access aggregate results based on the aforementioned data.
In this internship, we intend to propose and evaluate anonymization algorithms and techniques focused on Real-time OLAP systems, such as Apache Kylin.
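As an illustration only of what aggregate access through a data cube can look like, the sketch below counts records per combination of dimension values and suppresses cells with too few contributing records. The function name `build_cube`, the threshold `k`, and the sample records are hypothetical and are not part of the proposed work; small-cell suppression is used here simply as a familiar example, and on its own it does not guarantee anonymity.

```python
from collections import Counter


def build_cube(records, dimensions, k=3):
    """Count records per combination of dimension values.

    Cells backed by fewer than k records are suppressed (set to None),
    so very small groups are not exposed in the aggregate output.
    Illustrative assumption only, not the internship's algorithm.
    """
    counts = Counter(tuple(r[d] for d in dimensions) for r in records)
    return {cell: (n if n >= k else None) for cell, n in counts.items()}


# Hypothetical sample data, invented for this example.
records = [
    {"city": "Coimbra", "age_group": "20-29"},
    {"city": "Coimbra", "age_group": "20-29"},
    {"city": "Coimbra", "age_group": "20-29"},
    {"city": "Lisboa", "age_group": "30-39"},
]

cube = build_cube(records, ["city", "age_group"], k=3)
for cell, count in cube.items():
    print(cell, "suppressed" if count is None else count)
```

An interested party querying `cube` sees only the per-cell counts, never the underlying records; the single-record cell is withheld.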
Objective
In practice, the expected outcomes of this internship are:
- Evaluate the existing anonymization algorithms.
- Adapt these anonymization algorithms and techniques for streaming.
- Propose new approaches for integrating these techniques into OLAP engines.
- Write a research paper, to be submitted to and presented at a top international conference, describing the approach and the main results of the experiments.
Work Plan - Semester 1
[Some tasks might overlap; M=Month]
T1 (M1 – M3): Knowledge transfer and state-of-the-art literature review on anonymization algorithms and techniques.
T2 (M3): Design of new algorithms and techniques for OLAP systems, using the information gathered in task T1 as a basis.
T3 (M3): Identification of the target systems to be used in the experiments.
T4 (M3 – M4): Implementation of a proof-of-concept prototype.
T5 (M5): Writing the intermediate report.
Work Plan - Semester 2
[Some tasks might overlap; M=Month]
T1 (M6): Incorporation of the comments from the intermediate defense and completion of the anonymization algorithms and techniques for OLAP systems.
T2 (M6 – M7): Integrating these techniques into CEP engines.
T3 (M8): Execution of experiments and analysis of results.
T4 (M9): Writing of a research paper and submission to a top international conference in the database area (e.g., IEEE Big Data Congress; Database Systems for Advanced Applications, DASFAA; IEEE International Conference on Data Engineering, ICDE).
T5 (M10): Writing the thesis.
Conditions
The work will be carried out at the facilities of the Department of Informatics Engineering of the University of Coimbra (CISUC - Software and Systems Engineering Group), where a workspace and the necessary computing resources will be provided.
Supervisors
Bruno Cabral, Jorge Bernardino
bcabral@dei.uc.pt