Assigned Proposals 2023/2024

Generated on 2024-05-17 11:42:37 (Europe/Lisbon).

Internship Title

Automated Data Privacy Protection using Deep Learning and Causality Techniques

Areas of Specialization

Information Systems

Internship Location

DEI

Background

In recent years, the widespread availability of large volumes of data has enabled significant advances in domains such as healthcare, finance, and marketing. However, this rapid growth in data collection and storage has raised concerns about individual privacy and about the re-identification of individuals in supposedly anonymous data. A central part of privacy protection is the anonymization of personal data, in which quasi-identifiers play a crucial role, particularly when assessing re-identification risk. Quasi-identifiers are attributes that are not unique identifiers on their own but that, when combined, can single out individuals and thereby disclose sensitive information about them. Quasi-identifiers are usually identified through manual analysis by users, which limits the automated application of anonymization techniques. Other steps of the anonymization process, such as defining appropriate generalization hierarchies, also usually require manual intervention.
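To make the role of quasi-identifiers in re-identification risk concrete, the sketch below groups records by their quasi-identifier values and reports the size of the smallest group (the k of k-anonymity) and the corresponding worst-case risk 1/k. The records, the choice of `age` and `zip` as quasi-identifiers, and the ten-year age ranges are assumptions made for illustration only; they do not come from the proposal.

```python
from collections import Counter


def reidentification_risk(records, quasi_identifiers):
    """Return (k, max_risk) for the given quasi-identifier columns.

    k is the size of the smallest equivalence class, i.e. the smallest
    group of records sharing the same quasi-identifier values.
    max_risk is 1 / k, the worst-case chance of singling out a record.
    """
    classes = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    k = min(classes.values())
    return k, 1.0 / k


def generalize_age(age, width=10):
    """One level of a hypothetical generalization hierarchy: age -> range."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


# Toy records, invented for illustration only.
records = [
    {"age": 34, "zip": "3030", "diagnosis": "flu"},
    {"age": 36, "zip": "3030", "diagnosis": "asthma"},
    {"age": 52, "zip": "3000", "diagnosis": "flu"},
    {"age": 57, "zip": "3000", "diagnosis": "diabetes"},
]

# Every (age, zip) pair is unique in the raw data.
k_raw, risk_raw = reidentification_risk(records, ["age", "zip"])

# Replacing exact ages with ranges merges records into larger classes.
generalized = [{**r, "age": generalize_age(r["age"])} for r in records]
k_gen, risk_gen = reidentification_risk(generalized, ["age", "zip"])

print(k_raw, risk_raw)
print(k_gen, risk_gen)
```

In this toy case, generalizing the age column raises k from 1 to 2. Both the choice of columns and the hierarchy were supplied by hand here, which is the manual step described above.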

Objective

This thesis aims to develop automated mechanisms for data privacy protection. It will explore the application of deep learning and causality techniques to identify quasi-identifiers in large-scale datasets and to protect the privacy of the corresponding data. Deep learning, with its ability to automatically extract intricate patterns and features from complex data, has demonstrated significant success in various data analysis tasks. By leveraging deep learning architectures, we aim to develop models that can effectively identify the quasi-identifiers present in the data, thus enabling the application of appropriate privacy protection measures. In addition to deep learning, we will incorporate causality techniques into the proposed framework. Causality analysis provides a deeper understanding of the relationships and dependencies between variables, which can help reveal which attributes act as quasi-identifiers and why. By combining the strengths of deep learning and causality techniques, we anticipate improved accuracy and interpretability in identifying and protecting quasi-identifiers and sensitive attributes. The outcomes of this research will contribute to the field of privacy protection by providing novel techniques for the identification and protection of quasi-identifiers.
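As a minimal illustration of how analysing dependencies between variables can differ from looking at raw correlation, the sketch below computes a partial correlation, a basic building block of constraint-based causal discovery. It is not the method of this proposal, which does not name a specific causality technique; the variables `x`, `y`, `z` and their values are constructed purely for illustration, with `z` acting as a common cause of `x` and `y`.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / sqrt(var_a * var_b)


def partial_correlation(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Constructed example: z drives both x and y, and the two noise terms
# are uncorrelated with z and with each other.
z = [-1, -1, 1, 1]
noise_x = [-1, 1, -1, 1]
noise_y = [-1, 1, 1, -1]
x = [a + b for a, b in zip(z, noise_x)]
y = [a + b for a, b in zip(z, noise_y)]

marginal = pearson(x, y)                    # x and y look related
conditional = partial_correlation(x, y, z)  # the relation vanishes given z

print(round(marginal, 6), round(conditional, 6))
```

Here `x` and `y` are correlated only because they share the cause `z`; once `z` is accounted for, the dependency disappears. Distinctions of this kind are what a plain correlation analysis cannot express.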

Work Plan - Semester 1

Weeks 1-2: Literature Review:
-Conduct an in-depth review of relevant literature on quasi-identifiers, deep learning, causality techniques, and privacy protection.
-Identify key research gaps and challenges in the field.

Weeks 3-4: Data Collection and Preprocessing:
-Identify suitable datasets for experimentation.
-Collect and preprocess the data, ensuring data quality and integrity.
-Transform the data into a format suitable for deep learning and causality analysis.
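The last step above could look roughly like the following sketch, which turns tabular records into flat numeric vectors by min-max scaling numeric columns and one-hot encoding categorical ones. The function name `encode_table`, the column names, and the sample rows are hypothetical; the actual datasets and encoding scheme are left open by this plan.

```python
def encode_table(rows, numeric, categorical):
    """Encode dict rows as flat float vectors.

    Numeric columns are min-max scaled to [0, 1]; categorical columns
    are one-hot encoded using the sorted set of observed values.
    """
    ranges = {c: (min(r[c] for r in rows), max(r[c] for r in rows))
              for c in numeric}
    levels = {c: sorted({r[c] for r in rows}) for c in categorical}

    vectors = []
    for r in rows:
        vec = []
        for c in numeric:
            lo, hi = ranges[c]
            # A constant column carries no information; map it to 0.0.
            vec.append((r[c] - lo) / (hi - lo) if hi > lo else 0.0)
        for c in categorical:
            vec.extend(1.0 if r[c] == level else 0.0 for level in levels[c])
        vectors.append(vec)
    return vectors


# Hypothetical rows, invented for illustration only.
rows = [
    {"age": 30, "gender": "F"},
    {"age": 40, "gender": "M"},
    {"age": 50, "gender": "F"},
]

vectors = encode_table(rows, numeric=["age"], categorical=["gender"])
for vec in vectors:
    print(vec)
```

Each output vector holds the scaled age followed by one indicator per observed gender value, a layout that most deep learning libraries accept once converted to their tensor type.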

Weeks 5-7: Feature Engineering and Model Design:
-Conduct feature engineering to extract relevant attributes and create meaningful representations.
-Design and implement deep learning architectures suitable for quasi-identifier identification.
-Validate the model design through experimentation on smaller subsets of the data.

Weeks 8-10: Model Training and Evaluation:
-Train the deep learning models using the collected data.
-Evaluate the performance of the models using appropriate evaluation metrics.
-Conduct comparative analysis with existing state-of-the-art methods to assess the effectiveness of the proposed approach.
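One plausible choice of evaluation metrics, assuming a ground-truth set of quasi-identifier columns is available for each dataset, is set-based precision, recall, and F1 over column names. The sketch below is an illustration under that assumption; the function name and the column names are hypothetical, and the plan does not commit to these metrics.

```python
def qi_identification_scores(predicted, actual):
    """Precision, recall and F1 of predicted quasi-identifier columns.

    Both arguments are collections of column names; `actual` is the
    ground-truth set of quasi-identifiers for the dataset.
    """
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical labels: the model misses "gender" and wrongly flags "diagnosis".
actual = {"age", "zip", "gender"}
predicted = {"age", "zip", "diagnosis"}

precision, recall, f1 = qi_identification_scores(predicted, actual)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

Because the same function can score any method that outputs a set of columns, it also supports the comparison with existing methods mentioned above.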

Weeks 11-12: Preliminary Analysis and Reporting:
-Analyze the results obtained from the experiments.
-Identify strengths, limitations, and potential areas for improvement.
-Begin drafting the dissertation proposal, including an introduction, problem statement, research objectives, and proposed methodology.

Work Plan - Semester 2

Weeks 1-2: Refinement of Models and Techniques:
-Incorporate causality techniques into the deep learning framework to enhance quasi-identifier identification.
-Fine-tune the models based on the insights gained from the preliminary analysis.
-Conduct sensitivity analysis and optimization to improve the models' performance.

Weeks 3-6: Extensive Experiments and Analysis:
-Perform comprehensive experiments on benchmark datasets.
-Evaluate the robustness, accuracy, and interpretability of the proposed models.
-Analyze and interpret the results obtained from the experiments in detail.

Weeks 7-10: Dissertation Writing:
-Develop the core chapters of the dissertation, including the methodology, experimental setup, results, and discussions.
-Integrate the findings from the experiments with the existing literature.
-Continuously refine and revise the dissertation drafts based on feedback from advisors and peers.

Weeks 11-12: Finalizing the Dissertation:
-Complete the remaining chapters, including the introduction, conclusion, and abstract.
-Conduct a thorough review of the entire dissertation for coherence, clarity, and adherence to academic standards.
-Prepare and submit the final version of the dissertation.

Conditions

Remarks

Supervisor

Pedro Henriques Abreu
pha@dei.uc.pt