Submitted Proposals - no student assigned

DEI - FCTUC
Generated on 2024-03-28 10:17:12 (Europe/Lisbon).

Internship Title

Benchmarking Missing Data Imputation Algorithms in Data Processing Tools

Areas of Specialization

Intelligent Systems

Software Engineering

Internship Location

DEI

Background

Today, many data processing tools bundle algorithms that help data scientists with a range of data mining problems. From the user's perspective, however, it is difficult to choose the best implementation of a given algorithm for the problem at hand. The task becomes even more complex because most data processing tools include the same algorithms in their libraries.

Objective

With the missing data problem in mind, the goal of this work is to analyze the performance of several imputation algorithms across different tools. To achieve this, the most widely used data processing tools must be preselected and suitable performance metrics chosen. Note that, by the end of this work, the goal is not only to identify the best implementation of a specific algorithm but also to understand why that implementation produced better results.

Work Plan - Semester 1

-Select a set of open datasets that cover different scenarios in terms of size, percentage of missing data, and types of variables, among other aspects
-Learn about the missing data problem
-Select data processing tools
-Choose the criteria to evaluate the different machine learning algorithms
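
One common way to score imputation quality, offered here as an assumption rather than as the criteria this work will adopt, is to hide known values in a complete dataset, impute them, and measure the error against the hidden truth. The pure-Python sketch below does this for mean and median imputation. The helper names (`mask_mcar`, `impute_constant`, `imputation_rmse`), the toy data, the RMSE metric, and the completely-at-random missingness pattern are all illustrative choices, not part of the proposal.

```python
import math
import random
from statistics import mean, median


def mask_mcar(values, missing_rate, seed=0):
    """Return a copy of values with round(missing_rate * n) entries set to None.

    Positions are drawn uniformly at random (missing completely at random).
    """
    rng = random.Random(seed)
    n_missing = int(round(missing_rate * len(values)))
    missing_idx = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing_idx else v for i, v in enumerate(values)]


def impute_constant(values, statistic):
    """Fill every None with a statistic computed on the observed entries."""
    observed = [v for v in values if v is not None]
    fill = statistic(observed)
    return [fill if v is None else v for v in values]


def imputation_rmse(truth, masked, imputed):
    """RMSE over the masked positions only (assumes at least one is masked)."""
    errors = [(t - p) ** 2 for t, m, p in zip(truth, masked, imputed) if m is None]
    return math.sqrt(sum(errors) / len(errors))


# Toy complete column with one outlier (illustrative data only).
truth = [1.0, 2.0, 3.0, 4.0, 100.0, 5.0, 6.0, 7.0, 8.0, 9.0]
masked = mask_mcar(truth, missing_rate=0.3, seed=42)

for name, statistic in [("mean", mean), ("median", median)]:
    imputed = impute_constant(masked, statistic)
    print(name, "RMSE on masked entries:", imputation_rmse(truth, masked, imputed))
```

Scoring only the masked positions keeps the metric from being diluted by values that were never missing. Repeating the run over several seeds and missing rates would give a fairer picture than a single draw.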

Work Plan - Semester 2

-Implement and test different algorithms in different data processing tools
-Evaluate the performance of the algorithms, identifying substantial differences in their implementations
-Write a scientific article and the dissertation
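
As a rough illustration of comparing implementations of the same algorithm, the sketch below runs two hypothetical mean-imputation functions on the same input, checks how far their outputs disagree, and records a best-of-N runtime. The functions `impute_mean_naive` and `impute_mean_fsum` are stand-ins for implementations that would come from different tools; they, the `compare` helper, and the synthetic data are assumptions made for this example.

```python
import math
import time


def impute_mean_naive(values):
    """Mean imputation using the built-in sum."""
    observed = [v for v in values if v is not None]
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in values]


def impute_mean_fsum(values):
    """Mean imputation using math.fsum, which tracks rounding error exactly."""
    observed = [v for v in values if v is not None]
    fill = math.fsum(observed) / len(observed)
    return [fill if v is None else v for v in values]


def compare(implementations, values, repeats=5):
    """Run each implementation; return {name: (output, best wall-clock seconds)}."""
    results = {}
    for name, impute in implementations.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            output = impute(values)
            best = min(best, time.perf_counter() - start)
        results[name] = (output, best)
    return results


# Synthetic column: every seventh entry is missing (illustrative data only).
data = [0.1 * i if i % 7 else None for i in range(10_000)]
results = compare({"naive": impute_mean_naive, "fsum": impute_mean_fsum}, data)

out_naive, _ = results["naive"]
out_fsum, _ = results["fsum"]
max_diff = max(abs(a - b) for a, b in zip(out_naive, out_fsum))
print("largest disagreement between outputs:", max_diff)
for name, (_, seconds) in results.items():
    print(name, "best time (s):", seconds)
```

Separating the output check from the timing makes it easier to tell whether two implementations differ in what they compute or only in how fast they compute it.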

Conditions

This work is not funded.

Supervisor

Pedro Henriques Abreu
pha@dei.uc.pt