Internship Title
An optimization approach to support a missing data process
Areas of Specialization
Intelligent Systems
Software Engineering
Internship Location
DEI
Background
Expert Systems (ES) are now used in healthcare environments to support physicians in their daily activity. These systems base their findings on clinical information such as medical images or patient records. One of the biggest problems ES face is missing data: the absence of information in patient records, which ultimately affects the performance of a classifier. Many solutions have been proposed over the years, but none generalizes well enough to become a standard in the area.
Objective
The main goal of this work is to optimize an automatic missing data imputation process that currently uses a brute-force approach. To do so, the student will define different optimization strategies to adopt within the pre-defined process, avoiding the need to test every combination as the brute-force approach does. Finally, the performance of each strategy will be analyzed, along with the performance of the classifiers based on the dataset created by each strategy.
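As a rough illustration of the contrast described above, the following minimal sketch compares an exhaustive search over per-column imputation choices with a budget-limited random search. Everything in it is an assumption made for illustration: the toy data, the three strategy names, the helper functions, and the use of imputation error against known values as a stand-in for classifier performance. It is not the pre-defined process referred to in this proposal, and random search is only one possible optimization strategy.

```python
import itertools
import random
import statistics

# Hypothetical per-column imputation strategies (illustrative only).
STRATEGIES = {
    "mean": statistics.mean,
    "median": statistics.median,
    "min": min,
}

# Toy data: `truth` holds the complete columns, `columns` the same
# columns with some values removed (None marks a missing value).
truth = [
    [1.0, 2.0, 3.0, 10.0],
    [5.0, 5.0, 6.0, 7.0],
    [0.0, 1.0, 1.0, 2.0],
]
columns = [
    [1.0, None, 3.0, 10.0],
    [5.0, 5.0, None, 7.0],
    [None, 1.0, 1.0, 2.0],
]

def impute(column, strategy):
    """Fill the None entries of one column using the named strategy."""
    observed = [v for v in column if v is not None]
    fill = STRATEGIES[strategy](observed)
    return [fill if v is None else v for v in column]

def score(columns, truth, combo):
    """Mean absolute error on the imputed cells (lower is better).

    Assumption: a stand-in for the classifier performance that a real
    experiment would measure.
    """
    errors = []
    for col, true_col, strategy in zip(columns, truth, combo):
        filled = impute(col, strategy)
        errors += [abs(f - t) for f, t, v in zip(filled, true_col, col) if v is None]
    return sum(errors) / len(errors)

def brute_force(columns, truth):
    """Evaluate every combination of strategies, one strategy per column."""
    combos = itertools.product(STRATEGIES, repeat=len(columns))
    evaluated = [(score(columns, truth, c), c) for c in combos]
    return min(evaluated), len(evaluated)

def random_search(columns, truth, budget, seed=0):
    """Evaluate at most `budget` randomly drawn combinations."""
    rng = random.Random(seed)
    names = list(STRATEGIES)
    tried = {tuple(rng.choice(names) for _ in columns) for _ in range(budget)}
    evaluated = [(score(columns, truth, c), c) for c in tried]
    return min(evaluated), len(evaluated)

best_bf, n_bf = brute_force(columns, truth)
best_rs, n_rs = random_search(columns, truth, budget=8)
print("brute force:", best_bf, "evaluations:", n_bf)
print("random search:", best_rs, "evaluations:", n_rs)
```

With three strategies and three columns the exhaustive search evaluates 3^3 = 27 combinations, and that count grows exponentially with the number of columns. The random search never exceeds its budget, at the cost of possibly missing the best combination.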
Work Plan - Semester 1
-Select a set of open datasets that cover different scenarios in terms of size, percentage of missing data, and variable types, among other aspects
-Learn about missing data types
-Learn about machine learning algorithms including classifiers and imputation algorithms
-Pre-define the optimization strategies that will be used in the next semester
Work Plan - Semester 2
-Implement different optimization approaches
-Run missing data imputation and data classification experiments
-Evaluate the results and discuss the findings
-Write a scientific article and the dissertation
Conditions
This work is not funded.
Supervisor
Pedro Henriques Abreu
pha@dei.up.pt