Submitted Proposals

DEI - FCTUC
Generated on 2024-12-04 18:54:40 (Europe/Lisbon).

Internship Title

An environment for verifiable machine learning

Areas of Specialization

Software Engineering

Intelligent Systems

Internship Location

DEI-FCTUC

Context

This thesis aims to develop an integrated environment for verifying machine learning models. Artificial intelligence is being applied in several domains in which humans might be affected by the (in)correctness of software systems. Examples of such domains include autonomous vehicles and health care. Verifying the correctness of software, including machine learning models, is therefore fundamental.

Machine learning is the area of artificial intelligence that mainly deals with example-based supervised learning algorithms. The learned result is difficult to verify because the specification is a set of examples, and mistakes can be made in new cases that are not in that set. For this reason, statistical analysis, for example measuring accuracy, is the standard practice. That practice is insufficient for critical applications.
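
The point made above can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen only for illustration: the intended behaviour (`true_label`), the one-dimensional threshold learner (`fit_threshold`) and the data points. The learned model agrees with the intended behaviour on every test example, yet it misclassifies an input that appears in neither the training set nor the test set.

```python
def true_label(x):
    """Intended behaviour (assumed for this sketch): class 1 iff x >= 0.5."""
    return int(x >= 0.5)


def fit_threshold(xs, ys):
    """Toy learner: place the decision threshold midway between the classes."""
    highest_negative = max(x for x, y in zip(xs, ys) if y == 0)
    lowest_positive = min(x for x, y in zip(xs, ys) if y == 1)
    return (highest_negative + lowest_positive) / 2


# The "specification" available to the learner is only this set of examples.
train_x = [0.1, 0.2, 0.3, 0.9]
train_y = [true_label(x) for x in train_x]
threshold = fit_threshold(train_x, train_y)


def predict(x):
    return int(x >= threshold)


# Statistical evaluation on a separate test set.
test_x = [0.15, 0.25, 0.95]
accuracy = sum(predict(x) == true_label(x) for x in test_x) / len(test_x)

# An input outside both sets on which the model disagrees with the intent.
counterexample = 0.55

print("test accuracy:", accuracy)
print("prediction:", predict(counterexample), "intended:", true_label(counterexample))
```

A verifier that reasons about all inputs in a region, rather than about sampled examples, is what would be needed to expose such a case systematically.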

The goal of this thesis is to create an environment, by extending Visual Studio Code, that integrates existing verification tools into the machine learning pipeline. Docker shall be used to make the configuration efficient and portable across multiple platforms. The scikit-learn tools shall be used in conjunction with Jupyter Notebook for presentation, and a translator to an existing verification backend shall be developed. At the end, it shall be possible to demonstrate the environment on distinct models and datasets.

Objective

The thesis has three main goals:

1. Develop an environment for verifying machine learning models, by integrating existing tools into a coherent framework that can be used to specify requirements and check if they are fulfilled by the models.

2. Construct the necessary algorithms to translate models into a common representation that can be accepted by existing verifiers.

3. Evaluate and demonstrate the environment by applying it to at least two different machine learning models and two distinct datasets, to show the usefulness and generality of the proposed environment.
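
As a rough illustration of the second goal, the sketch below translates a small decision tree into a list of per-leaf interval constraints and checks a requirement against that list. It rests on several assumptions: the proposal does not fix the common representation or the verification backend, so the (box, label) format and the interval check in `reachable_classes` are stand-ins chosen for this example, and the tree is a hypothetical one written by hand. Its arrays only mimic the layout of scikit-learn's `tree_` attribute so that the sketch runs without dependencies.

```python
import math

# Hypothetical toy decision tree, hard-coded so the sketch has no dependencies.
# The parallel arrays mimic the layout of scikit-learn's `tree_` attribute:
# internal node i tests x[FEATURE[i]] <= THRESHOLD[i]; a child index of -1
# marks a leaf.
CHILDREN_LEFT = [1, -1, 3, -1, -1]
CHILDREN_RIGHT = [2, -1, 4, -1, -1]
FEATURE = [0, -2, 1, -2, -2]
THRESHOLD = [0.5, -2.0, 0.3, -2.0, -2.0]
LEAF_CLASS = [None, 0, None, 1, 0]  # predicted class at each leaf


def tree_to_paths(node=0, box=None):
    """Translate the tree into a list of (box, label) pairs, one per leaf.

    A box maps a feature index to a half-open interval (lo, hi], meaning
    lo < x[feature] <= hi on the path to that leaf.
    """
    box = dict(box or {})
    if CHILDREN_LEFT[node] == -1:
        return [(box, LEAF_CLASS[node])]
    feature, threshold = FEATURE[node], THRESHOLD[node]
    lo, hi = box.get(feature, (-math.inf, math.inf))
    left_box = {**box, feature: (lo, min(hi, threshold))}
    right_box = {**box, feature: (max(lo, threshold), hi)}
    return (tree_to_paths(CHILDREN_LEFT[node], left_box)
            + tree_to_paths(CHILDREN_RIGHT[node], right_box))


def reachable_classes(paths, region):
    """Return every class the tree can output for inputs inside `region`.

    `region` maps each feature index to a closed interval [a, b].
    """
    classes = set()
    for box, label in paths:
        # For non-empty intervals, (lo, hi] meets [a, b] iff b > lo and a <= hi.
        if all(region[f][1] > lo and region[f][0] <= hi
               for f, (lo, hi) in box.items()):
            classes.add(label)
    return classes


paths = tree_to_paths()
safe_region = {0: (0.0, 0.4), 1: (0.0, 1.0)}
mixed_region = {0: (0.6, 1.0), 1: (0.0, 1.0)}

# Requirement: "every input in the region is assigned class 0".
print(reachable_classes(paths, safe_region) == {0})   # requirement holds
print(reachable_classes(paths, mixed_region) == {0})  # requirement is violated
```

Unlike an accuracy measurement, the check covers every input in the region. In the actual environment, the translated constraints would presumably be handed to an existing verifier instead of this hand-written interval check.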

Work Plan - Semester 1

- State of the art (Months 1 and 2)

The first stage will consist of studying background knowledge on the topics related to the thesis, namely machine learning, software verification, Visual Studio Code and Docker. At this point, the thesis chapter on the state of the art shall be drafted.

- Preparation of a case study and initial version of the algorithms (Months 3 and 4)

A first machine learning model shall be trained on a public dataset, to be selected, and the initial version of the algorithms to translate models into a common representation shall be developed. At the end of this stage, it should be possible to use the basic functionality of Visual Studio Code to run the machine learning pipeline, integrated with a verification backend, also to be selected. This will be the first functional prototype of the environment.

- Intermediate report (Month 5)

The tasks carried out during the first semester will be documented in the form of an intermediate report, followed by a public presentation and discussion. The most relevant topics at this stage are: context, problem statement, state of the art and preliminary discussion of the solution and its intended objectives.

Work Plan - Semester 2

- Development of the verification environment and final algorithms (Months 6 and 7)

At this stage, the development of the proposed environment shall be completed, leading to a Visual Studio Code extension and the necessary Docker images. The environment should be usable and ready for evaluation.

- Evaluation of the proposed environment (Month 8)

The verification environment shall be evaluated by applying it to at least two distinct machine learning models and two different datasets. The goal is to demonstrate the usefulness and generality of the verification environment.

- Master’s thesis (Month 9)

The writing of the master's dissertation must be completed and the respective public presentation prepared. The dissertation must document all the work carried out, the proposed solution, the results and the conclusions obtained.

Conditions

A research scholarship will be opened to support the student during the period of full-time work.

The work will be carried out at the Department of Informatics Engineering (DEI) of the University of Coimbra. A workplace will be made available in the DEI laboratories, along with the computational resources needed for carrying out experiments.

Remarks

This work is carried out in the context of a research project and there will be the possibility of collaboration with project partners of the University of Coimbra.

Supervisor

Raul Barbosa
rbarbosa@dei.uc.pt