Internship Title
Safety of machine learning in critical applications: An architectural approach
Areas of Specialization
Software Engineering
Intelligent Systems
Internship Location
DEI-FCTUC
Background
This thesis aims to develop an architectural approach to the safety of machine learning models in critical applications. In domains such as autonomous vehicles and health care, the safety of software systems is of the utmost importance. This poses significant challenges for artificial intelligence components, particularly those based on machine learning.
Machine learning, as considered here, relies mainly on supervised algorithms that learn from examples. The resulting model is difficult to inspect and to understand. It can also make mistakes, because its specification is a finite set of examples and the real world may present cases that do not belong to that set. In this context, the thesis aims to design an architecture that provides error detection and recovery for machine learning models. In other words, safety is to be achieved by detecting potential errors and recovering from them, so that the safety properties of the overall system are upheld.
The thesis shall develop an architecture that complements machine learning models with a runtime monitor. When the monitor detects a potential error, a recovery action is necessary. The thesis will therefore develop and evaluate different recovery techniques. Tools such as scikit-learn shall be used for development, possibly together with Docker, to make the configuration portable across platforms, and Jupyter Notebook, for presentation. At the end, it shall be possible to demonstrate and evaluate the architecture on distinct models and datasets.
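As an illustration only, the monitor-plus-recovery idea could be sketched as a thin wrapper around a trained classifier. Everything named here is hypothetical and not part of the proposal: the classes `ConfidenceMonitor`, `MonitoredModel` and `StubModel`, the 0.7 threshold, and the choice of low prediction confidence as the error signal. The wrapper only assumes that the model exposes a `predict_proba` method, as scikit-learn classifiers do.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class MonitoredPrediction:
    label: int               # index of the predicted class (not a class name)
    confidence: float        # top class probability reported by the model
    flagged: bool            # True when the monitor suspects an error
    recovered: bool = False  # True when a recovery action replaced the label


class ConfidenceMonitor:
    """Hypothetical monitor: flags predictions whose top probability is low."""

    def __init__(self, threshold: float = 0.7) -> None:
        self.threshold = threshold  # assumed value, to be tuned per application

    def is_suspicious(self, probabilities: Sequence[float]) -> bool:
        return max(probabilities) < self.threshold


class MonitoredModel:
    """Wraps any model exposing predict_proba(X) with a monitor and a recovery hook."""

    def __init__(
        self,
        model,
        monitor: ConfidenceMonitor,
        recover: Optional[Callable[[Sequence[float]], Optional[int]]] = None,
    ) -> None:
        self.model = model
        self.monitor = monitor
        self.recover = recover

    def predict(self, X) -> List[MonitoredPrediction]:
        results = []
        for x, row in zip(X, self.model.predict_proba(X)):
            probs = [float(p) for p in row]
            label = probs.index(max(probs))
            flagged = self.monitor.is_suspicious(probs)
            recovered = False
            if flagged and self.recover is not None:
                replacement = self.recover(x)
                if replacement is not None:  # recovery may also decline to answer
                    label, recovered = replacement, True
            results.append(MonitoredPrediction(label, max(probs), flagged, recovered))
        return results


class StubModel:
    """Stand-in for a trained classifier; returns fixed probabilities."""

    def predict_proba(self, X):
        return [[0.9, 0.1] if x[0] < 0.5 else [0.45, 0.55] for x in X]


# The recovery hook here is a placeholder that always answers class 0.
wrapped = MonitoredModel(StubModel(), ConfidenceMonitor(0.7), recover=lambda x: 0)
predictions = wrapped.predict([[0.1], [0.8]])
```

In this sketch the second input is flagged because its top probability (0.55) is below the threshold, and the placeholder recovery hook replaces its label. A real monitor could rely on other signals, such as out-of-distribution detection, and a real recovery hook would be one of the techniques studied in the thesis.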
Objective
1. Develop an approach that incorporates an online supervisor to monitor machine learning models during execution and activate recovery actions whenever a potential error is detected.
2. Design and implement the algorithms required by the online monitor, and integrate the architectural components into the machine learning pipeline.
3. Prepare a case study, using at least two datasets, to train the models and run the experiments needed for a comparative evaluation of the proposed approach.
Work Plan - Semester 1
- State of the art (Months 1 and 2)
The first stage will consist of studying the background of the topics related to the thesis, namely machine learning, error detection, fault tolerance, and tools such as scikit-learn and Docker. At this point, the thesis chapter on the state of the art shall be drafted.
- Preparation of a case study and initial version of the algorithms (Months 3 and 4)
A first machine learning model shall be trained on a public dataset, to be selected. The initial version of the solution, including a monitoring component, shall be developed. At the end of this stage, it should be possible to demonstrate a first functional prototype of the system.
- Intermediate report (Month 5)
The tasks carried out during the first semester will be documented in the form of an intermediate report, followed by a public presentation and discussion. The most relevant topics at this stage are: context, problem statement, state of the art and preliminary discussion of the solution and its intended objectives.
Work Plan - Semester 2
- Development of the architectural solution and the recovery strategies (Months 6 and 7)
At this stage, the development of the proposed solution shall be completed, including the implementation of the algorithms needed for the online monitor and for error correction. Possible recovery strategies include using ensembles or re-executing the model with a different input. The full environment should be usable and ready for evaluation.
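The two recovery possibilities mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method the thesis will adopt: the function names, the majority-vote rule, the Gaussian perturbation of the input, and all default parameter values are hypothetical. A predictor is modelled simply as a function from a feature vector to a class index.

```python
import random
from collections import Counter
from typing import Callable, List, Optional, Sequence

# A predictor maps one feature vector to a class index.
Predictor = Callable[[Sequence[float]], int]


def recover_by_ensemble(
    x: Sequence[float],
    members: List[Predictor],
    min_agreement: float = 0.5,
) -> Optional[int]:
    """Majority vote over diverse models; None when agreement is too weak."""
    votes = Counter(member(x) for member in members)
    label, count = votes.most_common(1)[0]
    return label if count / len(members) > min_agreement else None


def recover_by_reexecution(
    x: Sequence[float],
    predictor: Predictor,
    attempts: int = 5,
    noise: float = 0.01,
    seed: int = 0,
) -> Optional[int]:
    """Re-run the same model on slightly perturbed copies of the input.

    The Gaussian perturbation is an assumption made for this sketch; other
    input transformations may be more appropriate for a given domain.
    """
    rng = random.Random(seed)
    votes: Counter = Counter()
    for _ in range(attempts):
        perturbed = [value + rng.gauss(0.0, noise) for value in x]
        votes[predictor(perturbed)] += 1
    label, count = votes.most_common(1)[0]
    return label if count > attempts / 2 else None


# Toy predictors that threshold the first feature at different points.
members = [
    lambda x: int(x[0] > 0.5),
    lambda x: int(x[0] > 0.4),
    lambda x: int(x[0] > 0.6),
]
ensemble_label = recover_by_ensemble([0.55], members)
split_label = recover_by_ensemble([0.55], members[::2])
reexecuted_label = recover_by_reexecution([0.9], members[0])
```

Both functions return `None` when no answer gathers a strict majority, which lets the surrounding system fall back to a safe default instead of accepting a doubtful prediction.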
- Evaluation of the proposed solution (Month 8)
The proposed solution shall be evaluated by applying it to at least two distinct machine learning models and two different datasets. The goal is to demonstrate the usefulness and effectiveness of the solution in detecting and correcting potential errors.
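One possible way to quantify effectiveness is sketched below. The metric names and the record layout are assumptions made for illustration; the proposal does not prescribe specific metrics. Each record holds the true label, the model's raw prediction, whether the monitor flagged it, and the final label after any recovery.

```python
from typing import Dict, List, Optional, Tuple

# (true_label, raw_label, flagged, final_label)
Record = Tuple[int, int, bool, int]


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator, or None when the ratio is undefined."""
    return numerator / denominator if denominator else None


def evaluate_monitoring(records: List[Record]) -> Dict[str, Optional[float]]:
    raw_errors = [r for r in records if r[1] != r[0]]
    raw_correct = [r for r in records if r[1] == r[0]]
    return {
        # Accuracy of the unprotected model.
        "baseline_accuracy": _ratio(len(raw_correct), len(records)),
        # Accuracy after the monitor and recovery actions have been applied.
        "final_accuracy": _ratio(sum(1 for r in records if r[3] == r[0]), len(records)),
        # Fraction of model errors that the monitor flagged.
        "detection_coverage": _ratio(sum(1 for r in raw_errors if r[2]), len(raw_errors)),
        # Fraction of correct predictions that the monitor flagged anyway.
        "false_alarm_rate": _ratio(sum(1 for r in raw_correct if r[2]), len(raw_correct)),
    }


# Hand-made records used only to exercise the function; not experimental data.
example_records = [
    (0, 0, False, 0),  # correct, not flagged
    (1, 0, True, 1),   # error, flagged and corrected
    (1, 1, True, 1),   # correct, flagged (false alarm)
    (0, 1, False, 1),  # error, missed by the monitor
]
metrics = evaluate_monitoring(example_records)
```

Comparing `baseline_accuracy` with `final_accuracy` across models and datasets gives a simple view of the benefit of recovery, while `detection_coverage` and `false_alarm_rate` characterize the monitor on its own.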
- Master’s thesis (Month 9)
The writing of the master's dissertation must be completed and the respective public presentation prepared. The dissertation must document all the work carried out, the proposed solution, the results, and the conclusions obtained.
Conditions
The work will be carried out at the Department of Informatics Engineering of the University of Coimbra. A workplace will be made available in the DEI laboratories, together with the computational resources needed to carry out the experiments.
Remarks
This work is carried out in the context of a research project and there will be the possibility of collaboration with project partners of the University of Coimbra.
Supervisor
Raul Barbosa
rbarbosa@dei.uc.pt