Assigned proposals 2024/2025

DEI - FCTUC
Generated on 2024-07-17 07:13:53 (Europe/Lisbon).

Internship title

LLMs in domain-specific Software Certification: impact, cost vs. benefits and challenges

Areas of specialization

Software Engineering

Internship location

Coimbra

Background

Software certification for mission-critical applications requires an extensive set of procedures to comply with domain-relevant standards, namely DO-178C (for aerospace) and IEC 62304 (for medical device software), among others. These activities are detailed, comprehensive, tedious and error-prone, and mistakes can be very costly, both economically and in terms of human safety.

The recent spread of Large Language Models (LLMs) such as GPT-4 could significantly aid the process of software certification, as these tools can help automate, enhance, and streamline various aspects of the certification lifecycle.

For example, if a software development team is working on a medical device application that must comply with the EU Medical Device Regulation (EU) 2017/745, an LLM could potentially help with:

Documentation Generation: create and maintain all necessary regulatory documents.

Compliance Checks: automatically check the code and documentation against the regulations.

Code Reviews: perform regular code reviews to ensure adherence to best practices.

Test Cases: generate extensive test cases to cover all functionality and edge cases.

Risk Management: identify potential risks and provide mitigation strategies.

Training: train the development team on specific EU MDR requirements and best practices.

However, little is known today about how these goals can actually be achieved, much less at the high level of quality that mission-critical certification requires.

Objective

The goal of this project is to study how LLMs can be used to support software certification: to identify potential areas of impact, their costs and benefits, and the challenges involved, and to develop a test case that assesses the most promising contributions this technology can provide.

Work plan - Semester 1

• Literature review and state of the art.
• Study of relevant regulations.
• Analysis and problem statement.
• Identification of the most promising research line through exploratory spikes.
• Establishment of a research roadmap for the second semester.
• Writing and defense of the intermediate thesis report.

Work plan - Semester 2

The work strategy for the second semester follows an agile approach to exploratory research according to the plan established in the first semester: a series of iterations where the results from the previous iteration inform the next one. Since this work is exploratory and relies heavily on successive outcomes, rather than a traditional plan-based approach, we cannot predefine specific tasks for the second semester. Instead, we focus on meta-tasks for each iteration:

1. Reflect on the outcomes of the previous iteration.
2. Determine the goal(s) for the next iteration, considering newly acquired knowledge and overall project objectives.
3. Create a working plan for the iteration and gather necessary resources.
4. Take action.
5. Document the knowledge and insights gained (to inform subsequent iterations).
6. Repeat steps 1-5 every two to three weeks.

Conditions

-

Remarks

• Data to be used: existing project(s) documentation (including requirements, standards, procedures and other evidence); existing project delivery metrics.

• Validation of the application: the goal of this research is to understand the impact of LLMs at several stages of the development of certified software. Therefore, the validation of results must include a comparative analysis of metrics with and without the use of this technology.

• Prompt strategies such as Zero-Shot Prompting, One-Shot Prompting, Few-Shot Prompting and Chain-of-Thought Prompting should be compared on how well they achieve the desired outputs. Following these tests, Retrieval-Augmented Generation and agent-based reasoning should be used to create autonomous and guided (human-assisted) workflows. Nevertheless, the relevance of the research derives from its domain-specific context and application.
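
As an illustration of the prompt strategies named above, the following is a minimal sketch of how the four variants could be built from the same task so that their outputs can be compared on equal terms. All names (`Example`, `build_prompt`, `build_variants`) and the `Input:`/`Reasoning:`/`Answer:` layout are assumptions made for this sketch, not part of the proposal, and no particular LLM or API is implied: the functions only assemble prompt text.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Example:
    """A worked example (hypothetical): an input artifact, the reasoning, and the answer."""
    artifact: str
    reasoning: str
    answer: str


def build_prompt(task: str, artifact: str,
                 examples: Sequence[Example] = (),
                 chain_of_thought: bool = False) -> str:
    """Assemble a prompt; the strategy is set by the number of examples and the CoT flag."""
    parts = [task]
    for ex in examples:
        parts.append(f"Input: {ex.artifact}")
        if chain_of_thought:
            # Chain-of-thought: each example also shows its intermediate reasoning.
            parts.append(f"Reasoning: {ex.reasoning}")
        parts.append(f"Answer: {ex.answer}")
    parts.append(f"Input: {artifact}")
    # Ask for step-by-step reasoning only in the chain-of-thought variant.
    parts.append("Reasoning: let's think step by step." if chain_of_thought else "Answer:")
    return "\n".join(parts)


def build_variants(task: str, artifact: str,
                   examples: Sequence[Example]) -> dict[str, str]:
    """Return one prompt per strategy for the same task and input artifact."""
    return {
        "zero_shot": build_prompt(task, artifact),
        "one_shot": build_prompt(task, artifact, examples[:1]),
        "few_shot": build_prompt(task, artifact, examples),
        "chain_of_thought": build_prompt(task, artifact, examples, chain_of_thought=True),
    }
```

Each variant would then be sent to the model under study and scored against the same reference outputs, so that any differences can be attributed to the prompting strategy rather than to the task or input.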

Supervisor

Simão Ponce de Leão Policarpo Nogueira
simao.nogueira@criticalsoftware.com