Submitted Proposals

DEI - FCTUC
Generated on 2024-07-17 09:39:42 (Europe/Lisbon).
Internship Title

LLM fine-tuning for Verification & Validation tasks

Areas of Specialization

Software Engineering

Internship Location

Critical Software

Background

Large Language Models (LLMs) have shown impressive capabilities in industrial applications. Companies often seek to tailor these LLMs to specific use cases and applications by fine-tuning them for better performance. However, LLMs are large by design and require a large number of GPUs to be fine-tuned. Recently, different approaches have been proposed for fine-tuning these models on consumer-grade hardware, such as Parameter-Efficient Fine-Tuning (PEFT), quantization of model weights (e.g., QuIP), or Low-Rank Adaptation (LoRA).
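
To make the hardware argument concrete, the sketch below does the back-of-the-envelope arithmetic behind low-rank adaptation and weight quantization. It assumes the standard LoRA formulation, in which a frozen d_out x d_in weight matrix receives a trainable update B·A of rank r. The layer width, rank, and model size are illustrative assumptions, not figures from this proposal.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole weight matrix is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters of a rank-r update B @ A (B: d_out x r, A: r x d_in)."""
    return rank * (d_in + d_out)


def weight_memory_gib(n_params: int, bits_per_weight: int) -> float:
    """Memory needed to store the weights alone, in GiB.

    Activations, gradients, and optimizer state are not included.
    """
    return n_params * bits_per_weight / 8 / 2**30


if __name__ == "__main__":
    d, r = 4096, 8  # hypothetical layer width and LoRA rank
    print(f"full: {full_params(d, d):,}  LoRA: {lora_params(d, d, r):,}")

    n = 7_000_000_000  # hypothetical 7B-parameter model
    for bits in (16, 8, 4):
        print(f"{bits}-bit weights: {weight_memory_gib(n, bits):.1f} GiB")
```

With these assumed numbers, the rank-8 update trains 65,536 parameters for that layer instead of 16,777,216, which is 1/256 of the full matrix.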

In this internship we want to address a high-level task from the Verification & Validation (V&V) development cycle: generating Low Level Requirements (LLR) from the provided High Level Requirements (HLR) information.

At Critical Software we have several datasets that map HLR to LLR and could be leveraged to fine-tune a model for this specific task. They consist of MS Excel files, extracted from and used in real projects across different domains, that contain direct mappings from HLR to LLR.
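
As an illustration of how such mappings could become fine-tuning data, the sketch below groups rows by HLR and emits one prompt/completion record per HLR. The column names `HLR` and `LLR`, the prompt wording, and the JSONL output format are all assumptions made for the example, since the actual spreadsheet layout is not described here. The rows are assumed to have already been exported from Excel as dictionaries.

```python
import json
from collections import OrderedDict

# Hypothetical prompt wording; the real instruction text is a design choice.
PROMPT_TEMPLATE = (
    "Derive the low-level requirements for this high-level requirement:\n{hlr}"
)


def build_records(rows):
    """Group LLRs under their HLR and return prompt/completion dicts.

    Rows with an empty HLR or LLR cell are skipped.
    """
    grouped = OrderedDict()
    for row in rows:
        hlr = (row.get("HLR") or "").strip()
        llr = (row.get("LLR") or "").strip()
        if hlr and llr:
            grouped.setdefault(hlr, []).append(llr)
    return [
        {
            "prompt": PROMPT_TEMPLATE.format(hlr=hlr),
            "completion": "\n".join(f"- {llr}" for llr in llrs),
        }
        for hlr, llrs in grouped.items()
    ]


def to_jsonl(records):
    """Serialize records as JSON Lines, one training example per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
```

Grouping by HLR assumes that one high-level requirement may map to several low-level ones. If the data turns out to be strictly one-to-one, the grouping step is harmless.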

Objective

The main objective of this project is to fine-tune LLMs for the specific task of generating LLR from HLR, taking into account that access to GPU servers is limited. To accomplish this, the state-of-the-art methods for fine-tuning must be investigated, especially those that target consumer-grade hardware.

The available datasets must be curated and pre-processed so that they can be used for model fine-tuning. Finally, different methods and target models should be experimented with, starting with small language models and gradually increasing the size as far as the available computing power allows.
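
A first curation pass could look like the sketch below, which normalizes whitespace and drops empty or duplicated HLR/LLR pairs. The specific cleaning rules (whitespace collapsing, case-insensitive duplicate detection) are assumptions chosen for illustration. The real datasets may need different or additional steps.

```python
import re


def normalize(text: str) -> str:
    """Collapse runs of whitespace (including in-cell line breaks) into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def curate(pairs):
    """Return cleaned (hlr, llr) pairs with empties and duplicates removed.

    The first occurrence of each pair is kept and input order is preserved.
    """
    seen = set()
    cleaned = []
    for hlr, llr in pairs:
        hlr, llr = normalize(hlr), normalize(llr)
        key = (hlr.casefold(), llr.casefold())  # case-insensitive duplicate check
        if not hlr or not llr or key in seen:
            continue
        seen.add(key)
        cleaned.append((hlr, llr))
    return cleaned
```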

Work Plan - Semester 1

- Review the literature on LLM fine-tuning methods.

- Research and review evaluation frameworks (such as, but not limited to, DeepEval or Ragas).

- Curate and preprocess the available datasets.

- Write a thesis proposal.

Work Plan - Semester 2

- Define the experimentation methodology and the techniques to address, taking into account both intra-domain and cross-domain evaluation.

- Fine-tune and evaluate different models on the available datasets.

- Analyse the experimentation results and provide guidelines for future development.

- Write the final project report.
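
The intra- and cross-domain evaluation mentioned in the plan above could be set up along the lines of the following sketch. It assumes each example is a dictionary with a `domain` key (a hypothetical field name) and uses a leave-one-domain-out scheme for the cross-domain case. This is one common option, not necessarily the methodology the project will adopt.

```python
import random


def intra_domain_split(examples, test_fraction=0.2, seed=0):
    """Split each domain separately so train and test cover the same domains.

    At least one example per domain goes to the test set.
    """
    rng = random.Random(seed)
    by_domain = {}
    for ex in examples:
        by_domain.setdefault(ex["domain"], []).append(ex)

    train, test = [], []
    for items in by_domain.values():
        items = items[:]  # copy so the caller's list is not shuffled
        rng.shuffle(items)
        n_test = max(1, int(len(items) * test_fraction))
        test.extend(items[:n_test])
        train.extend(items[n_test:])
    return train, test


def cross_domain_splits(examples):
    """Yield (held_out_domain, train, test), holding out one domain at a time."""
    domains = sorted({ex["domain"] for ex in examples})
    for held_out in domains:
        train = [ex for ex in examples if ex["domain"] != held_out]
        test = [ex for ex in examples if ex["domain"] == held_out]
        yield held_out, train, test
```

Comparing scores from the two schemes indicates how much of a fine-tuned model's performance transfers to a domain it has never seen.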

Conditions

A computer and a workspace will be provided. Access to a workstation with two GPUs will also be available, possibly along with extra cloud resources, depending on the needs identified during the internship.

The student will be offered an internship contract covering the entire duration, proportional to their availability during both semesters.

Remarks

Note: The advisor from the institution is Hugo Gonçalo Oliveira.

Advisor

Rui Lopes
rui.lopes@criticalsoftware.com