Proposals with an Identified Student

Generated on 2025-07-07 00:10:02 (Europe/Lisbon).

Internship Title

Automatic Exploit Generation from Vulnerability Descriptions

Internship Location

DEI

Background

Security testing often lags behind the discovery of new vulnerabilities due to the manual effort required to understand and reproduce each case. Although vulnerability descriptions in databases such as the CVE (Common Vulnerabilities and Exposures) catalog provide textual summaries and references, they are rarely accompanied by ready-to-use test cases or exploit code. This gap delays mitigation and weakens the responsiveness of security validation efforts in development pipelines.
Recent advances in Large Language Models (LLMs) have shown strong capabilities in translating natural language into executable code, understanding software semantics, and reasoning about program behavior. These capabilities open a new opportunity: automatically generating working exploits from textual vulnerability descriptions. Such automation could reduce the time needed for vulnerability triage and enable proactive testing through automated security pipelines.
This thesis will explore the feasibility of using LLMs to translate natural language vulnerability descriptions into concrete, testable exploits that can be deployed in security validation scenarios.

Objective

This thesis has the following main objectives:
• Study the structure and content of real-world vulnerability descriptions (e.g., CVEs).
• Design a methodology for converting these descriptions, combined with source code, into structured inputs for code generation.
• Fine-tune or prompt-engineer LLMs to generate proof-of-concept (PoC) exploits targeting known vulnerable code patterns.
• Evaluate the effectiveness and reliability of the generated exploits.
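
As an illustration of the second objective, the sketch below shows one possible shape for a structured input that pairs a vulnerability description with a source snippet. Everything in it is an assumption made for illustration: the `VulnerabilityInput` fields, the `build_prompt` helper, the instruction wording, and the placeholder record (including the `CVE-0000-0000` identifier) are hypothetical and are not taken from the project or its dataset.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class VulnerabilityInput:
    """Hypothetical structured record handed to a code-generation model."""
    cve_id: str          # placeholder identifier, not a real CVE entry
    description: str     # textual summary from the vulnerability entry
    source_snippet: str  # vulnerable code associated with the entry
    language: str        # programming language of the snippet


def build_prompt(record: VulnerabilityInput) -> str:
    """Serialize the record as JSON and prepend a fixed instruction line."""
    payload = json.dumps(asdict(record), indent=2, sort_keys=True)
    return (
        "Given the vulnerability record below, write a proof-of-concept "
        "test case that triggers the flaw in an isolated test environment.\n"
        + payload
    )


# Made-up record used only to show the format.
example = VulnerabilityInput(
    cve_id="CVE-0000-0000",
    description="Out-of-bounds write in parse_header() when length exceeds 64.",
    source_snippet="void parse_header(char *in) { char buf[64]; strcpy(buf, in); }",
    language="c",
)
prompt = build_prompt(example)
print(prompt)
```

Keeping the record as a typed structure and serializing it in one place would make it easier to compare prompt variants later, since only `build_prompt` changes between experiments.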

Work Plan - Semester 1

Literature Review
Study CVE structures, exploit generation tools, and LLM capabilities in code generation.
[13/10/2025 to 09/11/2025] Data Curation and Preprocessing
Leverage an in-house dataset of CVEs (https://vulnerabilitydataset.dei.uc.pt) and combine it with known exploits; analyze the mapping from text to code.
[10/11/2025 to 07/12/2025] LLM Prompt Design and Testbed Setup
Design and test initial prompts; deploy a secure environment with vulnerable targets.
[08/12/2025 to --/01/2026] Thesis Proposal Writing

Work Plan - Semester 2

Pipeline Implementation
Integrate LLM-based code generation with the exploit validation framework.
[02/03/2026 to 19/04/2026] Experimental Campaign
Evaluate generated exploits across multiple scenarios and models.
[20/04/2026 to 10/05/2026] Result Analysis and Ethical Review
Analyze success rates and process the results.
[11/05/2026 to --/06/2026] Thesis Writing
Compile all results, discussion, and final documentation.
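
The success-rate analysis mentioned in the plan above could be sketched as a simple per-model aggregation. This is a minimal sketch under stated assumptions: the `(model, scenario, succeeded)` tuple layout, the `success_rates` function, and the sample outcomes are all hypothetical, and the plan does not specify how success will actually be measured.

```python
from collections import defaultdict


def success_rates(results):
    """Return {model: fraction of attempts whose exploit was validated}."""
    attempts = defaultdict(int)
    successes = defaultdict(int)
    for model, _scenario, succeeded in results:
        attempts[model] += 1
        if succeeded:
            successes[model] += 1
    return {model: successes[model] / attempts[model] for model in attempts}


# Made-up outcomes: (model name, scenario name, validated?)
sample = [
    ("model-a", "scenario-1", True),
    ("model-a", "scenario-2", False),
    ("model-b", "scenario-1", True),
    ("model-b", "scenario-2", True),
]
rates = success_rates(sample)
print(rates)
```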

Conditions

This work takes place within the AI-SSD (2024.07660.IACDC) project. Depending on how the internship evolves, a studentship may be available to support the work. The work will be carried out in the laboratories of CISUC’s Software and Systems Engineering (SSE) Group and Cyber Security Laboratory (CS-Lab).

Supervisor

Joao Campos
jrcampos@dei.uc.pt