Submitted Proposals

Generated on 2025-07-07 03:05:37 (Europe/Lisbon).

Internship Title

Towards Secure Patch Generation: Integrating Static Analysis with GenAI Models

Internship Location

DEI

Background

As software systems grow in complexity and exposure, timely and effective vulnerability remediation has become critical. Patch generation, i.e., the process of automatically or semi-automatically fixing vulnerabilities in code, has long been a goal in secure software engineering. Traditional techniques rely heavily on static analysis tools to detect vulnerabilities and sometimes suggest code corrections, but these tools often struggle to propose context-aware fixes that align with the programmer’s intent.
With the advent of Generative AI (GenAI) models such as Codex, Code Llama, or StarCoder, it is now possible to synthesize source code from natural language or code-based prompts. These models have shown promise in tasks like code completion, translation, and even bug fixing. However, without integration with formal static analysis, generated patches may be insecure, incomplete, or introduce regressions.
This thesis explores a hybrid approach: integrating static analysis tools (e.g., Snyk, Semgrep, or CodeQL) with GenAI-based code generation to guide and verify secure patch creation. The core idea is to use vulnerability reports and code contexts as prompts for GenAI models to generate candidate patches, and then apply static analysis to assess whether these patches correctly address the vulnerability without introducing new issues.
This integrated framework aims to improve trust in GenAI-assisted patching by ensuring that generated code complies with established security rules and can be validated using industry-standard tools. The work will contribute toward secure-by-design development pipelines and support DevSecOps efforts to embed security earlier in the software lifecycle.
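The generate-then-verify idea described above can be sketched as a small loop. This is only an illustrative sketch under stated assumptions: the `Finding` shape, the `Generator` and `Analyzer` interfaces, and the toy stand-ins are all hypothetical and do not correspond to any specific model API or static analysis tool. A candidate is accepted only if the targeted rule no longer fires and no rule appears that was absent from the original code.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass(frozen=True)
class Finding:
    """One static-analysis finding (hypothetical minimal shape)."""
    rule_id: str
    line: int


# Hypothetical plug-in points: a GenAI client and a static-analysis wrapper.
Generator = Callable[[str, Finding, int], str]  # (code, finding, attempt) -> candidate
Analyzer = Callable[[str], List[Finding]]       # code -> findings


def rule_ids(findings: List[Finding]) -> Set[str]:
    return {f.rule_id for f in findings}


def generate_verified_patch(code: str, target: Finding, generate: Generator,
                            analyze: Analyzer, max_attempts: int = 3) -> Optional[str]:
    """Return the first candidate that removes the target rule and adds no new rule."""
    baseline = rule_ids(analyze(code))
    for attempt in range(max_attempts):
        candidate = generate(code, target, attempt)
        after = rule_ids(analyze(candidate))
        # Coarse check by rule id only; a real comparison would also match locations.
        if target.rule_id not in after and after <= baseline:
            return candidate
    return None


# --- Toy stand-ins so the sketch runs without a model or an analyzer ---
TOY_RULES = {"pickle.loads(": "unsafe-deserialization", "os.system(": "command-injection"}


def toy_analyze(code: str) -> List[Finding]:
    return [Finding(rule, number)
            for number, text in enumerate(code.splitlines(), start=1)
            for pattern, rule in TOY_RULES.items() if pattern in text]


def toy_generate(code: str, target: Finding, attempt: int) -> str:
    # The first attempt introduces a new issue; the second is acceptable.
    replacement = "os.system(" if attempt == 0 else "json.loads("
    return code.replace("pickle.loads(", replacement)


vulnerable = "data = pickle.loads(blob)\n"
patched = generate_verified_patch(vulnerable, toy_analyze(vulnerable)[0],
                                  toy_generate, toy_analyze)
print(patched)
```

In this toy run the first candidate is rejected because it trades one finding for another, which is the kind of regression the verification step is meant to catch.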

Objectives

This thesis pursues the following main objectives:
1. Analysis of Vulnerability Patterns and Patch Characteristics
Study common classes of software vulnerabilities (e.g., those catalogued in the CWE list) and identify how they are typically patched in practice.
2. Design of an Integrated GenAI + Static Analysis Framework
Architect a workflow that receives a vulnerability report and code snippet, uses GenAI to propose a patch, and evaluates the patch using static analysis.
3. Prompt Engineering and Model Fine-tuning (Optional)
Explore and define effective prompt templates and, optionally, fine-tune GenAI models to generate semantically valid and secure patches.
4. Static Analysis Integration
Interface with static analysis tools such as Snyk, Semgrep, or CodeQL to automatically check candidate patches for correctness and absence of known vulnerability patterns.
5. Evaluation on Real-World Vulnerabilities
Build a dataset of real-world vulnerabilities with known fixes (e.g., from GitHub Security Advisories or Snyk DB), and evaluate the effectiveness of the framework in generating secure and valid patches.
6. Contribution to Secure Patch Automation
Deliver a replicable methodology and prototype toolchain for trustworthy vulnerability remediation using GenAI models combined with static analysis verification.
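As an illustration of the prompt templates mentioned in objective 3, the sketch below builds a patch-generation prompt from a vulnerability report and a code snippet. The `VulnerabilityReport` fields, the template wording, and the `build_patch_prompt` helper are assumptions made for this example; the actual templates are something the thesis itself would have to define and evaluate.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class VulnerabilityReport:
    """Hypothetical minimal report; real advisories carry more fields."""
    cwe_id: str
    description: str
    file_path: str
    line: int


# Illustrative wording only; effective phrasing is an open research question.
PATCH_PROMPT = Template(
    "You are a security engineer.\n"
    "Vulnerability: $cwe_id - $description\n"
    "Location: $file_path, line $line\n"
    "Rewrite the code below so that the vulnerability is removed while "
    "preserving its behaviour. Return only the patched code.\n"
    "--- CODE START ---\n"
    "$code\n"
    "--- CODE END ---\n"
)


def build_patch_prompt(report: VulnerabilityReport, code: str) -> str:
    """Fill the template with the report fields and the vulnerable snippet."""
    return PATCH_PROMPT.substitute(
        cwe_id=report.cwe_id,
        description=report.description,
        file_path=report.file_path,
        line=report.line,
        code=code.rstrip("\n"),
    )


report = VulnerabilityReport(
    cwe_id="CWE-89",
    description="SQL injection through string concatenation",
    file_path="app/db.py",  # hypothetical path
    line=12,
)
snippet = "query = \"SELECT * FROM users WHERE name = '\" + name + \"'\"\n"
prompt = build_patch_prompt(report, snippet)
print(prompt)
```

Keeping the template as data rather than inline string concatenation makes it straightforward to swap in alternative prompting strategies and compare them later.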

Work Plan - Semester 1

Literature Review
Study the state of the art in automatic patch generation, generative AI in software engineering, and static analysis tools and frameworks.
[13/10/2025 to 09/11/2025] Vulnerability Analysis and Dataset Collection
Analyze a selection of vulnerability databases (e.g., CVE, Snyk, GitHub Advisories) and collect examples with both vulnerable code and verified patches.
[10/11/2025 to 07/12/2025] Static Analysis Framework Setup
Select and set up tools such as Snyk CLI, Semgrep, or CodeQL to evaluate code snippets and patches for security properties.
[08/12/2025 to --/01/2026] Thesis Proposal Writing
Draft the thesis proposal, including motivation, objectives, methodology, and experimental setup.
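For the static analysis setup step, one practical concern is turning tool output into a uniform record that can be compared before and after a patch. The sketch below assumes Semgrep's JSON output (as produced with its `--json` flag), which exposes a top-level `results` array whose entries carry `check_id`, `path`, `start.line`, and `extra` fields. The sample payload is hand-written and trimmed for illustration, and the rule id in it is hypothetical; it is not real tool output.

```python
import json
from typing import Dict, List


def parse_semgrep_results(raw_json: str) -> List[Dict[str, object]]:
    """Flatten the `results` array of a Semgrep JSON report into simple records."""
    report = json.loads(raw_json)
    findings = []
    for result in report.get("results", []):
        extra = result.get("extra", {})
        findings.append({
            "rule_id": result.get("check_id"),
            "path": result.get("path"),
            "line": result.get("start", {}).get("line"),
            "severity": extra.get("severity"),
            "message": extra.get("message"),
        })
    return findings


# Hand-written sample in the assumed shape (hypothetical rule id and path).
SAMPLE_REPORT = json.dumps({
    "results": [
        {
            "check_id": "example.rules.sql-injection",
            "path": "app/db.py",
            "start": {"line": 12, "col": 5},
            "end": {"line": 12, "col": 60},
            "extra": {"severity": "ERROR", "message": "Possible SQL injection"},
        }
    ],
    "errors": [],
})

findings = parse_semgrep_results(SAMPLE_REPORT)
for finding in findings:
    print(finding["rule_id"], finding["path"], finding["line"])
```

Other tools under consideration would need their own adapters producing the same record shape, so that the rest of the pipeline stays tool-agnostic.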

Work Plan - Semester 2

GenAI Patch Generation Pipeline
Implement a tool that takes vulnerability context and prompts a GenAI model (e.g., GPT-4, Code Llama) to generate patch candidates.
[02/03/2026 to 19/04/2026] Integration with Static Analysis Tools
Build an automated evaluation pipeline where generated patches are verified using static analysis tools for absence of known issues.
[20/04/2026 to 10/05/2026] Experimental Evaluation
Apply the system to a curated vulnerability dataset (https://vulnerabilitydataset.dei.uc.pt). Measure patch validity and security compliance, and compare different models and prompting strategies.
[11/05/2026 to --/06/2026] Thesis Writing
Write the final thesis, discussing results, limitations, future work, and implications for secure software engineering practices.
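The experimental evaluation in the plan above compares models and prompting strategies on patch validity and security compliance. A minimal sketch of how such outcomes could be aggregated follows. The metric names (`validity_rate`, `fix_rate`, `clean_fix_rate`), the `PatchOutcome` fields, and the model and strategy labels are assumptions chosen for illustration, and the sample outcomes are made up to exercise the code; they are not results.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class PatchOutcome:
    """Result of evaluating one candidate patch (hypothetical fields)."""
    model: str
    prompt_strategy: str
    parses: bool        # candidate is syntactically valid
    target_fixed: bool  # the original finding is no longer reported
    new_findings: int   # findings that were absent from the unpatched code


def summarize(outcomes: List[PatchOutcome]) -> Dict[Tuple[str, str], Dict[str, float]]:
    """Aggregate per (model, prompt strategy) pair."""
    groups: Dict[Tuple[str, str], List[PatchOutcome]] = defaultdict(list)
    for outcome in outcomes:
        groups[(outcome.model, outcome.prompt_strategy)].append(outcome)

    summary = {}
    for key, items in groups.items():
        total = len(items)
        fixed = [o for o in items if o.parses and o.target_fixed]
        summary[key] = {
            "validity_rate": sum(o.parses for o in items) / total,
            "fix_rate": len(fixed) / total,
            # A "clean" fix removes the target finding and adds nothing new.
            "clean_fix_rate": sum(o.new_findings == 0 for o in fixed) / total,
        }
    return summary


# Made-up outcomes, only to show the computation.
outcomes = [
    PatchOutcome("model-a", "zero-shot", True, True, 0),
    PatchOutcome("model-a", "zero-shot", True, True, 1),
    PatchOutcome("model-a", "zero-shot", True, False, 0),
    PatchOutcome("model-a", "zero-shot", False, False, 0),
    PatchOutcome("model-b", "few-shot", True, True, 0),
]
summary = summarize(outcomes)
for key, metrics in summary.items():
    print(key, metrics)
```

Separating `fix_rate` from `clean_fix_rate` keeps patches that silence the original finding but introduce new ones visible in the comparison.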

Conditions

This work takes place within the AI-SSD (2024.07660.IACDC) project. Depending on how the internship evolves, a studentship may be available to support the work. The work will be carried out at the laboratories of CISUC’s Software and Systems Engineering (SSE) Group and Cyber Security Laboratory (CS-Lab).

Supervisor

Joao Campos
jrcampos@dei.uc.pt