Internship Title

Process Mining in Software Repositories

Areas of Specialization

Software Engineering

Intelligent Systems

Internship Location



Process mining is a family of techniques in the field of process management that support the analysis of processes based on event logs. During process mining, specialized data mining algorithms are applied to event log data recorded by an information system in order to identify the trends, patterns and details it contains. The aim is to improve both the efficiency and the understanding of processes.
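As a minimal illustration of what such an algorithm extracts from an event log, the sketch below counts how often one activity is directly followed by another within the same case (the "directly-follows" relation that many discovery algorithms start from). The function name and the (case_id, activity, timestamp) tuple layout are assumptions made for this example; they are not prescribed by the proposal or by any particular tool.

```python
from collections import Counter, defaultdict


def directly_follows(events):
    """Count directly-follows pairs in an event log.

    `events` is an iterable of (case_id, activity, timestamp) tuples;
    this layout is an assumption made for the sketch.
    """
    # Group events into one trace per case.
    traces = defaultdict(list)
    for case_id, activity, timestamp in events:
        traces[case_id].append((timestamp, activity))

    counts = Counter()
    for trace in traces.values():
        # Order each trace by timestamp only (stable for equal timestamps).
        trace.sort(key=lambda event: event[0])
        # Count each pair of consecutive activities in the trace.
        for (_, current), (_, following) in zip(trace, trace[1:]):
            counts[(current, following)] += 1
    return counts
```

Called on a log with two cases, one going open, review, merge and the other going open, merge, the function returns a count of one for each of the pairs (open, review), (review, merge) and (open, merge).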


The goal of this thesis is to apply process mining techniques to software project repositories (e.g. GitHub, Subversion), in order to reverse engineer the processes used by developers and fully characterize the actual workflows adopted by the teams.

Work Plan - Semester 1

The first semester is devoted to gathering information on the state of the art of this topic and to a comprehensive characterization of the analysis system that will apply the process mining techniques to specific project repositories. ProM will be the tool used to analyse the data, so most of the effort will go into understanding the project repositories' APIs in order to integrate them with the analysis tool.
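A rough sketch of what this integration step might look like for a Git repository is given below: it turns the text produced by `git log --pretty=format:"%H%x1f%an%x1f%aI%x1f%s"` into event log rows and serializes them as CSV. Several choices here are assumptions made only for illustration: the case notion (an issue number such as "#12" found in the commit subject, falling back to the commit hash), the activity label (the first word of the subject), the column names, and the use of CSV rather than the XES format that ProM reads natively.

```python
import csv
import io
import re

# Unit separator, matching %x1f in the git log format string above.
# Assumed not to occur in author names or commit subjects.
SEP = "\x1f"

# Hypothetical case notion: an issue reference such as "#12" in the subject.
ISSUE_REF = re.compile(r"#(\d+)")

COLUMNS = ["case_id", "activity", "timestamp", "resource"]


def commits_to_events(log_text):
    """Turn formatted `git log` output into a list of event dictionaries."""
    events = []
    for line in log_text.split("\n"):
        if not line.strip():
            continue
        commit, author, timestamp, subject = line.split(SEP, 3)
        match = ISSUE_REF.search(subject)
        words = subject.split()
        events.append({
            # Fall back to the commit hash when no issue is referenced.
            "case_id": match.group(1) if match else commit,
            # Assumed activity label: first word of the commit subject.
            "activity": words[0].lower() if words else "commit",
            "timestamp": timestamp,
            "resource": author,
        })
    return events


def events_to_csv(events):
    """Serialize events as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(events)
    return buffer.getvalue()
```

The parsing function takes the log text as an argument rather than invoking Git itself, so the same code could be fed from a local clone or from data fetched through a hosting platform's API. Which case notion and activity labels are actually meaningful is one of the questions the characterization phase would need to answer.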

Some effort will be devoted to the characterization of the outputs, namely a graphical representation of the software development processes, e.g. using BPMN.

Another output of this phase will be the intermediate report of the Master's thesis.

Work Plan - Semester 2

The core activity of the second semester will be the study and analysis of the selected repositories, together with incremental improvement of the analysis tool in light of the observations made. This is an exploratory activity and therefore cannot be fully defined beforehand.

A scientific paper reporting the major results is expected to be written.


The student will be provided with close mentoring from the advisor, a machine, a working space and access to the University resources (internet, online library access).


Mário Alberto da Costa Zenha Rela