Submitted Proposals

Generated on 2025-05-24 03:33:20 (Europe/Lisbon).

Internship Title

Optimal continuous testing strategies

Areas of Specialisation

Software Engineering

Intelligent Systems

Internship Location

DEI/UC

Background

Continuous testing is an important component of software engineering, especially as a project grows and its number of contributors increases.

Running the complete test suite after each new contribution has roughly quadratic cost with respect to the size of the project: the cost is linear in the number of tests and linear in the number of contributions, and both grow with the project.
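The quadratic growth can be illustrated with a minimal sketch. It assumes, purely for illustration, that every contribution adds a fixed number of tests and that each test costs one unit; the function names and the fixed per-commit budget are hypothetical and not part of the proposal.

```python
def full_retest_cost(n_contributions, tests_per_contribution=1):
    """Total test executions when the whole suite runs after every contribution.

    Assumption: each contribution adds `tests_per_contribution` new tests.
    """
    total = 0
    suite_size = 0
    for _ in range(n_contributions):
        suite_size += tests_per_contribution  # the suite grows with the project
        total += suite_size                   # and is re-run in full each time
    return total


def budgeted_cost(n_contributions, budget, tests_per_contribution=1):
    """Total test executions when at most `budget` tests run per contribution."""
    total = 0
    suite_size = 0
    for _ in range(n_contributions):
        suite_size += tests_per_contribution
        total += min(suite_size, budget)      # capped, hence linear overall
    return total


if __name__ == "__main__":
    for n in (100, 200, 400):
        print(n, full_retest_cost(n), budgeted_cost(n, budget=20))
```

Under these assumptions, doubling the number of contributions roughly quadruples the cost of full re-testing, while the budgeted variant only doubles.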

This master's thesis focuses on strategies that keep regression testing within a reasonable, linear complexity, and that also provide a partial-coverage yet reliable status in sub-linear time (for pre-commit tests).

One way to address the quadratic complexity would be to maintain multiple small, independent projects.
This has merits but also some drawbacks. It is a handicap when pushing changes that are widespread by nature, such as non-backward-compatible API changes (for example, Python 2 to 3) or a compiler upgrade (C++11). It also makes dependency management more complex for both automatic tools and humans. See the ACM article "Why Google Stores Billions of Lines of Code in a Single Repository" for a rationale on why splitting into multiple small projects is a trade-off. BNP Paribas Quantitative Research is also moving toward a single repository.

Another approach, which we would like this master's thesis to focus on, is to leverage data from source control and testing history:
- Using the co-occurrence of regressions, we can automatically select a small set of tests that gives a good estimate of the outcome of the whole test set.
- Using the co-occurrence of regressions with code changes in source control, we can further refine the test-picking strategy depending on the code change.
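One possible reading of these two ideas can be sketched as follows. This is an illustrative baseline, not the method the thesis is expected to deliver: the first function uses a greedy cover over past failing runs, the second simply counts how often a test failed when a given file was changed. The data layout (a list of `(changed_files, failed_tests)` pairs) and all names are assumptions made for the example.

```python
from collections import Counter


def select_sentinel_tests(failure_history, max_tests):
    """Greedily pick tests whose past failures flag the most failing runs.

    failure_history: list of sets, one per past run, of the tests that failed.
    """
    uncovered = {i for i, failed in enumerate(failure_history) if failed}
    chosen = []
    while uncovered and len(chosen) < max_tests:
        counts = Counter(t for i in uncovered for t in failure_history[i])
        # Highest count first; ties broken by name to keep results deterministic.
        best = min(counts, key=lambda t: (-counts[t], t))
        chosen.append(best)
        uncovered = {i for i in uncovered if best not in failure_history[i]}
    return chosen


def rank_tests_for_change(change_history, changed_files, top_k):
    """Rank tests by how often they failed when any of `changed_files` changed.

    change_history: list of (set of changed files, set of failed tests).
    """
    changed_files = set(changed_files)
    scores = Counter()
    for files, failed in change_history:
        if files & changed_files:
            scores.update(failed)
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:top_k]


if __name__ == "__main__":
    # Hypothetical history: file names and test names are made up.
    history = [
        ({"pricing.py"}, {"test_price", "test_risk"}),
        ({"pricing.py"}, {"test_price"}),
        ({"io.py"}, {"test_load"}),
        ({"io.py", "pricing.py"}, {"test_load", "test_price"}),
        ({"docs.md"}, set()),
    ]
    failures = [failed for _, failed in history]
    print(select_sentinel_tests(failures, max_tests=2))
    print(rank_tests_for_change(history, {"io.py"}, top_k=1))
```

A greedy cover ignores how flaky or how expensive each test is; a real strategy would probably weight tests by run time and by the reliability of their past signal.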

The analysis should automatically rediscover the underlying organisation of the code and tests, identifying the parts that are independent and those in wider use. Testing performance is expected to be equal to or better than that obtained by splitting into multiple projects. Comparing our own understanding of the code/test dependencies with the one inferred from data might bring insight, and good visualisation would be a plus.
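As a minimal sketch of how independent parts might be inferred from data, the example below groups tests that have ever failed in the same run, using a union-find structure over the co-failure graph. Treating "never failed together" as "independent" is a simplifying assumption for illustration, and the function name and input layout are hypothetical.

```python
def cofailure_groups(failure_history):
    """Group tests into connected components of the co-failure graph.

    failure_history: list of sets, one per past run, of the tests that failed.
    Two tests end up in the same group if a chain of shared failing runs
    links them.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for failed in failure_history:
        tests = sorted(failed)
        for t in tests:
            find(t)  # register every test, even if it failed alone
        for other in tests[1:]:
            parent[find(other)] = find(tests[0])

    groups = {}
    for t in list(parent):
        groups.setdefault(find(t), set()).add(t)
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    runs = [{"a", "b"}, {"b", "c"}, {"x", "y"}, {"z"}, set()]
    print(cofailure_groups(runs))
```

Groups found this way could then be compared with the directory or module layout of the repository to see where the data and the maintainers' mental model disagree.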

The current test infrastructure at BNP Paribas Quantitative Research uses a grid of a few hundred cores in Paris and London. More effective testing would allow it to scale better and to remain responsive as the project grows, within infrastructure constraints. Good results would be adopted in live production.

Objective

1. Review the state of the art on the subject (possibly including relevant articles from other fields).
2. Based on the review, implement an algorithm for a test-picking strategy.
3. Validate its effectiveness in a controlled environment (artificially generated data).
4. Back-test its effectiveness on historical data from real projects, and quantify the benefit at BNP Paribas Quantitative Research scale.
5. (Optional) Participate in the go-live in production at BNP Paribas.
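For step 3, a controlled environment could start from a synthetic history in which the true module structure is known in advance. The generator below is only a sketch under strong assumptions: each commit touches exactly one module, a broken commit fails a random non-empty subset of that module's tests, and there are no cross-module failures. All names and parameters are hypothetical.

```python
import random


def generate_history(n_commits, modules, break_probability, seed=0):
    """Generate a synthetic list of (changed modules, failed tests) pairs.

    modules: dict mapping a module name to the list of tests that cover it.
    break_probability: chance that a commit introduces a regression.
    """
    rng = random.Random(seed)  # seeded so that experiments are reproducible
    names = sorted(modules)
    history = []
    for _ in range(n_commits):
        module = rng.choice(names)
        failed = set()
        if rng.random() < break_probability:
            tests = modules[module]
            # A regression breaks a random, non-empty subset of the module's tests.
            failed = set(rng.sample(tests, rng.randint(1, len(tests))))
        history.append(({module}, failed))
    return history


if __name__ == "__main__":
    layout = {"core": ["t_core_1", "t_core_2"], "io": ["t_io_1"]}
    for changed, failed in generate_history(5, layout, break_probability=0.5):
        print(sorted(changed), sorted(failed))
```

Because the ground truth is known, a test-picking strategy can be scored on how many of the broken commits it would have caught for a given test budget.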

Work Plan - Semester 1

- review of the state of the art,
- small proofs of concept,
- experiments

Work Plan - Semester 2

- complete prototype,
- validation on real use cases,
- write the final thesis report

Conditions

Depending on results (and pending approval), the student might be invited to Paris or London to present the work to the management of the BNP Paribas Quantitative Research team.

Supervisor

Claude Cochet
claude.cochet@bnpparibas.com