Intelligent Document Validation Using Computer Vision and Natural Language Processing

Areas of Specialisation

Software Engineering

Intelligent Systems

Internship Location



Most organisational processes still require customers or users to submit a large number of documents, for example when applying for a visa or opening a new bank account. Most of these are official documents needed to validate the applicant's or customer's identity and context, and they are then validated manually by an employee of the institution. This takes time and is prone to error; even when the error rate is small, the damage an error causes can be large. CRITICAL Software is currently developing a solution that addresses this need, using techniques from Computer Vision, Machine Learning, and Natural Language Processing. The Intelligent Document Validation (IDV) system checks whether a submitted document is what it is supposed to be and extracts useful information from it, making the human work of document validation and information extraction much more efficient and less error-prone.

In this internship, the intern will join the current IDV team and help develop new models to detect and extract structured information from documents.


The main goal of this internship is to create new solutions and Machine Learning models, and integrate them into the current IDV platform, so as to improve the performance of the currently deployed solutions, namely in the classification of document types, the detection of important structures in documents, and the extraction of field values. The intern will research and choose possible solutions to implement; however, there is already some internal research on possible techniques to evaluate, such as automatic data augmentation, transfer learning, deep learning, and neuroevolution.

Work Plan - Semester 1

The internship has the following stages:
- Defining the Scope and Requirements [result: requirements list, months 1-2]
- Reading and Writing the State of the Art [result: state of the art, months 1-4]
- Studying the current IDV platform [result: platform description and comparison, months 1-4]
- Creating the Technical Specification [result: technical specification, months 5-6]
- Writing the internship proposal [result: internship proposal, months 2-6]

Work Plan - Semester 2

The second semester comprises the following stages:
- Setting up the Research and Development environments [result: development environment, month 7]
- Development of IDV [result: first prototype, months 7-10]
- Testing and Benchmarking [result: second prototype, month 11]
- Writing the internship report [result: internship report, months 11-12]


A laptop and a workspace are provided.


Tiago Baptista