Assigned Proposals

Generated on 2024-07-17 11:23:37 (Europe/Lisbon).

Internship Title

Automatic Summarization Framework

Areas of Specialization

Intelligent Systems

Internship Location

Talkdesk / TDX - Instituto Pedro Nunes, Edifício D, R. Pedro Nunes, 3030-199 Coimbra


At Talkdesk we are reimagining how people experience contact centers, helping our customers create long-lasting and meaningful relationships with their own customers. These relationships are built from individual interactions, and those interactions rely on the language we start learning on the day we are born: a language we come to master over the years and use almost without thinking. It is how we express our thoughts, feelings and needs when we reach out to a contact center. The capacity to make sense of a message and generate an appropriate response has so far been reserved to humans, but we are building the next generation of solutions that will mimic our ability to understand language, ensuring that customer needs are well understood and efficiently handled. We are a growing team of curious and talented people focused on leveraging the power of Natural Language Processing to deliver added value to our customers whenever possible.

With the growing amount of unstructured information available today, knowledge workers are forced to spend most of their time reading and analyzing texts to collect the information they need. For many organizations, however, coping with the amount of available data has become an impossible task, either because they cannot afford to dedicate more employees to it or because the volume is simply too large for humans to handle.

In a contact center, interactions are typically handled in natural language and processed at a scale of millions per day. To extract actionable insights from these interactions, we first need to distinguish the relevant highlights from the non-relevant information conveyed in between. This is typically a time-consuming task carried out by agents, for instance when a call ends, so that a record of what was discussed can be stored for later analysis.

The challenge we face is to use natural language processing and machine learning techniques to identify the relevant content in a dialogue, so that we can automatically generate a coherent and objective summary of that dialogue.
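To make the idea of "identifying the relevant content in a dialogue" concrete, the following is a minimal sketch of one classic baseline, frequency-based extractive summarization. It is not the approach prescribed by this proposal, which leaves the choice of technique to the internship. The stopword list, the function names and the example turns are all hypothetical and kept deliberately small.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for illustration).
STOPWORDS = {
    "the", "a", "an", "i", "you", "to", "is", "it", "my", "and", "of", "for",
    "can", "that", "me", "your", "with", "on", "in", "please", "thank",
    "thanks", "hello", "hi",
}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize(turns, max_turns=2):
    """Return the max_turns highest-scoring turns, kept in their original order.

    A turn's score is the average dialogue-wide frequency of its content words,
    so turns that repeat the dialogue's dominant vocabulary rank higher.
    """
    freq = Counter(w for turn in turns for w in tokenize(turn))

    def score(turn):
        words = tokenize(turn)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(turns)), key=lambda i: score(turns[i]), reverse=True)
    return [turns[i] for i in sorted(ranked[:max_turns])]


if __name__ == "__main__":
    example_turns = [
        "Hello, thank you for calling.",
        "My invoice shows a duplicate charge.",
        "I can refund the duplicate charge on your invoice.",
        "Thanks, have a nice day.",
    ]
    for line in summarize(example_turns):
        print(line)
```

A baseline like this only selects existing turns. Producing the coherent and objective summary described above would likely require abstractive techniques as well, which is part of what the internship is meant to investigate.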

TDX is Talkdesk’s Global Innovation Lab, located in Coimbra. It was created to fast-track key technology initiatives for Talkdesk, and accelerate technological responses to business needs. In TDX, we explore emerging technologies to drive Talkdesk’s future product capabilities and create additional impact for our customers. In this internship, selected applicants will have the chance to be part of a multicultural team that is accomplishing these breakthrough developments, by making the impossible possible.


The main objective of this internship is the development of an Automatic Summarization Framework based on state-of-the-art approaches, to support the implementation of solutions that require the generation of a summary for the transcription of a dialogue, or group of dialogues (e.g. take the transcription of a call handled by an agent in a contact center and generate a textual summary of what was discussed in that call).

The development of such a framework will require the completion of the following goals:
- Analysis of the state of the art, available technologies and existing competitors.
- Study and comparison of adequate approaches.
- Implementation of a solution for automatic summarization, including support for text ingestion, text analysis and summary generation, as well as adequate APIs.
- Experimentation and fine-tuning of the implemented solution.
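The three stages named in the implementation goal (text ingestion, text analysis and summary generation) could be wired together roughly as sketched below. This is an illustrative skeleton under stated assumptions, not the framework's actual design: the `Turn` type, the function names, the "Speaker: text" transcript format and the word-count filter are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    """One utterance in a dialogue (hypothetical data model)."""
    speaker: str
    text: str


def ingest(transcript: str) -> List[Turn]:
    """Text ingestion: parse 'Speaker: text' lines, skipping malformed ones.

    Assumes the speaker label ends at the first colon on the line.
    """
    turns = []
    for line in transcript.splitlines():
        speaker, sep, text = line.partition(":")
        if sep and text.strip():
            turns.append(Turn(speaker.strip(), text.strip()))
    return turns


def analyze(turns: List[Turn], min_words: int = 4) -> List[Turn]:
    """Text analysis placeholder: drop very short turns such as greetings."""
    return [t for t in turns if len(t.text.split()) >= min_words]


def generate(turns: List[Turn], max_turns: int = 2) -> str:
    """Summary generation placeholder: join the first max_turns relevant turns."""
    return " ".join(f"{t.speaker}: {t.text}" for t in turns[:max_turns])


def summarize_transcript(transcript: str, max_turns: int = 2) -> str:
    """Single entry point that an API layer could expose."""
    return generate(analyze(ingest(transcript)), max_turns)


if __name__ == "__main__":
    call = (
        "Agent: Hello!\n"
        "Customer: My invoice shows a duplicate charge.\n"
        "Agent: I will refund the duplicate charge today.\n"
        "Customer: Thanks."
    )
    print(summarize_transcript(call))
```

Keeping each stage behind its own function means the placeholder analysis and generation steps can be swapped for the approaches selected during the study phase without changing the entry point.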

By the end of the internship, the intern should have gained experience in the development of solutions at an enterprise level, including processes and expected deliverables. More specifically, the intern will have acquired relevant knowledge about the design, implementation and experimentation of an Automatic Summarization Framework, including applicable and relevant approaches.

Work Plan - Semester 1

The plan for the 1st Semester consists of:
- State of the Art, Technological Survey and Competitor Analysis [September - October]
- Requirement Analysis [November]
- Approach Analysis and Selection [November - December]
- Specification and Design [December]
- Thesis Proposal Writing [January - February]

Work Plan - Semester 2

The plan for the 2nd Semester consists of:
- Solution Implementation (Agile) [February - June]
- Solution Experimentation [June]
- Thesis Writing [June - July]


The intern will be assigned an experienced TDX mentor, with whom weekly progress meetings will be held. An onboarding plan, a final demo presentation (at the end of the internship), and full participation in regular team meetings and in office and company events will also be provided. Specifically, during the onboarding period, the intern will have the opportunity to meet Talkdesk's technical and non-technical teams. Additionally, the intern will have access to the same IT equipment and tools as any Talkdesk employee (such as a computer and personal email).

Talkdesk looks after each person and makes a point of investing in their morale and well-being.
Our offices, located in Lisbon, Porto and Coimbra, share the same benefits, including complimentary drinks, bread, cookies, fruit and all kinds of other snacks during the day. We have a flexible, family-like work environment.

A formal internship agreement will be signed by all involved parties, and a monthly scholarship will be paid by Talkdesk - 600€/month (net).


The internship is designed to be as enriching as possible, providing a real, in-depth learning environment. It also supports personal development, as the trainee will join TDX's multicultural team, where different nationalities and cultures coexist on a daily basis and English is the primary working language.


Pedro Miguel de Almeida Verruma