Submitted Proposals

DEI - FCTUC
Generated on 2024-05-17 06:50:03 (Europe/Lisbon).

Internship Title

Generation of humorous punning headlines via Genetic Algorithms

Areas of Specialization

Intelligent Systems

Software Engineering

Internship Location

DEI / CISUC

Background

Verbal humor is a general phenomenon that is strongly related to complex linguistic knowledge and fluency. For this reason, researchers have worked over the last decades to build machines that are able not only to recognize but also to create funny texts. One such example is PhD research currently under way at CISUC on the detection and generation of punning humor for the Portuguese language.

Recently, a novel method called GALMET (Genetic Algorithm using Language Models for Evolving Text) was proposed to create funny headlines for news articles by modifying their existing non-humorous titles. In general, the authors use a regression model that estimates how funny a text is as the fitness function for a Genetic Algorithm (GA), which applies different kinds of mutation operations (such as word insertions, substitutions, or deletions) to create a novel funny headline. This is a promising approach to adapt to the pun generation scenario, with plenty of paths for research, such as the inclusion of explicit phonetic transcription in the process. Additionally, the approach does not address the risk that the new witty headline loses its original relation to the main article, which makes it harder to apply in real scenarios and points to further research that can be pursued.
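The loop described above can be illustrated with a minimal sketch. It is not the GALMET implementation: the vocabulary, the set of "funny" words, and the `fitness` function are hypothetical stand-ins for a language model and a trained humor regression model, and the selection scheme (keeping the best individuals among parents and offspring) is an assumption made only to keep the example short.

```python
import random

# Hypothetical toy vocabulary and "funny" word set. A real system would draw
# candidates from a language model and score them with a trained regressor.
VOCAB = ["cat", "dog", "banana", "minister", "budget", "dances", "approves", "new"]
FUNNY_WORDS = {"banana", "dances", "cat"}


def fitness(headline):
    """Stand-in for a humor regression model: fraction of 'funny' words."""
    if not headline:
        return 0.0
    return sum(w in FUNNY_WORDS for w in headline) / len(headline)


def mutate(headline, rng):
    """Apply one random edit: word insertion, substitution, or deletion."""
    words = list(headline)
    op = rng.choice(["insert", "substitute", "delete"])
    if op == "insert" or not words:
        words.insert(rng.randrange(len(words) + 1), rng.choice(VOCAB))
    elif op == "substitute":
        words[rng.randrange(len(words))] = rng.choice(VOCAB)
    elif len(words) > 1:  # deletion, but never down to an empty headline
        del words[rng.randrange(len(words))]
    return words


def evolve(seed_headline, generations=30, pop_size=20, seed=0):
    """Evolve a headline, keeping the fittest of parents plus offspring."""
    rng = random.Random(seed)
    population = [list(seed_headline) for _ in range(pop_size)]
    for _ in range(generations):
        offspring = [mutate(ind, rng) for ind in population]
        ranked = sorted(population + offspring, key=fitness, reverse=True)
        population = ranked[:pop_size]
    return population[0]


if __name__ == "__main__":
    original = ["minister", "approves", "new", "budget"]
    best = evolve(original)
    print(" ".join(best), fitness(best))
```

Because parents compete with their offspring, the best fitness in the population never decreases from one generation to the next. Crossover is omitted here for brevity.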

Moreover, given the current trend towards Responsible and Explainable Artificial Intelligence models, this work could be extended to include Machine Learning (ML) Explainability methods, specifically to explain why the performed adaptations are expected to produce humor.

Objective

This thesis project aims to adapt and apply GALMET-like methods to a pun generation scenario, either by exploiting punning humor corpora to create the fitness function model or by including explicit phonetic information in the editing process. This may take advantage of models and corpora that have already been developed during the aforementioned PhD project.
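One possible way to bring phonetic information into the editing process is to restrict substitution candidates to words that sound close to the word being replaced. The sketch below assumes a hypothetical lexicon that maps each word to a tuple of phoneme symbols; the transcriptions shown are rough ASCII placeholders chosen for illustration, not output of a real transcription tool, and the distance threshold is an arbitrary choice.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or tuples)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# Hypothetical lexicon: word -> rough phoneme sequence (illustrative only).
LEXICON = {
    "cão": ("k", "aw~"),
    "pão": ("p", "aw~"),
    "mão": ("m", "aw~"),
    "casa": ("k", "a", "z", "a"),
}


def pun_candidates(word, lexicon, max_distance=1):
    """Return other lexicon words whose phonemes are close to those of `word`."""
    target = lexicon[word]
    return sorted(
        other
        for other, phones in lexicon.items()
        if other != word and edit_distance(target, phones) <= max_distance
    )


if __name__ == "__main__":
    print(pun_candidates("cão", LEXICON))
```

A substitution mutation could then sample from `pun_candidates` instead of the whole vocabulary, which biases the search towards sound-alike replacements.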

As in GALMET, the fitness function can be based on a humor regression model, in this case trained on existing datasets built from satirical news from Inimigo Público or puns from O Sagrado Caderno das Piadas Secas. Some examples (in Portuguese) include:
* Hoje passei por uma família de cães na rua. Foi ali em Family-cão.
* O que é um fuinho? É um buaquinho na pauede.
* Automobilista com carro avariado na berma foi confundido com colete amarelo e bloqueou A1 nos dois sentidos

As an extra objective, one could explore adding contextual constraints to such a system, so that the newly generated humorous headline still retains its semantic relation to the news article. This could be done in various ways, for example by incorporating different Semantic Textual Similarity (STS) scoring approaches into the GA fitness function.
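As a rough illustration of this idea, the fitness could be a weighted sum of a humor score and a similarity score between the headline and the article. In the sketch below, token-level Jaccard overlap is only a simple stand-in for a real STS model, and the names `contextual_fitness`, `humor_score`, and the weight `alpha` are assumptions introduced for the example.

```python
def jaccard_similarity(text_a, text_b):
    """Token-overlap similarity in [0, 1]; a crude stand-in for an STS model."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def contextual_fitness(headline, article, humor_score, alpha=0.7):
    """Combine humor and context: alpha weights humor, (1 - alpha) similarity.

    `humor_score` is any callable mapping a headline string to a value in
    [0, 1], for instance a trained humor regression model (hypothetical here).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    similarity = jaccard_similarity(headline, article)
    return alpha * humor_score(headline) + (1 - alpha) * similarity


if __name__ == "__main__":
    article = "the minister approved the new budget on friday"
    always_funny = lambda headline: 1.0  # placeholder humor model
    print(contextual_fitness("minister approved banana budget", article, always_funny))
```

With `alpha` close to 1 the search favors funniness alone, while lower values penalize headlines that drift away from the article.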

Finally, the resulting method should be evaluated according to general practices in the Humor Generation literature. This involves manual evaluation by human judges according to criteria such as funniness, fluency, and grammaticality, among others.

Briefly, the objectives are the following:
1. Implement a Genetic Algorithm to generate humorous headlines in Portuguese;
2. Constrain the generation process so that the results qualify as puns;
3. Study approaches for avoiding out-of-context headline generation;
4. Study how to integrate ML Explainability methods in humor generation;
5. Draw conclusions on the success of the approach, considering the type and quality of the resulting text.

Work Plan - Semester 1

- Literature review — Humor Generation, Genetic Algorithms, Punning Humor
- Data gathering — News headlines and articles, punning texts
- Familiarization with Text Classification and Adaptation
- Master’s thesis project writing and presentation

Work Plan - Semester 2

- Adaptation of fitness function to punning humor
- Adaptation of mutation operations to punning humor
- Experimentation and Evaluation
- Exploration of approaches for avoiding out-of-context generations, or of ML Explainability methods applied to the resulting generations
- Writing of the Master’s thesis

Conditions

The workplace will be in a CISUC laboratory, where there will be regular communication with the supervisors.

Supervisors

Hugo Gonçalo Oliveira and Marcio Lima Inácio
hroliv@dei.uc.pt