Internship Title
An Exploration of Diffusion Models for Math Word Problem Generation
Areas of Specialisation
Intelligent Systems
Internship Location
DEI / CISUC
Background
A major component of mathematics education is the ability to solve word problems, which assess not only students’ numerical skills but also their reading comprehension and problem-solving abilities. However, manually crafting new word problems is time-consuming and labor-intensive, placing an undue burden on already overextended math teachers. As a result, recent years have seen a modest increase in research on generating diverse, high-quality math word problems. Even so, math word problem generation remains relatively underexplored and warrants further investigation.
Simultaneously, generative models based on deep learning techniques have shown great promise in generating various types of content, including images and natural language. Among these models, Generative Diffusion Models (GDMs) have most recently emerged as a powerful approach for generating content with a high degree of realism and diversity. In a nutshell, GDMs work by simula
Objective
The primary goal of this dissertation is to explore the potential of Generative Diffusion Models for generating math word problems with diverse and realistic content. To achieve this, the study should address the following objectives:
1. Review the current state of the art in math word problem generation, including an analysis of existing text generation techniques, with a focus on GDMs and their application to natural language generation.
2. Develop a novel GDM architecture specifically designed for generating math word problems, taking into consideration the unique challenges associated with the task, such as the integration of natural language processing and numerical reasoning.
3. Train and evaluate the proposed GDM architecture on a large-scale dataset of math word problems, documenting the quality, diversity, and realism of the generated problems using both quantitative and qualitative metrics.
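As an illustration of the quantitative side of objective 3, diversity is often approximated with distinct-n, the ratio of unique n-grams to total n-grams in a set of generated texts. The sketch below is only an assumption about how such a metric might be computed; the proposal does not prescribe it, and the function names (`ngrams`, `distinct_n`), the whitespace tokenisation, and the sample problems are all hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def distinct_n(texts: Iterable[str], n: int = 2) -> float:
    """Ratio of unique n-grams to total n-grams across all texts.

    Tokenisation is a naive lowercase whitespace split (an assumption
    made for brevity). Returns 0.0 when no n-grams exist.
    """
    counts: Counter = Counter()
    for text in texts:
        counts.update(ngrams(text.lower().split(), n))
    total = sum(counts.values())
    return len(counts) / total if total else 0.0


# Hypothetical generated problems, used only to exercise the function.
samples = [
    "Ana has 3 apples and buys 2 more.",
    "Ana has 3 apples and eats 1 apple.",
]
print(distinct_n(samples, n=1))
print(distinct_n(samples, n=2))
```

Values closer to 1 indicate less repetition across the generated problems. A metric like this says nothing about mathematical correctness or solvability, so it would complement, not replace, the qualitative evaluation.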
Work Plan - Semester 1
- Literature Review (NLP, GDMs, Math Word Problem Generation).
- Identification and familiarisation with useful tools.
- Identification of data to use for training and validation of the models.
- Writing of the dissertation proposal.
Work Plan - Semester 2
- Implementation of a GDM framework specifically for Math Word Problems.
- Experimentation with text generation.
- Experimentation with Math Word Problem generation.
- Evaluation.
- Writing of the MSc dissertation.
- Writing of a scientific paper.
Conditions
The work will take place in a CISUC laboratory at DEI, with regular supervision by the supervisor.
The work will be part of a co-promotion research project, under which the student may apply for a research grant for degree holders (bolsa de investigação para licenciado), for a period of 6 to 12 months, worth €875 per month.
Remarks
During the application phase, questions about this proposal, particularly regarding its objectives and conditions, should be clarified with the supervisors by e-mail (hroliv@dei.uc.pt) or in a meeting, to be scheduled after an initial contact by e-mail.
Supervisors
Hugo Gonçalo Oliveira and Marcio Lima Inácio
hroliv@dei.uc.pt