Internship Title
Learning from Crowds
Technological Area
Pattern Recognition
Internship Location
DEI
Background
Crowd-sourcing platforms like Amazon’s Mechanical Turk (AMT) are attracting increasing attention from researchers in many different fields. These platforms are rapidly changing the way datasets are built by providing researchers with a cheap, accessible and considerably fast way to obtain labeled data, whose quality has been shown to be comparable to that of expert-labeled data for a variety of tasks. However, annotators on these platforms vary widely in quality (i.e. can we trust all annotators equally?), so commonly used approaches such as majority voting, which give every annotator the same weight, are not good solutions. For these reasons, these online platforms are also giving rise to a new learning paradigm: Learning from Crowds.
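As a minimal illustration of why majority voting can fail when annotators differ in quality, the sketch below compares it with a vote weighted by the log-odds of each annotator's accuracy. The accuracies and labels are invented for illustration, and the function names are hypothetical; in practice the accuracies are unknown and have to be estimated.

```python
import math
from collections import Counter


def majority_vote(labels):
    """Return the most common label, giving every annotator equal weight."""
    return Counter(labels).most_common(1)[0][0]


def weighted_vote(labels, accuracies):
    """Binary vote where each annotator is weighted by log(p / (1 - p)).

    Assumes labels in {0, 1}, independent annotators, and a uniform prior
    over the true label.
    """
    score = 0.0
    for label, p in zip(labels, accuracies):
        weight = math.log(p / (1.0 - p))
        score += weight if label == 1 else -weight
    return 1 if score > 0 else 0


# Hypothetical item: one reliable annotator says 1, two near-random ones say 0.
labels = [1, 0, 0]
accuracies = [0.95, 0.55, 0.55]  # assumed values, not taken from real data

majority = majority_vote(labels)
weighted = weighted_vote(labels, accuracies)
print(majority, weighted)
```

Here the two near-random annotators outvote the reliable one under majority voting, while the weighted vote follows the reliable annotator.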
In the AmILab/CMS, we have been working on latent variable models for supervised learning with data labeled by multiple annotators, which is currently a very active topic in the Machine Learning community. One of the main contributions of our work is building predictive models from labels provided by multiple (non-expert) annotators for various tasks related to Urban Mobility.
The role of the student in this project is to work with such models and apply them to different problems.
TICE.Mobility is a 3-year R&D project with a budget of 6M€ that brings together 27 partners, including IT companies, research institutes and technology transfer institutes. The project's mission is to explore new, more efficient and comprehensive solutions for urban transportation, using communication and information technologies (CIT) to integrate the various available solutions in an ecological, energy-efficient way with better quality for users, in combination and cooperation with other domestic initiatives.
This project involves collaboration with the Massachusetts Institute of Technology (MIT).
Objective
Building on other work that is currently being developed in our lab in connection with the project TICE.Mobility, the goal is to implement Machine Learning models that are able to learn from multiple annotators, and test the various implementations on a wide variety of problems using different hyperparameters in order to understand: (1) the differences between the different approaches, (2) the applicability of such approaches to specific problems, and (3) trade-offs in the choice of hyperparameters.
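To make the idea of a model that learns from multiple annotators concrete, here is a minimal sketch of a simple latent variable approach: a "one-coin" model in which each annotator has a single unknown accuracy, fitted with EM. This is an illustrative assumption, not the lab's actual model. The function name, the toy data, and the settings `n_iter` and `eps` are all hypothetical, and the sketch assumes binary labels with every annotator labeling every item.

```python
import math


def one_coin_em(labels, n_iter=20, eps=1e-3):
    """Estimate true binary labels and per-annotator accuracies with EM.

    labels[i][j] is the 0/1 label given to item i by annotator j.
    Returns (q, acc): q[i] is the posterior P(true label of i is 1),
    acc[j] is the estimated accuracy of annotator j.
    """
    n_items = len(labels)
    n_annotators = len(labels[0])
    # Initialise the soft labels with the fraction of positive votes.
    q = [sum(row) / n_annotators for row in labels]
    acc = [0.5] * n_annotators
    for _ in range(n_iter):
        # M-step: update the class prior and each annotator's accuracy,
        # clamped away from 0 and 1 so the logarithms stay finite.
        prior = min(max(sum(q) / n_items, eps), 1 - eps)
        for j in range(n_annotators):
            agree = sum(
                q[i] if labels[i][j] == 1 else 1 - q[i]
                for i in range(n_items)
            )
            acc[j] = min(max(agree / n_items, eps), 1 - eps)
        # E-step: posterior log-odds of each item's true label being 1.
        for i in range(n_items):
            log_odds = math.log(prior) - math.log(1 - prior)
            for j in range(n_annotators):
                w = math.log(acc[j]) - math.log(1 - acc[j])
                log_odds += w if labels[i][j] == 1 else -w
            log_odds = max(-30.0, min(30.0, log_odds))  # avoid overflow
            q[i] = 1 / (1 + math.exp(-log_odds))
    return q, acc


# Toy data (invented): annotators 0 and 1 always agree with each other,
# annotator 2 disagrees with them on two of the eight items.
toy_labels = [
    [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 0],
    [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 1],
]
q, acc = one_coin_em(toy_labels)
estimated = [round(p) for p in q]
print(estimated, [round(a, 2) for a in acc])
```

The number of EM iterations and the clamping constant are examples of the kind of settings whose trade-offs would need to be studied experimentally.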
Work Plan - Semester 1
The tentative plan for this project (semester 1) is the following:
- October 15th - State of the art. (1.5 months)
- October 31st - Understanding previous work in the lab. (2 months)
- December 30th - Model implementation, Part I. (2 months)
- January 15th - Preliminary experimentation. (1 month)
- January 31st - Intermediate report. Plan for new variables/extensions to include in the model or different model to implement. (1 month)
Work Plan - Semester 2
The tentative plan for this project (semester 2) is the following:
- March 31st - Model implementation, Part II. (2 months)
- April 30th - Experimentation. (2 months)
- May 31st - Experiments report. Paper submission. (1 month)
- June 30th - MSc thesis delivery. (1 month)
Requirements
Strong skills in programming (Java, Python, C/C++).
Other interesting (optional) skills/interests include Machine Learning and Pattern Recognition techniques, and some (basic) knowledge of Statistics.
Willingness to communicate in English with other researchers is also important.
Remarks
The candidate's curriculum vitae is required.
This work is eligible for a regular BIC grant (see FCT salary table and conditions), subject to an open call.
Supervisor
Francisco Câmara Pereira and Filipe Rodrigues (CISUC)
camara@dei.uc.pt