Assigned proposals, academic year 2025/2026

DEI - FCTUC
Generated on 2025-08-31 18:11:49 (Europe/Lisbon).

Internship Title

Semantics in place and time

Technological Area

Artificial Intelligence, Ubiquitous Computing

Internship Location

DEI

Background

The meaning of a place can be understood from a range of perspectives, and much has been written on the subject since at least the ancient Greek philosophers. Its importance is even greater today, given the ubiquity of both location-aware devices and written information about places on the Web. Traditionally, both on the Web and in navigation devices, places have been organized as Landmarks or Points of Interest (POIs). The notion is that a POI is a place of some interest to some group of people, be it touristic, utilitarian, institutional or emotional. Technically, a POI corresponds to a descriptive name, a pair of geographical coordinates and, optionally, a category. This is, however, very limited information with regard to understanding the meaning of that place.
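As an illustration of how little a POI carries on its own, the technical definition above can be sketched as a small Java type. The class name Poi, its field names and the validation rules are assumptions made for this example only; they are not taken from Kusco.

```java
import java.util.Optional;

// Hypothetical POI type: a descriptive name, a coordinate pair and an optional category.
// All names here are illustrative assumptions, not part of the Kusco platform.
final class Poi {
    private final String name;
    private final double latitude;   // degrees, -90..90
    private final double longitude;  // degrees, -180..180
    private final String category;   // null when no category is known

    Poi(String name, double latitude, double longitude, String category) {
        if (name == null || name.trim().isEmpty()) {
            throw new IllegalArgumentException("a POI needs a descriptive name");
        }
        // Written as negated range checks so that NaN is rejected as well.
        if (!(latitude >= -90.0 && latitude <= 90.0)) {
            throw new IllegalArgumentException("latitude out of range: " + latitude);
        }
        if (!(longitude >= -180.0 && longitude <= 180.0)) {
            throw new IllegalArgumentException("longitude out of range: " + longitude);
        }
        this.name = name;
        this.latitude = latitude;
        this.longitude = longitude;
        this.category = category;
    }

    // Convenience constructor for a POI without a category.
    Poi(String name, double latitude, double longitude) {
        this(name, latitude, longitude, null);
    }

    String name() { return name; }
    double latitude() { return latitude; }
    double longitude() { return longitude; }

    // The category is optional, so callers must handle its absence explicitly.
    Optional<String> category() { return Optional.ofNullable(category); }
}
```

Nothing in such a record says what the place means to the people who use it, which is the gap the rest of this proposal addresses.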

Our work on the semantics of place can be summarized as the challenge of inferring, for a given POI, the list of words that best defines it. Of course, this list is ultimately subjective and dependent on perspective. Our approach is to turn this difficulty into richness: we consider the inference of multiple perspectives for a place.

Objective

Continuing previous work on this subject (Alves et al., 2006; 2008; Antunes et al., 2008), we apply Information Extraction techniques to derive that list of words, which we call the Semantic index. This is materialized in the Kusco platform, which performs a pipeline of tasks to accomplish this goal. It uses external resources such as POS taggers, NER and NP chunking algorithms, WordNet, OWL ontologies and the Yahoo! APIs for web search. It is currently being extended to work seamlessly with Wikipedia, upcoming.org, Flickr and Yellow Pages. In addition, a novel filtering technique is being developed and tested to build a dynamic stopword list that better avoids redundant and noisy data.

In this internship, we shall consider the inclusion of new resources (openmind.org, FrameNet, ConceptNet, Twitter, custom RSS feeds, etc.) and a more efficient TF-IDF weighting mechanism that, along with the stopword filtering, would considerably improve the results.
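For readers unfamiliar with TF-IDF, the following is a minimal sketch of the textbook weighting combined with a stopword filter. It is not Kusco's implementation: the class TfIdfSketch, the method weights, the use of a fixed stopword set and the unsmoothed natural-log IDF are all assumptions chosen for brevity.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Textbook TF-IDF with stopword filtering (illustrative sketch, not Kusco code).
final class TfIdfSketch {

    // Returns term -> weight for the document at docIndex.
    // Each document is a list of already tokenized, lower-cased terms.
    static Map<String, Double> weights(List<List<String>> docs, int docIndex,
                                       Set<String> stopwords) {
        // Document frequency: in how many documents does each term occur?
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            for (String term : new HashSet<>(doc)) {
                df.merge(term, 1, Integer::sum);
            }
        }

        // Raw term counts for the target document, skipping stopwords.
        Map<String, Integer> counts = new HashMap<>();
        int kept = 0;
        for (String term : docs.get(docIndex)) {
            if (stopwords.contains(term)) {
                continue;
            }
            counts.merge(term, 1, Integer::sum);
            kept++;
        }

        // weight = (count / kept terms) * ln(N / df)
        Map<String, Double> result = new HashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            double tf = (double) e.getValue() / kept;
            double idf = Math.log((double) docs.size() / df.get(e.getKey()));
            result.put(e.getKey(), tf * idf);
        }
        return result;
    }
}
```

With this weighting, a term that appears in every document gets an IDF of zero, so it contributes nothing even when it is not in the stopword set. A dynamic stopword list such as the one mentioned above could presumably be derived from similar corpus statistics, but that is a guess rather than a description of the actual technique.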

Work Plan - Semester 1

The tentative plan for this project (semester 1) is the following:
- October 15th - State of the art (1.5 months)
- October 31st - Understanding Kusco + requirements specification (2 months)
- November 30th - Kusco extension using new resources (1 month)
- December 15th - Experimentation (1 month)
- January 15th - Journal paper submission (2 months)
- February 27th - Intermediate report. Plan for new developments on the system in the following semester. Possible topics include sentiment analysis, integration of ontologies, new resources and further experimentation. (1 month)

Work Plan - Semester 2

The tentative plan for this project (semester 2) is the following:
- April 30th - Implementation of new developments (2 months)
- May 31st - Experiments report. Paper submission. (1 month)
- June 30th - MSc thesis delivery. (1 month)

Requirements

Strong skills in programming (Java), Web crawling and screen scraping.

Willingness to communicate in English with other researchers is also important.

Other useful skills include Information Extraction techniques, Artificial Intelligence and Ubiquitous Computing.

Remarks

This project is integrated in ongoing research with the Senseable City Lab at MIT. Besides the supervisor, the student will become part of a team with two PhD students (one from FCTUC, one from MIT), two MIT researchers and other collaborators on both sides.

Supervisor

Francisco Câmara Pereira
camara@dei.uc.pt