Submitted Proposals

DEI - FCTUC
Generated on 2024-05-11 04:06:34 (Europe/Lisbon).

Internship Title

Parallelization of SPARQL Queries on Triple Stores for Semantic Data

Areas of Specialization

Software Engineering

Information Systems

Internship Location

SSE-DEI

Background

The Semantic Web is considered the next step in the evolution of the Internet. The Web as we know it today was mostly written for humans to read. The Semantic Web is intended for both humans and machines to read and act upon information. Tasks that are done manually today may be automated by intelligent agents. Practical uses of the Semantic Web include recommendation engines, advanced search and system integration, among others.
One of the main obstacles to applying the Semantic Web today is poor query performance. Queries against triple stores are much slower than queries against traditional relational databases, which limits their usefulness in services where users expect an immediate response.
In the proposed work, the student will work towards improving the performance of these systems, with two ideas as starting points:
1) A new generation of NoSQL databases is emerging (such as MongoDB, Redis, Kyoto Cabinet, Neo4j, etc.). Recent studies, mainly in industry, show these da

Objective

The main objective of this project is to improve the performance of triple store queries. This work should produce two main artifacts:
1) A triple store engine that leverages both the speed and structure of NoSQL databases and the ability to parallelize queries.
2) A performance study benchmarking the proposed triple store against existing state-of-the-art triple stores (AllegroGraph, Virtuoso, 4store).
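
To make the first artifact more concrete, the following is a minimal sketch of how triples might be laid out on a key-value backend. It is an illustration only, not the design this proposal commits to: plain Python dictionaries stand in for the collections of a NoSQL database, and the class name KeyValueTripleStore and its methods add and match are hypothetical. The sketch assumes three permutation indexes (subject-predicate-object, predicate-object-subject, object-subject-predicate), so that a pattern with at least one bound term can start from a key lookup instead of a full scan.

```python
from collections import defaultdict
from typing import Iterator, Optional, Tuple

Triple = Tuple[str, str, str]


class KeyValueTripleStore:
    """Toy triple store (hypothetical); each dict stands in for one key-value collection."""

    def __init__(self) -> None:
        # Three permutation indexes: SPO, POS and OSP.
        self.spo = defaultdict(lambda: defaultdict(set))
        self.pos = defaultdict(lambda: defaultdict(set))
        self.osp = defaultdict(lambda: defaultdict(set))

    def add(self, s: str, p: str, o: str) -> None:
        """Store one triple in all three indexes."""
        self.spo[s][p].add(o)
        self.pos[p][o].add(s)
        self.osp[o][s].add(p)

    def match(
        self,
        s: Optional[str] = None,
        p: Optional[str] = None,
        o: Optional[str] = None,
    ) -> Iterator[Triple]:
        """Yield triples matching the pattern; None acts as a wildcard."""
        if s is not None:
            # Subject bound: start from the SPO index.
            by_p = self.spo.get(s, {})
            preds = [p] if p is not None else list(by_p)
            for pred in preds:
                for obj in by_p.get(pred, ()):
                    if o is None or obj == o:
                        yield (s, pred, obj)
        elif p is not None:
            # Predicate bound, subject free: start from the POS index.
            by_o = self.pos.get(p, {})
            objs = [o] if o is not None else list(by_o)
            for obj in objs:
                for subj in by_o.get(obj, ()):
                    yield (subj, p, obj)
        elif o is not None:
            # Only the object is bound: start from the OSP index.
            for subj, preds in self.osp.get(o, {}).items():
                for pred in preds:
                    yield (subj, pred, o)
        else:
            # Nothing bound: fall back to a full scan.
            for subj, by_p in self.spo.items():
                for pred, objs in by_p.items():
                    for obj in objs:
                        yield (subj, pred, obj)
```

Storing every triple three times trades write cost and space for read speed. Whether that trade is appropriate depends on the NoSQL database selected during the first semester.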

Work Plan - Semester 1

17 Sep - 31 Oct
Review of the most recent approaches to this problem in the state of the art.
1 Nov - 30 Nov
Approach - Definition of the requirements and work plan.
1 Dec - 31 Dec
Selection of a NoSQL database and implementation of the backend.
1 Jan - 28 Jan
Writing and reviewing of the first semester report.

Work Plan - Semester 2

15 Feb - 8 Mar
Creation of a SPARQL interface for the triple store.
9 Mar - 31 Mar
Parallelization of query clauses.
1 Apr - 30 Apr
Parallelization of graph subsets.
1 May - 31 May
Benchmark of the triple store against state-of-the-art engines.
1 Jun - 28 Jun
Writing and reviewing of the dissertation.
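
The two parallelization steps in the plan above can be illustrated with a small sketch. It is a simplified stand-in under stated assumptions, not the approach the student is required to follow: the graph is assumed to be split into in-memory shards (lists of triples), a query is a list of triple patterns whose terms starting with "?" are variables, and the function names match_pattern, join and query are hypothetical. Each (clause, shard) pair runs as a separate task, so both the query clauses and the graph subsets are evaluated concurrently. The per-clause results are then combined with a simple nested-loop join.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[str, str, str]  # terms starting with "?" are variables
Binding = Dict[str, str]


def match_pattern(triples: List[Triple], pattern: Pattern) -> List[Binding]:
    """Scan one graph subset and return the variable bindings for one pattern."""
    results: List[Binding] = []
    for triple in triples:
        binding: Binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable repeated in the pattern must bind to a single value.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results


def join(left: List[Binding], right: List[Binding]) -> List[Binding]:
    """Nested-loop join: keep pairs of bindings that agree on shared variables."""
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[v] == r[v] for v in l.keys() & r.keys())
    ]


def query(shards: List[List[Triple]], patterns: List[Pattern]) -> List[Binding]:
    """Evaluate every clause on every shard concurrently, then join the clauses."""
    with ThreadPoolExecutor() as pool:
        # One task per (clause, shard) pair.
        futures = [
            [pool.submit(match_pattern, shard, pattern) for shard in shards]
            for pattern in patterns
        ]
        # The union over shards is valid because each triple lives in one shard.
        per_clause = [[b for f in fs for b in f.result()] for fs in futures]
    result: List[Binding] = [{}]
    for bindings in per_clause:
        result = join(result, bindings)
    return result
```

Threads are used here only to keep the sketch short. In CPython they mainly help when each task waits on I/O, such as a call to a database server. CPU-bound matching would need processes or a runtime without a global interpreter lock.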

Conditions

The proposed work plan will be carried out in the Software and Systems Engineering Group of CISUC, where the student will be given access to the required hardware.

This is not a paid internship.

Remarks

We are looking for students who understand the importance of the Semantic Web, are familiar with RDF and SPARQL, and have used a triple store before.
Students should also be familiar with concurrent and parallel programming.
NoSQL experience will also be an advantage in this project.

Supervisor

Alcides Fonseca
amaf@dei.uc.pt