Proposals without an assigned student - September 2014

DEI - FCTUC
Generated on 2024-05-02 13:24:51 (Europe/Lisbon).

Internship Title

Reconstructing Service Networks from Crunchbase.com

Areas of Specialization

Information Systems

Internship Location

DEI-FCTUC

Background

The vision statement “Locating Your Next Strategic Opportunity”, published by Harvard Business Review in March 2011, illustrates some of the innovations that service networks make possible. For example, service networks can be used to identify niche markets, which are fertile ground for exploring new ideas. Unfortunately, no attempts have been made to reconstruct such large-scale networks.

This project addresses this limitation by reconstructing large-scale service networks from the rich information available in web registries. For example, CrunchBase.com contains the largest corpus of structured data for technological industries and includes profiles for 7,249 service providers, 224,392 companies, 269,171 key people, and 14,519 financial organizations. Other examples of registries include ProgrammableWeb (PW), venturebeat.com, and the COMPUSTAT financial databases.

Objective

Develop a crawling engine to reconstruct service networks from web data sources. Obtaining a large, good-quality data set is one of the biggest challenges of this project. Fortunately, recent and still fairly unexplored initiatives, such as linked data, open (government) data, and crowdsourced data (e.g., Crunchbase), provide valuable new data sources that are accessible online, published in an open machine-readable format, and licensed to allow re-use.
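As a rough illustration of what the core loop of such a crawling engine could look like, the sketch below performs a breadth-first traversal over linked profiles. Everything in it is an assumption made for illustration: `FAKE_REGISTRY`, `fetch_profile`, and the `links` field are hypothetical stand-ins, and a real engine would replace `fetch_profile` with calls to whatever interface the chosen registry actually exposes.

```python
from collections import deque

# Hypothetical in-memory stand-in for a web registry (not real data):
# each profile lists the ids of the profiles it links to.
FAKE_REGISTRY = {
    "company/a": {"name": "A", "links": ["company/b", "person/x"]},
    "company/b": {"name": "B", "links": ["company/a", "investor/f"]},
    "person/x": {"name": "X", "links": []},
    "investor/f": {"name": "F", "links": ["company/c"]},
    "company/c": {"name": "C", "links": []},
}


def fetch_profile(profile_id):
    """Placeholder for a network call; returns None for unknown ids."""
    return FAKE_REGISTRY.get(profile_id)


def crawl(seed, fetch=fetch_profile, max_profiles=100):
    """Breadth-first crawl from `seed`, following each profile's links."""
    visited = {}            # profile id -> profile data, in visit order
    queue = deque([seed])   # frontier of ids still to fetch
    while queue and len(visited) < max_profiles:
        profile_id = queue.popleft()
        if profile_id in visited:
            continue        # already fetched via another link
        profile = fetch(profile_id)
        if profile is None:
            continue        # unknown or unavailable profile
        visited[profile_id] = profile
        for linked_id in profile["links"]:
            if linked_id not in visited:
                queue.append(linked_id)
    return visited


if __name__ == "__main__":
    for profile_id in crawl("company/a"):
        print(profile_id)
```

The `max_profiles` cap is one simple way to keep a crawl bounded while experimenting; passing `fetch` as a parameter lets the same loop be tested offline and later pointed at a real data source.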

We will follow a two-step process. First, we will use comprehensive web registries to create “skeleton” networks, a basic structure with minimal information on services and relationships. The goal is to bootstrap service networks using registries available under open licenses. Afterwards, we will query additional web data sources to extend and enrich this structure. This extension is needed since registries may not contain all the relevant services of an ecosystem.
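The two-step process can be sketched as follows, purely as an illustration under stated assumptions. The record layout (`id`, `name`, `related`, `attributes`) and the function names `build_skeleton` and `enrich` are hypothetical and not part of the proposal; the network is held in plain dictionaries rather than in any particular graph or Linked Data library.

```python
def build_skeleton(registry_records):
    """Step 1: create nodes with minimal information and their relationships."""
    network = {"nodes": {}, "edges": set()}
    for record in registry_records:
        # Keep only the minimal information for the skeleton: the name.
        network["nodes"].setdefault(record["id"], {})["name"] = record["name"]
        for target in record.get("related", []):
            # Make sure the target exists as a (possibly empty) node.
            network["nodes"].setdefault(target, {})
            network["edges"].add((record["id"], target))
    return network


def enrich(network, extra_records):
    """Step 2: extend the skeleton with services and attributes from other sources."""
    for record in extra_records:
        node = network["nodes"].setdefault(record["id"], {})
        for key, value in record.get("attributes", {}).items():
            # Assumption: values already taken from the registry win on conflict.
            node.setdefault(key, value)
        for target in record.get("related", []):
            network["nodes"].setdefault(target, {})
            network["edges"].add((record["id"], target))
    return network


if __name__ == "__main__":
    # Made-up records, for illustration only.
    registry = [
        {"id": "s1", "name": "Service One", "related": ["s2"]},
        {"id": "s2", "name": "Service Two"},
    ]
    extra = [
        {"id": "s2", "attributes": {"category": "payments", "name": "Other"}},
        {"id": "s3", "attributes": {"name": "Service Three"}, "related": ["s1"]},
    ]
    network = enrich(build_skeleton(registry), extra)
    print(sorted(network["nodes"]))
    print(sorted(network["edges"]))
```

Here the additional source contributes a service (`s3`) that the registry did not list, which is the situation the enrichment step is meant to cover. Letting registry values take precedence on conflict is only one possible merge policy.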

Work Plan - Semester 1

(a) Study the main technologies to be used by the project (e.g., Crunchbase.com, JSON, Web API, Linked USDL, Linked Data, OSSR, and RDFS) (September to December 2014).

(b) Develop the first prototype of a crawling engine to reconstruct service networks (Minimum Viable Product) (November 2014 to February 2015).

Work Plan - Semester 2

(c) Test the crawling engine with real data (March 2015).

(d) Develop the final system prototype of a crawling engine for service networks (March to June 2015).

(e) Documentation (ongoing task from beginning to end) (September 2014 to June 2015).

(f) Report writing and defense (April to June 2015).

Conditions

This work will be carried out at DEI/University of Coimbra.
Unpaid master's project.

Remarks

If the results of the project are complete, sound, and innovative, it will be possible to write a final research paper to be published in a scientific outlet.

Supervisor

Jorge Cardoso
jcardoso@dei.uc.pt