Internship Title
Building Microservices using Kubernetes and Docker
Areas of Specialization
Software Engineering
Intelligent Systems
Internship Location
DEI
Background
The consulting firm McKinsey estimates that organizations will face a shortage of the data scientists they need to explore the full potential of big data. By 2018, the United States alone will face a shortage of 140,000 to 190,000 professionals with strong analytical skills and the know-how to analyze big data and make effective decisions. This shortage will be more dramatic in Portugal: US universities (e.g., Berkeley and Carnegie Mellon University) have offered Data Science degrees for several years, whereas Portuguese universities are only taking their first steps.
This shortage of professionals cannot be mitigated easily, since training students to become data scientists requires time and resources to teach skills from diverse knowledge areas such as Computer Science, Statistics, Business, and Data Visualization.
Hence, the objective of the FCT DataScience4NP project is to explore the use of visual programming paradigms to enable non-programmers to join the Data Science workforce. More specifically, the project aims to build Cloud Native Applications (CNA) for Data Science using microservices.
Objective
This thesis will develop one of the components of the DataScience4NP platform. The main goal is to use the Kubernetes and Docker container technologies to provide machine-learning algorithms as a service, i.e., Analytics-as-a-Service (AaaS).
Docker will be used to run encapsulated application containers in a relatively isolated but lightweight operating environment. Each analytic service will be implemented as a container. Kubernetes, a container orchestration system originally developed by Google, will be used to manage the containerized applications. Together, Docker and Kubernetes will give non-programmer data scientists easy access to analytics expertise exposed as microservices.
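To illustrate what one containerized analytic service might look like, the following is a minimal sketch that assumes a JSON-over-HTTP interface. The `/predict` path, the request fields `centroids` and `sample`, and the function names are illustrative assumptions and not part of the DataScience4NP design. A small nearest-centroid rule stands in for a real algorithm, which in the thesis would come from a library such as Scikit-Learn.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def nearest_centroid_predict(centroids, sample):
    """Return the label whose centroid is closest to sample (squared Euclidean distance)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist(centroids[label], sample))


def handle_predict(body: bytes) -> bytes:
    """Decode a JSON request, run the algorithm, and encode a JSON response."""
    request = json.loads(body)
    label = nearest_centroid_predict(request["centroids"], request["sample"])
    return json.dumps({"label": label}).encode("utf-8")


class PredictHandler(BaseHTTPRequestHandler):
    """Hypothetical HTTP front end: POST /predict with a JSON body."""

    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = handle_predict(self.rfile.read(length))
        except (ValueError, KeyError, TypeError):
            self.send_error(400, "expected JSON with 'centroids' and 'sample'")
            return
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8080) -> None:
    """Blocking entry point; a container image would call this on start-up."""
    HTTPServer(("0.0.0.0", port), PredictHandler).serve_forever()
```

Keeping `handle_predict` separate from the HTTP handler means the algorithm can be tested without starting a server. Under these assumptions, a Docker image would run `serve()` as its entry point and Kubernetes would schedule and scale copies of that image.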
Specific Objectives:
• To encapsulate machine learning algorithms using microservices
• To model microservices using semantic service descriptions
• To develop a module to search for microservices using semantics
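The second and third objectives can be sketched, in a much simplified form, as subject-predicate-object statements about each service plus a lookup that matches them against constraints. The service identifiers and the `ex:` terms below are placeholders invented for illustration. They are not Linked USDL or LSS USDL vocabulary, and a real module would more likely query an RDF store.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical descriptions; the "ex:" terms are placeholders, not Linked USDL terms.
DESCRIPTIONS = [
    Triple("svc:kmeans", "ex:task", "ex:Clustering"),
    Triple("svc:kmeans", "ex:inputFormat", "ex:CSV"),
    Triple("svc:random-forest", "ex:task", "ex:Classification"),
    Triple("svc:random-forest", "ex:inputFormat", "ex:CSV"),
    Triple("svc:naive-bayes", "ex:task", "ex:Classification"),
    Triple("svc:naive-bayes", "ex:inputFormat", "ex:JSON"),
]


def find_services(triples, **constraints):
    """Return sorted service ids whose description satisfies every constraint.

    Each keyword argument names a predicate (without the "ex:" prefix) and
    gives the object value the service description must contain.
    """
    wanted = {(f"ex:{pred}", obj) for pred, obj in constraints.items()}
    by_service = {}
    for t in triples:
        by_service.setdefault(t.subject, set()).add((t.predicate, t.obj))
    return sorted(s for s, facts in by_service.items() if wanted <= facts)
```

For example, `find_services(DESCRIPTIONS, task="ex:Classification", inputFormat="ex:CSV")` selects only the services described as classifiers that accept CSV input.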
Technologies:
• Kubernetes and Docker
• Machine learning libraries such as Scikit-Learn
• Cloud platform (OpenStack, Amazon AWS, etc.)
• Linked USDL and LSS USDL
Work Plan - Semester 1
- Review of the state of the art in container technologies and machine learning algorithms
- Requirements analysis (including both functional and non-functional requirements)
- System architecture
- Writing of the preliminary thesis
Work Plan - Semester 2
- System development
- System testing
- Writing of the final thesis
Conditions
The student might receive a scholarship from the FCT DataScience4NP project (745€ / month).
Supervisors
1) Jorge Cardoso, 2) Rui Pedro Paiva, 3) Filipe Araújo
ruipedro@dei.uc.pt