Assigned 2022 2023

DEI - FCTUC
Generated on 2024-12-12 13:33:12 (Europe/Lisbon).

Internship Title

A Federated Operator For Kubernetes

Areas of Specialization

Software Engineering

Internship Location

DEI

Background

### Problem
With the massive adoption of containerization, Kubernetes, the container-centric orchestrator, has become the de facto standard for deploying and operating containerized applications. Kubernetes multi-cluster architectures are becoming more popular among SaaS providers because they increase service availability even under major disasters (such as a cloud provider outage), help meet customers' data residency demands, and make it possible to use provider-specific capabilities while avoiding vendor lock-in. However, multi-cluster setups add complexity to orchestrating the services that run on top of them, especially in a multi-cloud environment. Kubernetes typically relies on hardware metrics and limits such as CPU or RAM to trigger scale-in and scale-out events or to schedule a given service across the cluster nodes. It does not, however, account for important factors such as network congestion, multi-cluster network topology, or degradation of external cloud provider services. In a multi-cluster scenario, Kubernetes lacks the notion of a federated cluster when it comes to making integrated decisions that improve service performance, reliability, and availability. Well-defined service level objectives (SLOs) can help engineering teams in their operational processes, and they can also be used to trigger self-remediation actions, especially at the federated level, where cluster conditions may vary considerably (e.g., heterogeneous hardware).

### Motivation
Performing management tasks manually in a multi-cluster environment, such as monitoring the system and making decisions based on infrastructure metrics, is complex, expensive, and can lead to delayed responses. To fully adopt federated clusters that can be treated as a single entity, applications must be guaranteed to respect their SLOs. This calls for a control plane that provides such guarantees, removes the need for manual intervention, and improves the dependability of multi-cluster systems. Existing tools such as KubeFed and Cluster API already help manage multi-cluster environments. These APIs act as API gateways for every member cluster and manage federated deployments. So far, however, they are unable to adapt to the underlying cluster conditions, and they do not support intelligent control mechanisms with self-healing actions that react to changes and anticipate SLO violations. For example, if an application is about to exceed its response time SLO, a controller within the Federated Operator might launch the least expensive Kubernetes Node instances in one of the available clusters.
Furthermore, a Federated Operator that lives within the Kubernetes ecosystem is easier for developers to extend, because it integrates directly with the Kubernetes API that is shared across all clusters in a multi-cluster environment.
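
The response time example above can be illustrated with a minimal sketch. It is not part of the proposal: the `ClusterOffer` type, the `pick_scale_out_cluster` function, the 90% warning ratio, and the cluster names and prices are all hypothetical, and a real controller would obtain latency predictions and node prices from monitoring and cloud provider APIs.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ClusterOffer:
    """Hypothetical view of one member cluster as seen by a federated controller."""
    name: str
    node_hourly_cost: float  # assumed price of one extra node, in any consistent currency


def pick_scale_out_cluster(
    predicted_latency_ms: float,
    slo_latency_ms: float,
    offers: Sequence[ClusterOffer],
    warning_ratio: float = 0.9,  # assumption: act once 90% of the SLO budget is used
) -> Optional[ClusterOffer]:
    """Return the cheapest cluster to add a node to, or None if no action is needed."""
    # The SLO counts as "at risk" once the prediction reaches a fraction of the target.
    if predicted_latency_ms < warning_ratio * slo_latency_ms:
        return None
    # Nothing can be launched if no member cluster is available.
    if not offers:
        return None
    # Among the available clusters, prefer the one with the cheapest node.
    return min(offers, key=lambda offer: offer.node_hourly_cost)


if __name__ == "__main__":
    offers = [ClusterOffer("cluster-a", 0.12), ClusterOffer("cluster-b", 0.09)]
    print(pick_scale_out_cluster(190.0, 200.0, offers))
```

The decision is kept as a pure function so that it can be tested without a running cluster; actually launching a node would be a separate step performed through the member cluster's API.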

Objective

The ultimate goal of this work is an extensible Federated Operator that controls multi-cluster configurations driven by SLO specifications. Within the scope of this work, however, the aim is to build the Federated Operator itself and the infrastructure that will allow policies to be added later.
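
One way to picture "infrastructure that allows policies to be added later" is a small registry that the operator consults on every control loop iteration. The sketch below is an assumption made for illustration only: the `PolicyRegistry` class, the `Policy` signature, the metric name `p95_latency_ms`, and the action name `scale_out` are hypothetical and do not come from the proposal.

```python
from typing import Callable, Dict, List

# Assumed shape of a policy: observed metrics (name -> value) in, action names out.
Policy = Callable[[Dict[str, float]], List[str]]


class PolicyRegistry:
    """Holds named policies so new ones can be added without changing the operator core."""

    def __init__(self) -> None:
        self._policies: Dict[str, Policy] = {}

    def register(self, name: str, policy: Policy) -> None:
        # Reject duplicates so that one policy cannot silently replace another.
        if name in self._policies:
            raise ValueError(f"policy {name!r} is already registered")
        self._policies[name] = policy

    def evaluate(self, metrics: Dict[str, float]) -> List[str]:
        # Run every registered policy, in registration order, and collect its actions.
        actions: List[str] = []
        for policy in self._policies.values():
            actions.extend(policy(metrics))
        return actions


def latency_policy(metrics: Dict[str, float]) -> List[str]:
    """Example policy: request a scale-out when p95 latency exceeds 200 ms (assumed SLO)."""
    if metrics.get("p95_latency_ms", 0.0) > 200.0:
        return ["scale_out"]
    return []


if __name__ == "__main__":
    registry = PolicyRegistry()
    registry.register("latency", latency_policy)
    print(registry.evaluate({"p95_latency_ms": 250.0}))
```

With this shape, the "simple policy" planned for the second semester would be one more function passed to `register`, leaving the operator's control loop unchanged.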

Work Plan - Semester 1

- Learn Kubernetes and study the current state of the art (3 months).
- Define system requirements and architecture (1 month).
- Write intermediate report (1 month).

Work Plan - Semester 2

- Implement the Federated Operator (3 months).
- Perform tests (1 month).
- Define a simple policy for the Federated Operator (1 month).
- Write final report (1 month).

Conditions

The work should take place at the Centre for Informatics and Systems of the University of Coimbra (CISUC) in the Software and Systems Engineering Group at the Department of Informatics Engineering of the University of Coimbra.

Supervisors

Filipe Araújo and Miguel Guerreiro
filipius@uc.pt