Kubernetes
Kubernetes is a mature, open source platform for managing containerized workloads. It provides a unified abstraction layer over infrastructure details and allows containers to be deployed and managed easily while providing all the features needed to run them in an enterprise setting.
VCPS uses Kubernetes as the base for all services it offers. When deploying a new managed service, VCPS creates a dedicated, customer- and service-specific Kubernetes cluster that contains the desired service once deployment is complete.
This document aims to provide an overview of the most important components that constitute a Kubernetes deployment and how they work together to form a functional cluster.
Motivation
To understand why Kubernetes was created, let's take a look at how workload deployment has evolved (source).
Traditional deployments provision physical infrastructure to run a given workload. In this model, any time resource requirements change, new physical hardware has to be acquired, prepared and set up. This is generally a comparatively slow process. Workloads running on that infrastructure can only be moderately well isolated by using operating system facilities like separate user accounts. If full separation is desired, different physical servers have to be used, which incurs additional hardware and software costs.
Virtualized deployments improved on traditional deployments by switching to virtual machines for running workloads. In this model the physical infrastructure runs a hypervisor that acts as a resource multiplexer for fully isolated virtual machines. Each virtual machine can be allocated a specific slice of the server's physical resources (in the form of virtual CPU cores and RAM). Workloads running in different virtual machines cannot interfere with each other (barring any security issues in the hypervisor). Setup speed and resource utilization are also improved, as creating a new virtual machine is generally fast and multiple virtual machines run on the same physical infrastructure. It is not ideal though, as each virtual machine has to run its own operating system and provide all necessary runtime dependencies for its workloads.
Container deployments optimize the virtualized model further. Instead of a hypervisor, they use a container runtime. This runtime uses operating system facilities to run lightweight containers. Those containers are isolated from each other like virtual machines but don't run a virtualized operating system. Instead they only provide the runtime environment needed by the containerized workload. This improves utilization of physical resources even further compared to virtualized deployments while maintaining most of their advantages.
Managing separate physical infrastructure for running containers manually is daunting, error prone and impractical beyond a certain number of workloads and physical machines. Kubernetes is built to abstract and automate away as many details as possible of running containerized workloads on a cluster made up of physical (or virtual) infrastructure. A high-level overview of its most important features can be found here.
Components
A Kubernetes cluster is made up of several different components that work together to provide its functionality. The central element is the Kubernetes API provided by the control plane. It is the entrypoint for cluster users and worker nodes alike. Worker nodes (or just "nodes") use it, among other things, to determine which containers they should run.
The following image (source) illustrates the elements that constitute a standard Kubernetes cluster:
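The basic unit that flows through these components is the pod, a group of one or more containers that is described declaratively and submitted to the Kubernetes API. As a point of reference for the rest of this document, here is a minimal sketch of such a manifest (all names and the image are placeholders):

```yaml
# Minimal pod manifest: submitted to the Kubernetes API (e.g. via kubectl apply),
# scheduled by the control plane and finally run by a node's kubelet.
apiVersion: v1
kind: Pod
metadata:
  name: example-pod        # placeholder name
spec:
  containers:
    - name: web            # placeholder container name
      image: nginx:1.27    # placeholder image
      ports:
        - containerPort: 80
```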
Control Plane
The control plane provides the Kubernetes API and performs tasks like scheduling containers to specific nodes depending on user-provided policies and resource constraints. Additionally, it can interact with the cloud provider's API to integrate Kubernetes into the given cloud environment. This integration is optional and only relevant when Kubernetes actually runs in a cloud; for on-prem Kubernetes deployments it is left out.
There are different options for how and where to run the control plane components. They can be run on physical or virtual machines separate from the nodes. They can be run on the same physical or virtual machines as the node components. There are even open source projects that allow running a Kubernetes control plane as a workload in another Kubernetes cluster.
Generally the control plane is deployed in a high availability (HA) setup. This is important because changes to the cluster or its workloads are impossible while the Kubernetes API is unavailable. To achieve high availability, all control plane components can be scaled horizontally. As long as multiple instances of each control plane component are running, the failure of a certain number of instances can be tolerated without impacting functionality.
The following components are part of the control plane (additional details for each can be found here):
- kube-apiserver exposes the Kubernetes API.
- etcd provides a highly available key-value store that holds all cluster-related data.
- kube-scheduler decides where to place workloads within the cluster.
- kube-controller-manager runs controller processes that respond to certain changes in the cluster or environment.
- cloud-controller-manager integrates with the cloud provider API (if applicable).
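To illustrate how these components divide the work: a user submits a pod manifest to kube-apiserver, which persists it in etcd; kube-scheduler then picks a suitable node, and that node's kubelet starts the containers. The scheduling decision ends up in the pod's spec.nodeName field. The sketch below shows what a pod might look like after scheduling; the node name is a placeholder.

```yaml
# Pod after scheduling: kube-scheduler has bound it to a node, which is
# reflected in spec.nodeName. Setting nodeName manually in a manifest
# would bypass the scheduler entirely.
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  nodeName: worker-node-01   # placeholder; normally filled in by kube-scheduler
  containers:
    - name: web
      image: nginx:1.27      # placeholder image
```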
Node
The (worker) nodes are responsible for running the actual containerized workloads. To achieve this, they communicate with the Kubernetes API to retrieve the list of containers (more precisely, pods) that should run on the given node. A Kubernetes cluster can scale from a few dozen to many hundreds of nodes. It is also possible to have nodes with many different resource configurations to serve different workloads. There might be cheap nodes with a small amount of CPU and RAM for undemanding workloads and more expensive nodes with faster CPUs or even GPU resources for resource-intensive workloads like machine learning. The control plane then takes care of assigning workloads to the nodes best suited to their requirements.
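As a sketch of how a workload expresses such requirements, the pod below requests a specific amount of CPU and memory and restricts itself to a certain class of nodes via a node label. The label key/value and the image are assumptions and depend on how the cluster's nodes are actually labeled.

```yaml
# Pod that states its resource needs and asks for a specific class of node.
# The scheduler only places it on nodes that carry the matching label and
# have enough free capacity to satisfy the requests.
apiVersion: v1
kind: Pod
metadata:
  name: training-job                      # placeholder name
spec:
  nodeSelector:
    node-class: gpu                       # assumed label; must match the node labels in use
  containers:
    - name: trainer
      image: example.com/trainer:latest   # placeholder image
      resources:
        requests:
          cpu: "4"
          memory: 16Gi
        limits:
          cpu: "8"
          memory: 32Gi
```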
Nodes run the following components:
- kubelet manages the containers that the control plane has determined should run on the node, e.g. starting and stopping them.
- Container runtime is responsible for actually running the containers on the operating system. In early versions of Kubernetes, Docker was used for this, but it has since been replaced by alternatives like containerd.
- kube-proxy provides networking functionality to implement Kubernetes services.
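How these pieces are wired together is largely driven by the kubelet's configuration file. The fragment below is a hedged sketch of such a KubeletConfiguration; the exact set of fields (in particular the container runtime endpoint) varies between Kubernetes versions, and all values shown are assumptions to be checked against the kubelet configuration reference.

```yaml
# Sketch of a kubelet configuration file (field availability varies by version).
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Where the kubelet reaches the container runtime (containerd in this example).
containerRuntimeEndpoint: unix:///run/containerd/containerd.sock
# Cluster DNS settings handed to pods (see the Addons section below).
clusterDNS:
  - 10.96.0.10              # assumed service IP of the cluster DNS addon
clusterDomain: cluster.local
cgroupDriver: systemd
```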
Addons
The base functionality of Kubernetes can be extended in different ways by installing additional cluster components. One of the more important addons is cluster DNS. This is installed in VCPS clusters and runs an additional internal DNS server that pods use to resolve the internal domains of other pods and services. Among other things, this can serve as a service discovery mechanism within the cluster.
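As a sketch of what this looks like in practice: a service named my-service in the namespace my-namespace (both placeholder names) becomes resolvable inside the cluster under a predictable DNS name, assuming the default cluster domain cluster.local.

```yaml
# Service that the cluster DNS addon makes discoverable by name.
# Pods can reach it at my-service.my-namespace.svc.cluster.local
# (or simply as my-service from within the same namespace).
apiVersion: v1
kind: Service
metadata:
  name: my-service          # placeholder name
  namespace: my-namespace   # placeholder namespace
spec:
  selector:
    app: my-app             # placeholder label selecting the backing pods
  ports:
    - port: 80
      targetPort: 8080
```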
More addons that are commonly installed in a Kubernetes cluster can be found here.
Networking Architecture
Providing networking functionality within a Kubernetes cluster is a complex task that consists of four main areas of concern:
- Enabling network connectivity between containers of the same pod. Kubernetes achieves this by running all containers of a pod in the same network namespace, which allows them to reach each other's open container ports on localhost.
- Allowing network traffic between pods of the cluster. This is the responsibility of the container runtime on the node, which uses one of the available Container Network Interface (CNI) implementations. There is a huge selection of CNI implementations available; VCPS base clusters use Cilium.
- Supporting pod-to-service communication. This is implemented by services using kube-proxy on the node.
- Providing access to cluster services from external sources. This is also provided by the services implementation; see the sketch after this list.
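The sketch below ties the first, third and fourth area together: two containers in one pod talk to each other over localhost, in-cluster clients reach the pod through a service (realized by kube-proxy), and external clients reach the same service through a load balancer. All names, images, ports and the service type are placeholders/assumptions.

```yaml
# Area 1: both containers share the pod's network namespace and can reach
# each other on localhost (the app on :8080, the sidecar on :9090).
apiVersion: v1
kind: Pod
metadata:
  name: web-with-sidecar                  # placeholder name
  labels:
    app: web                              # placeholder label, referenced by the service below
spec:
  containers:
    - name: app
      image: example.com/app:latest       # placeholder image
      ports:
        - containerPort: 8080
    - name: metrics-sidecar
      image: example.com/sidecar:latest   # placeholder image
      ports:
        - containerPort: 9090
---
# Areas 3 and 4: the service gives the pod a stable virtual IP inside the
# cluster (implemented by kube-proxy); type LoadBalancer additionally exposes
# it externally via the cloud provider integration (type NodePort would be a
# typical fallback for on-prem clusters).
apiVersion: v1
kind: Service
metadata:
  name: web                               # placeholder name
spec:
  type: LoadBalancer
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 8080
```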
An in-depth look at the Kubernetes network model can be found here.