Make sure to check out the Glossary before continuing.
KubeVirt consists of a set of services:
```
            |
Cluster     |  (virt-controller)
            |
------------+---------------------------------------
            |
Kubernetes  |  (VM CRD)
            |
------------+---------------------------------------
            |
DaemonSet   |  (virt-handler)        (vm-pod M)
            |

M:   Managed by KubeVirt
CRD: Custom Resource Definition
```
The following diagram illustrates the communication flow between several (not all) components present in KubeVirt. In general, the communication pattern can be considered a choreography, where all components act on their own to realize the state provided by the VM objects.
```
Client                     K8s API     VM CRD  Virt Controller         VM Handler
-------------------------- ----------- ------- ----------------------- ----------

                           listen <----------- WATCH /virtualmachines
                           listen <----------------------------------- WATCH /virtualmachines
                             |                   |                       |
POST /virtualmachines ---> validate              |                       |
                           create ---> VM ---> observe --------------> observe
                             |          |        v                       v
                           validate <--------- POST /pods            defineVM
                           create       |        |                       |
                             |          |        |                       |
                           schedPod ---------> observe                   |
                             |          |        v                       |
                           validate <--------- PUT /virtualmachines     |
                           update ---> VM ---------------------------> observe
                             |          |        |                    launchVM
                             |          |        |                       |
                             :          :        :                       :
                             |          |        |                       |
DELETE /virtualmachines -> validate              |                       |
                           delete ---> * ---------------------------> observe
                             |                   |                   shutdownVM
                             |                   |                       |
                             :                   :                       :
```
Disclaimer: The diagram above is not completely accurate, because temporary workarounds are in place to avoid bugs and to address other issues.
1. A client posts a new VM definition to the K8s API Server.
2. The K8s API Server validates the input and creates a `VM` custom resource definition (CRD) object.
3. The `virt-controller` observes the creation of the new `VM` object and creates a corresponding pod.
4. Kubernetes schedules the pod on a host.
5. The `virt-controller` observes that a pod for the `VM` got started and updates the `nodeName` field in the `VM` object. Now that the `nodeName` is set, the responsibility transitions to the `virt-handler` on that node for any further action.
6. The `virt-handler` (DaemonSet) observes that a `VM` got assigned to the host it is running on.
7. The `virt-handler` uses the VM specification to signal the creation of the corresponding domain through a `libvirtd` instance in the VM's pod.
8. A client deletes the `VM` object through the `virt-api-server`.
9. The `virt-handler` observes the deletion and turns off the domain.
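The controller side of this choreography can be sketched as a reconcile-style function. The following Python sketch is illustrative only; the names (`reconcile_vm`, the `virt-launcher-` pod naming, the field names) are assumptions, not the real KubeVirt API:

```python
def reconcile_vm(vm, pods):
    """One virt-controller pass over a single VM object (illustrative)."""
    pod_name = "virt-launcher-" + vm["name"]
    pod = pods.get(pod_name)
    if vm.get("deleted"):
        # The VM object is gone -> the associated pod has to go as well.
        return ("delete-pod", pod_name) if pod else ("noop", None)
    if pod is None:
        # A new VM object was observed -> create a pod for it.
        return ("create-pod", pod_name)
    if pod.get("node") and not vm.get("nodeName"):
        # The pod was scheduled -> record the node on the VM object,
        # handing responsibility over to the virt-handler on that host.
        return ("set-nodeName", pod["node"])
    return ("noop", None)

# Walk one VM through the flow:
pods = {}
vm = {"name": "testvm"}
action, arg = reconcile_vm(vm, pods)   # ("create-pod", "virt-launcher-testvm")
pods[arg] = {"node": "node01"}         # Kubernetes schedules the pod
action, arg = reconcile_vm(vm, pods)   # ("set-nodeName", "node01")
```

Each component runs such a loop independently against the shared API server state; no component calls another directly.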
The `virt-api-server` is an HTTP API server which serves as the entry point for all virtualization-related flows. It takes care of keeping the virtualization-related custom resource definitions (see below) up to date. As the main entry point to KubeVirt, it is responsible for defaulting and validation of the provided VM CRDs.
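As a rough sketch of what defaulting and validation mean here — the defaults and field names below are made up for illustration and are not KubeVirt's actual rules:

```python
# Hypothetical defaults; the real defaulting rules live in the API server.
DEFAULTS = {"vcpus": 1, "memoryMiB": 512}

def default_and_validate(spec):
    """Apply defaults, then reject obviously invalid VM definitions."""
    merged = {**DEFAULTS, **spec}
    if not merged.get("name"):
        raise ValueError("VM definition needs a name")
    if merged["vcpus"] < 1:
        raise ValueError("a VM needs at least one vCPU")
    return merged

print(default_and_validate({"name": "testvm"}))
# {'vcpus': 1, 'memoryMiB': 512, 'name': 'testvm'}
```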
VM definitions are kept as custom resource definitions inside the Kubernetes API server. The VM definition defines all properties of the virtual machine itself, for example:
- Machine type
- CPU type
- Amount of RAM and vCPUs
- Number and type of NICs
- …
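A minimal sketch of such a definition as a Python dataclass; the field names and defaults are illustrative, not the actual CRD schema:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class NIC:
    # Illustrative NIC properties.
    model: str = "virtio"
    network: str = "default"

@dataclass
class VMDefinition:
    name: str
    machine_type: str = "q35"      # machine type
    cpu_model: str = "host-model"  # CPU type
    vcpus: int = 1                 # number of vCPUs
    memory_mib: int = 512          # amount of RAM
    nics: List[NIC] = field(default_factory=lambda: [NIC()])

vm = VMDefinition(name="testvm", vcpus=2, nics=[NIC(), NIC(network="storage-net")])
```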
From a high-level perspective, the `virt-controller` provides all the cluster-wide virtualization functionality. This controller is responsible for monitoring the VM (CRD) objects and managing the associated pods. Currently the controller makes sure to create and manage the life-cycle of the pods associated with the VM objects.
A VM object will always be associated with a pod during its life-time; however, due to e.g. migration of a VM, the pod instance might change over time.
For every VM object one pod is created. This pod's primary container runs the `virt-launcher` KubeVirt component.
Neither Kubernetes nor the kubelet runs the VMs directly. Instead, a daemon on every host in the cluster takes care of launching a VM process for every pod associated with a VM object whenever that pod gets scheduled onto the host.
The main purpose of the `virt-launcher` pod is to provide the cgroups and namespaces which will be used to host the VM process.
`virt-handler` signals `virt-launcher` to start a VM by passing the VM's CRD object to `virt-launcher`. `virt-launcher` then uses a local libvirtd instance within its container to start the VM. From there, `virt-launcher` monitors the VM process and terminates once the VM has exited.
If the Kubernetes runtime attempts to shut down the `virt-launcher` pod before the VM has exited, `virt-launcher` forwards signals from Kubernetes to the VM process and attempts to hold off the termination of the pod until the VM has shut down successfully.
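The signal-forwarding part can be sketched in a few lines of Python. This is a simplified stand-in: the real `virt-launcher` is more involved, e.g. it drives a graceful guest shutdown through libvirt rather than merely relaying the signal:

```python
import signal
import subprocess
import sys

def supervise(proc):
    """Forward termination signals to the child process and hold the
    pod open until the child (the VM process, in the real case) exits."""
    def forward(signum, _frame):
        proc.send_signal(signum)  # relay the signal instead of dying
    signal.signal(signal.SIGTERM, forward)
    signal.signal(signal.SIGINT, forward)
    return proc.wait()  # only return, letting the pod terminate, once done

if __name__ == "__main__":
    # Stand-in for the VM process: a child that sleeps briefly.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(0.2)"])
    print("child exited with code", supervise(child))
```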
Every host needs a single instance of `virt-handler`. It can be delivered as a DaemonSet.
Like the `virt-controller`, the `virt-handler` is reactive: it watches for changes of the VM object and, once a change is detected, performs all necessary operations to bring the VM to the required state. This behavior is similar to the choreography between the Kubernetes API Server and the kubelet.
The main areas which `virt-handler` has to cover are:
- Keep a cluster-level VM spec in sync with a corresponding libvirt domain.
- Report domain state and spec changes to the cluster.
- Invoke node-centric plugins which can fulfill networking and storage requirements defined in VM specs.
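The first two areas amount to a per-VM sync decision. A minimal Python sketch, with illustrative names and simplified state (the real component acts through virt-launcher and libvirt):

```python
def sync(vm, domain):
    """Decide what virt-handler needs to do to make the local libvirt
    domain match the cluster-level VM spec (illustrative only)."""
    if vm is None:
        # VM object was deleted or moved away -> turn the domain off.
        return "shutdown-domain" if domain is not None else "noop"
    if domain is None:
        # Newly assigned VM -> define and launch a matching domain.
        return "define-and-launch"
    if vm["spec"] != domain["spec"]:
        # Cluster-level spec changed -> update the running domain.
        return "update-domain"
    return "in-sync"
```

The reverse direction, reporting domain state back to the cluster, would run alongside this loop.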
An instance of `libvirtd` is present in every VM pod. `virt-launcher` uses `libvirtd` to manage the life-cycle of the VM process.
The components above are essential to deliver core virtualization functionality in your cluster. However, fully featured virtual machines require more than plain virtualization functionality: they also need reliable storage and networking to be fully usable. The components below provide this additional functionality wherever it is not already provided by Kubernetes itself.
We will try to leverage as much of Kubernetes as possible with regard to mounting and preparing images for VMs. However, `virt-handler` may provide a plugin mechanism to allow storage mounting and setup from the host, if the KubeVirt requirements do not fit into the Kubernetes storage scenarios.
Since host-side preparation of storage may not be enough, a cluster-wide Storage Controller can be used to prepare storage. Investigations are still in progress. Such a controller will not be part of KubeVirt itself; however, KubeVirt might define a Storage CRD alongside a flow description which would allow such a controller to integrate seamlessly with KubeVirt.
We will try to leverage as much of the Kubernetes networking plugin mechanisms (e.g. CNI) as possible. However, `virt-handler` may provide a plugin mechanism to allow network setup on a host, if the KubeVirt requirements do not fit into the Kubernetes networking scenarios.
Since host-side preparation of network interfaces may not be enough, a cluster-wide Network Controller can be used to prepare the network. Investigations are still in progress. Such a controller will not be part of KubeVirt itself; however, KubeVirt might define a Networking CRD alongside a flow description which would allow such a controller to integrate seamlessly with KubeVirt.