# What is Kubernetes?

Kubernetes (abbreviated k8s) is a collection of different software
components for managing containers and the infrastructure around
them. These components talk to each other using the kubernetes APIs,
and each is made for a specific task in managing your cluster, be it
container scheduling, storage, networking, ssl-certificates, etc.

Each type of component may be provided by different software.
Collections of these types of software are often provided together as
a *distribution* of kubernetes.

Some distributions have more components, and others may be more
minimalistic. Some are self-hostable, like
[openshift](https://www.redhat.com/en/technologies/cloud-computing/openshift),
[k3s](https://k3s.io) or [k0s](https://k0sproject.io), while others
might be cloud-only, like [Google's GKE](https://cloud.google.com/kubernetes-engine/)
or [Linode Kubernetes Engine](https://www.linode.com/products/kubernetes/).

The self-hosted ones usually allow you to disable included components
or install third-party components to fit your needs, while cloud-only
ones might limit you to only their provided components.
# The components

## Kubernetes control plane

This is the main brain of kubernetes: it provides the kubernetes API
endpoint which the other components use to talk to each other.

This is usually the one component of your kubernetes distribution that
can't be changed.

In kubernetes, we separate our machines into controller/master and
worker/agent nodes, where only the master nodes run the control
plane, and the worker nodes only run workloads assigned to them by
the master nodes. Note that many self-hosted distributions also let
you run a worker node as part of your master node, so it doesn't have
to be on a separate machine.

In cloud-based kubernetes, the hosting provider manages the nodes for
you, so you will likely not interact with them at all and won't know
how many there are or how they are distributed.

Your cluster can have a single master node running the control plane,
or several. Multiple master nodes give your cluster resilience
against downtime: if your only master node goes down, the cluster can
no longer schedule work on any of the worker nodes. If you run
multiple master nodes, changes are synced between them, so that if
one goes down, the others take over and your cluster keeps working.
## Control plane datastore

The control plane from the previous section needs to store data about
the configuration somewhere, and that somewhere is the control plane
datastore.

The datastore can be anything from a simple database to an advanced
clustered, highly available database setup for resiliency. Most
distributions will come with a sensible default, but you can use your
own if you already have an external database, like mysql or postgres.
Note that which datastores you can use will depend on what your
distribution supports; refer to your distribution's documentation for
more information.

Distributions usually ship with [etcd](https://etcd.io/), a clustered
database for high availability, or sqlite for single-node setups.
## Kubelet

A [kubelet](https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/)
is the component responsible for managing containers on each node.

If you are coming from docker, you can think of the kubelet almost
like docker-compose, except that instead of reading a file, it reads
configuration from the control plane and creates containers based on
that.
## Container runtime

This is the system daemon responsible for actually running your
containers. Most distributions now use containerd, but some allow you
to use dockerd instead if you have a special use-case.

Usually you would stick with containerd.
## Container Network Interface (CNI)

Unlike docker, which contains its own networking tools, kubernetes
offloads networking to external tools, which means you can choose the
one that fits your needs.

Each distribution will come with a default provider, usually with
options to disable it if you'd like to install your own.

The CNI is responsible for connecting your containers' networks to
each other and to the outside world.

Examples of CNIs are:

- [flannel](https://github.com/flannel-io/flannel) (the default for k3s)
- [kube-router](https://www.kube-router.io/) (the default for k0s)
- [Calico](https://projectcalico.docs.tigera.io/about/about-calico)
- [Cilium](https://docs.cilium.io/en/v1.9/)
- [Weave](https://www.weave.works/docs/net/latest/overview/)
## Container Storage Interface (CSI) drivers

The CSI driver is responsible for managing stateful storage for your
containers.

Its job is to abstract away where the data is, so that your container
just needs to worry about its own volumes, instead of what medium
they are physically stored on.

Implementations can range from dead-simple drivers which just map a
volume to a folder on local storage, to ones serving the data over
the network and automatically setting up redundancy and backups.

You can have multiple storage drivers installed at the same time, and
choose between them when creating your workloads.

Some examples of storage providers are:

- k3s [local-path](https://github.com/rancher/local-path-provisioner) driver (the dead-simple default provider in k3s)
- [openebs](https://openebs.io/) (can support multiple storage backends, like nfs for network storage, or zfs for interfacing with a host zfs system)
- rancher [longhorn](https://rancher.com/products/longhorn) (supports redundancy and backups)
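Workloads ask a storage driver for a volume through a
`PersistentVolumeClaim` object, which names the storage class (and
thus the driver) and the size wanted. A minimal sketch, assuming the
k3s `local-path` storage class from the list above (the claim name is
a placeholder):

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: my-data            # hypothetical claim name
      namespace: default
    spec:
      accessModes:
        - ReadWriteOnce        # one node may mount the volume read-write
      storageClassName: local-path   # selects which storage driver provisions it
      resources:
        requests:
          storage: 1Gi

A pod can then mount this claim as a volume; the driver takes care of
where the data actually lives.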
## Network Load Balancers

While CNIs will manage most of your networking needs automatically,
sometimes you'd like to better control how outside traffic enters
your cluster.

Enter the Load Balancer.

It can either be cluster-internal, or external as a separate machine.

Its job is to route traffic coming from the outside into your cluster.

Most load balancers route traffic by assigning an IP to each service,
and routing traffic through that IP to the correct pod based on
criteria such as load, availability, etc.

However, some load balancers, such as k3s's KlipperLB or MetalLB, let
services share IPs, so you can re-use them if you have a limited set
of public IPs.

Some examples of Load Balancers:

- [KlipperLB](https://github.com/k3s-io/klipper-lb) (default in k3s)
- [MetalLB](https://metallb.org/)
- [Kube-vip](https://kube-vip.io/)
- [Kube-router](https://www.kube-router.io/) (an example of a CNI which also does load balancing)
## Cluster DNS

A cluster needs its own internal dns provider for the containers to
talk to each other with, as their IPs may change as they get moved
around.

A common example is [coredns](https://coredns.io) (default in k3s and k0s).
## Extra components

While the above components will give you a fully functional kubernetes
cluster, there are a few extra components which can be very nice to
have.
### Ingress controller

If you serve a lot of services, you might want all of them to be
reachable from a single IP:port combination.

This is where an ingress controller comes in.

It's essentially a reverse-proxy which you can configure via the
control plane.

It can also handle http middleware such as authentication, IP
blocklists, real-ip forwarding, etc. Some ingresses (like nginx) also
support proxying arbitrary tcp/udp traffic, like a load balancer.

Some examples of ingress controllers are:

- [Traefik](https://doc.traefik.io/traefik/providers/kubernetes-ingress/) (default in k3s)
- [Nginx](https://kubernetes.github.io/ingress-nginx/)
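An ingress controller is configured through `Ingress` objects. A
minimal sketch routing a hostname to a service (`example.com` and
`my-webapp` are placeholders):

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: my-webapp          # hypothetical name
      namespace: default
    spec:
      rules:
        - host: example.com    # requests for this hostname...
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: my-webapp   # ...get proxied to this service
                    port:
                      number: 80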
### SSL Certificate manager

You can also use kubernetes to manage, acquire, distribute and renew
ssl certificates.

This lets you define certificates via configuration in the control
plane, instead of having to deal with them as files.

A common example of this is [cert-manager](https://cert-manager.io).
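With cert-manager, for instance, a certificate is requested with a
`Certificate` object. A rough sketch, assuming an issuer named
`my-issuer` has already been set up (all names here are placeholders):

    apiVersion: cert-manager.io/v1
    kind: Certificate
    metadata:
      name: my-webapp-tls        # hypothetical name
      namespace: default
    spec:
      secretName: my-webapp-tls  # the Secret the signed certificate is stored in
      dnsNames:
        - example.com
      issuerRef:
        name: my-issuer          # assumed to exist, e.g. a Let's Encrypt issuer
        kind: ClusterIssuer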
### Container registry

If you want to build your own private container images and run them in
your cluster, you'll probably want to set up a container registry.

A common example is ["Docker Registry"](https://hub.docker.com/_/registry/).
### A management gui

While using kubectl from the terminal to manage your cluster can be a
perfectly workable solution, it can be nice to have a graphical ui for
management.

Some distributions like openshift, and most cloud-based distributions,
already come with this out of the box, but you can also install your
own.

Some examples are:

- [rancher](https://rancher.com/) (web based)
- [kubernetes-dashboard](https://github.com/kubernetes/dashboard) (web based)
- [lens](https://k8slens.dev/) (desktop app)
- [k9s](https://k9scli.io/) (terminal-ui)
### Logging/Metrics system

While the control plane lets you view logs from each pod, a good
logging system can make it easier to discover and diagnose issues in
your cluster.

Some examples are:

- [Prometheus](https://prometheus.io/)
- [Grafana Loki](https://grafana.com/oss/loki/)
- [Graylog](https://www.graylog.org/)
- [ELK](https://www.elastic.co/what-is/elk-stack)
- [Splunk](https://www.splunk.com/)
# Kubernetes configuration

As described in the section about the control plane component,
kubernetes uses configuration stored in the control plane to tell the
components what to do.

This configuration comes in the form of different kinds of
data-objects. Kubernetes ships with a large list of built-in kinds of
objects, but components can also add their own; for example, a storage
component would add a "storage volume" kind, or an ingress controller
an "ingress-config" kind.

These configuration objects are usually represented as yaml, but can
also be in other formats like json, depending on the application.

Common properties found on all config objects are the `apiVersion`
and `kind` fields; these tell kubernetes what kind the object is, and
therefore how to parse the rest of the object.
## The structure of a k8s object

A simple k8s object defined in yaml will usually look like this:

    apiVersion: v1
    kind: <kind>
    metadata:
      name: <object name>
      namespace: <namespace>
      annotations:
        <annotations>
      labels:
        <labels>
    spec:
      <object data>

The `apiVersion` and `kind` fields are always present, and tell k8s
what kind of object this is, which determines what its `spec` should
look like; the `spec` contains the actual configuration itself.

The `metadata` section contains metadata about the object, such as its
name and namespace. A few kinds of objects can exist without a
namespace and are cluster-wide, but in general all objects will exist
in a namespace.

The `annotations` and `labels` sections in `metadata` are optional,
and are used by other k8s components to modify how the object should
be handled. This could be things like setting an annotation specifying
that a service should only be reachable from a certain range of IPs,
or setting a label to identify a group of pods as part of the same
application, so they can all be referenced together.
## The built in object kinds

The built-in kubernetes objects will usually be found in `apiVersion:
v1` or `apiVersion: apps/v1`, while other components will usually
version their objects prefixed with their name, like `apiVersion:
cert-manager.io/v1` for cert-manager objects.
### Namespace

A namespace is a config object for managing other config objects: it
groups other config objects together, so they are easier to reason
about.

When you first create a new cluster, you will most likely have 1 or 2
namespaces to start with: `default` and `kube-system`. If you are
using a cloud-based cluster provider, you might not have access to the
`kube-system` namespace, as it is used for kubernetes components, and
not your own services.
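A namespace is one of the few cluster-wide object kinds, so its
definition has no `namespace` field of its own. Creating one is as
simple as (the name `my-app` is a placeholder):

    apiVersion: v1
    kind: Namespace
    metadata:
      name: my-app   # hypothetical namespace name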
### Pod

A pod is the most basic container building block in k8s. A pod
represents a set of containers to run together, as well as the
resources bound to them, like volume mounts, config mounts,
environment variables and open ports.

When a pod is added to your cluster, the control plane will find an
available node to schedule it on; the kubelet on that node will then
create the containers the pod needs, configure them and start them up.

If you are familiar with docker, a pod is similar to the `services`
section in a docker-compose file.
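As a sketch, a pod running a single web container might look like
this (names and image are placeholders):

    apiVersion: v1
    kind: Pod
    metadata:
      name: my-webapp
      namespace: default
      labels:
        app: my-webapp       # lets other objects select this pod
    spec:
      containers:
        - name: web
          image: nginx:1.25  # any container image works here
          ports:
            - containerPort: 80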
### Service

Instead of exposing ports in pods directly, k8s uses a separate kind
of object called a service to define exposed ports for your pods.

A service creates an IP address, either internal to your cluster or
external depending on the type of service, and a port.

Services are tied to pods through the `selector` field in the
service, which tells the service which pods to bind to.
This means that a service can bind the same port to multiple pods,
and let k8s route the traffic to the desired one.

In this way, you can deploy multiple replicas of the same pod to
different nodes, and point a single service at all of them. For
example, you could deploy the same web app to all your nodes, all
responding on port 3000, and then make a service binding port 3000 of
the service to all those pods. You can then reach the web app through
the service IP and port, instead of having to choose one of the pods
manually.

There are different types of service objects, for different methods
of exposing the underlying service.
#### type: ClusterIP

This is the default type of service. It exposes the underlying port
on a cluster-internal IP, which means it will only be accessible
inside your cluster. This is useful for internal-only services like
databases, caches or background services which do not need to be
exposed to the outside world.
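A sketch of a ClusterIP service exposing the pods labeled
`app: my-webapp` (labels and names are placeholders):

    apiVersion: v1
    kind: Service
    metadata:
      name: my-webapp
      namespace: default
    spec:
      type: ClusterIP        # the default, so this line can be omitted
      selector:
        app: my-webapp       # binds to every pod carrying this label
      ports:
        - port: 80           # the port the service exposes
          targetPort: 3000   # the port the pods actually listen on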
#### type: NodePort

NodePort binds the service to an available port on every node. The
available ports are specified as a range in the cluster
configuration, like 30000-32767. The port it will use can either be
specified manually, or k8s can pick a random free one from the range.
If the manually specified port is not available or is outside of the
valid range, the service will fail to be created.

NodePorts are useful for when you need to expose a service to the
outside world, but it doesn't need to be on a specific port.
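A sketch of a NodePort service (names are placeholders):

    apiVersion: v1
    kind: Service
    metadata:
      name: my-webapp
      namespace: default
    spec:
      type: NodePort
      selector:
        app: my-webapp       # binds to pods carrying this label
      ports:
        - port: 80
          targetPort: 3000   # the port the pods listen on
          nodePort: 30080    # optional; omit it to let k8s pick a free one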
#### type: LoadBalancer

LoadBalancer services use your cluster's load balancer to create a
new external IP and bind the service port to it.

Different load balancer implementations will do this in different
ways, so consult the documentation for your selected load balancer.

This is usually the way you expose services that require a specific
port to the outside world, like http services, mail, dns, irc, etc.

Usually load balancers will always use a new external IP, however
some let you configure re-use of IP addresses if the ports are free.

For example, [KlipperLB](https://github.com/k3s-io/klipper-lb) from
k3s will always use the node's external IP, like a NodePort, and will
create pods on every node to redirect traffic to the correct node if
it arrives at the wrong one.

Or MetalLB, which lets you tag LoadBalancer services with a key, and
every other LoadBalancer service sharing the same key can share the
same external IP as long as the ports don't collide.
[Doc Link](https://metallb.universe.tf/usage/#ip-address-sharing).
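A sketch of a LoadBalancer service; the
`metallb.universe.tf/allow-shared-ip` annotation is MetalLB's
IP-sharing key from the page linked above, so it only applies if you
run MetalLB (names and key are placeholders):

    apiVersion: v1
    kind: Service
    metadata:
      name: my-webapp
      namespace: default
      annotations:
        metallb.universe.tf/allow-shared-ip: "shared-key"  # MetalLB only
    spec:
      type: LoadBalancer
      selector:
        app: my-webapp     # binds to pods carrying this label
      ports:
        - port: 80
          targetPort: 3000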
#### Service DNS

In addition to exposing the service with an IP and ports, k8s
services also create an entry in the cluster dns server, so you can
address them by name instead of IP, as the IP may change over time.

This will be in the form
`<service name>.<namespace name>.svc.cluster.local`, so if your
service is called `my-webapp` and is in the namespace `default`, the
dns name will be `my-webapp.default.svc.cluster.local`.

For a much more in-depth explanation of services, consult the [official kubernetes docs](https://kubernetes.io/docs/concepts/services-networking/service/).