Logging

Set Up Logging Infrastructure

In this lab we are going to poke around and see how logs from Pods are handled in Kubernetes. Then we’ll set up an example centralized logging deployment that collects all the logs from our cluster and makes them searchable. Finally we’ll simulate a Pod that doesn’t log to stdout and configure it so that those logs still reach our logging stack.

Understanding Logging

Move to the master01 node

ssh master01

Let’s view some of the logs from the kubelet:

sudo journalctl -xe -u kubelet

Now let’s see where Kubernetes stores the logs from all the pods/containers that are started by the kubelet:

cd /var/log/containers
ls -lah
<output-omitted>

This directory contains symlinks to the live log file of each running Pod, holding everything the container writes to stdout/stderr. The file names follow a clear convention: pod name, namespace, container name and container ID. Let’s look at one specifically. For example:

sudo tail -f kube-proxy-48ft2_kube-system_kube-proxy-7ff5363303ae1641f1eee234efe452a718630bf365978c07a568d02b5a8e5b24.log
{"log":"W0920 07:28:14.174911       1 server_others.go:579] Unknown proxy mode \"\", assuming iptables proxy\n","stream":"stderr","time":"2020-09-20T07:28:14.174999925Z"}
{"log":"I0920 07:28:14.175245       1 server_others.go:186] Using iptables Proxier.\n","stream":"stderr","time":"2020-09-20T07:28:14.175290955Z"}
{"log":"I0920 07:28:14.175715       1 server.go:650] Version: v1.19.1\n","stream":"stderr","time":"2020-09-20T07:28:14.175763165Z"}
{"log":"I0920 07:28:14.176419       1 conntrack.go:52] Setting nf_conntrack_max to 131072\n","stream":"stderr","time":"2020-09-20T07:28:14.176466453Z"}
{"log":"I0920 07:28:14.181555       1 config.go:315] Starting service config controller\n","stream":"stderr","time":"2020-09-20T07:28:14.18161157Z"}
{"log":"I0920 07:28:14.181674       1 shared_informer.go:240] Waiting for caches to sync for service config\n","stream":"stderr","time":"2020-09-20T07:28:14.181758233Z"}
{"log":"I0920 07:28:14.185181       1 config.go:224] Starting endpoint slice config controller\n","stream":"stderr","time":"2020-09-20T07:28:14.185228732Z"}
{"log":"I0920 07:28:14.185273       1 shared_informer.go:240] Waiting for caches to sync for endpoint slice config\n","stream":"stderr","time":"2020-09-20T07:28:14.185369025Z"}
{"log":"I0920 07:28:14.281916       1 shared_informer.go:247] Caches are synced for service config \n","stream":"stderr","time":"2020-09-20T07:28:14.282111567Z"}
{"log":"I0920 07:28:14.285566       1 shared_informer.go:247] Caches are synced for endpoint slice config \n","stream":"stderr","time":"2020-09-20T07:28:14.287431675Z"}

You may optionally run the commands above on worker2 to check the results there. Now that we have poked around and are familiar with how things are logged behind the scenes, let’s put that knowledge to use and build a centralized logging solution, so that all these Pod logs can be searched in one place and we don’t have to log into machines by hand on a regular basis.

Setting Up Logging Infrastructure

Architecture / Background

We’re going to deploy the following architecture as an example of something you may use yourself as a centralized logging solution.

In this setup we deploy Filebeat as a DaemonSet, so that a Filebeat Pod on each node tails the logs in /var/log/containers and ships that data off to centralized storage in Elasticsearch. On top of Elasticsearch sits its UI, Kibana, which we will expose externally (in this lab through a Service of type LoadBalancer) so we can access it from our workstation.

Another very important detail is that we place all of these Services, Deployments, DaemonSets, etc. into a new, separate namespace called logging, rather than polluting the reserved kube-system namespace. The idea is that we could delete the entire logging namespace without affecting the actual functioning of the cluster.

Before proceeding to the next step of actually deploying and accessing this setup, please take some time to look over the supplied configuration and familiarize yourself with it. Ask your instructor or peers if anything is not clear. The deployment YAML for this setup is used in the steps below:

Clone the course repository from GitHub (if you have already done so, there’s no need to do it again). This repository contains all the files we will use during our courses:

Return to the student desktop:

ssh student

Then download the materials:

cd ~
git clone https://github.com/desotech-it/DSK201-public.git
cd DSK201-public
Cloning into 'DSK201-public'...

Apply the namespace manifest:

kubectl apply -f logging/prerequisites/logging-ns.yaml
namespace/logging created
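
For reference, a namespace manifest such as logging/prerequisites/logging-ns.yaml usually consists of just a few lines (a sketch; the file in the repository may add labels or annotations):

apiVersion: v1
kind: Namespace
metadata:
  name: logging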

Set the default namespace of your current context to logging:

kubectl config set-context --current --namespace logging
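
You can verify that the context now defaults to the logging namespace:

kubectl config view --minify | grep namespace:
    namespace: logging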

Check your StorageClass:

kubectl get sc
NAME                         PROVISIONER   RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
nfs-storageclass (default)   nfs-test      Delete          Immediate           false                  13h

Install Elasticsearch:

kubectl apply -f logging/elasticsearch/
poddisruptionbudget.policy/elasticsearch-master-pdb created
service/elasticsearch-master created
service/elasticsearch-master-headless created
statefulset.apps/elasticsearch-master created
service/elasticsearch-master-loadbalancer created

The Elasticsearch cluster may need a few minutes before all Pods are Ready:

kubectl get pods -w
NAME                     READY   STATUS    RESTARTS   AGE
elasticsearch-master-0   1/1     Running   0          118s
elasticsearch-master-1   1/1     Running   0          118s
elasticsearch-master-2   1/1     Running   0          118s
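
Assuming the StatefulSet requests storage through volumeClaimTemplates (which is why we checked the StorageClass beforehand), one PersistentVolumeClaim per Elasticsearch Pod should now be bound against the default StorageClass:

kubectl get pvc
<output-omitted>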

A three-node Elasticsearch cluster is now configured and available to the Kubernetes cluster. To confirm this, look up the Services and query Elasticsearch through the External-IP of its LoadBalancer Service. Startup can take a couple of minutes.

kubectl get svc
NAME                            TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)                         AGE
elasticsearch-master            ClusterIP      10.103.216.36   <none>         9200/TCP,9300/TCP               19m
elasticsearch-master-headless   ClusterIP      None            <none>         9200/TCP,9300/TCP               19m
elasticsearch-master-lb         LoadBalancer   10.97.98.144    10.10.95.203   9200:30813/TCP,9300:30426/TCP   9m48s

Use the External-IP address assigned to the LoadBalancer Service (elasticsearch-master-lb in the example output above). In this example it is 10.10.95.203:

curl http://10.10.95.203:9200/

An output similar to the following will appear:

{
  "name" : "elasticsearch-master-2",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "izmGtmzzSnmejirkSHt1ww",
  "version" : {
    "number" : "7.9.1",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "083627f112ba94dffc1232e8b42b73492789ef91",
    "build_date" : "2020-09-01T21:22:21.964974Z",
    "build_snapshot" : false,
    "lucene_version" : "8.6.2",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

Note: The specific version numbers and dates may differ in this JSON response. Elasticsearch is operational, but it is not yet receiving or serving any data.
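
To further confirm that all three nodes have joined and the cluster is healthy, you can also query the cluster health API (substituting your own External-IP for the example value):

curl 'http://10.10.95.203:9200/_cluster/health?pretty'

The response should report "number_of_nodes" : 3 and a "status" of green.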

Install Filebeat

In order to start processing data, deploy the Filebeat YAML to the Kubernetes cluster. Filebeat collects all Pod logs and stores them in Elasticsearch, after which they can be searched and used in visualizations within Kibana.
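
The DaemonSet follows the usual Filebeat-on-Kubernetes pattern: every Pod mounts the node’s container log directory and ships what it reads to the elasticsearch-master Service. A heavily trimmed sketch of the idea is shown below; the actual manifest and ConfigMap in logging/filebeat/ contain more detail (the full Filebeat configuration, extra mounts such as /var/lib/docker/containers, resource limits, and so on):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat-filebeat
  namespace: logging
spec:
  selector:
    matchLabels:
      app: filebeat-filebeat
  template:
    metadata:
      labels:
        app: filebeat-filebeat
    spec:
      serviceAccountName: filebeat-filebeat
      containers:
      - name: filebeat
        # image tag assumed from the filebeat-7.9.1 index name seen later
        image: docker.elastic.co/beats/filebeat:7.9.1
        volumeMounts:
        - name: varlogcontainers
          mountPath: /var/log/containers
          readOnly: true
      volumes:
      - name: varlogcontainers
        hostPath:
          path: /var/log/containers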

kubectl apply -f logging/filebeat/
serviceaccount/filebeat-filebeat created
configmap/filebeat-filebeat-config created
clusterrole.rbac.authorization.k8s.io/filebeat-filebeat-cluster-role created
clusterrolebinding.rbac.authorization.k8s.io/filebeat-filebeat-cluster-role-binding created
daemonset.apps/filebeat-filebeat created

Then wait for the Filebeat Pods to become Ready:

kubectl get pods --namespace=logging -l app=filebeat-filebeat -w
NAME                      READY   STATUS    RESTARTS   AGE
filebeat-filebeat-8gqq8   1/1     Running   0          17s
filebeat-filebeat-9nkhp   1/1     Running   0          17s
filebeat-filebeat-mmn6v   1/1     Running   0          18s

Confirm that Filebeat has started to index documents into Elasticsearch by sending a request to the External-IP of the Elasticsearch LoadBalancer Service:

curl http://10.10.95.203:9200/_cat/indices

At least one Filebeat index should be present, and output should be similar to the following:

green open filebeat-7.9.1-2020.09.25-000001 JJJlWvLvQOGXagS8w7AmUQ 1 1 4690 0 3.1mb 1.6mb
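
You can also check that the document count keeps growing as new log lines arrive (again using the example External-IP):

curl 'http://10.10.95.203:9200/filebeat-*/_count?pretty'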

Install Kibana

Kibana provides a frontend to Elasticsearch and the data collected by Filebeat.

kubectl apply -f logging/kibana/
service/kibana-kibana created
deployment.apps/kibana-kibana created
kubectl get svc --namespace=logging -l app=kibana
NAME            TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)          AGE
kibana-kibana   LoadBalancer   10.102.184.77   10.10.99.124   5601:32614/TCP   118s

Configure Kibana

Before visualizing Pod logs, Kibana must be configured with an index pattern for Filebeat’s indices.
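
The steps below use the Kibana UI. As an alternative, the same index pattern can be created from the command line through Kibana’s saved objects API; a sketch, assuming the Kibana External-IP from the Service output above:

curl -X POST 'http://10.10.99.124:5601/api/saved_objects/index-pattern' \
  -H 'kbn-xsrf: true' -H 'Content-Type: application/json' \
  -d '{"attributes":{"title":"filebeat-*","timeFieldName":"@timestamp"}}'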

Open a browser and connect to your assigned Kibana External-IP, e.g. http://10.10.99.124:5601/

A welcome page appears in the browser. Click the Explore on my own button.

Open the menu, then go to Stack Management > Kibana > Index Patterns to create a new index pattern. The Index patterns page appears.

Click the Create index pattern button to begin.

In the Define index pattern window, type filebeat-* in the Index pattern text box and click the Next step button.

In the Configure settings window, select @timestamp from the Time Filter field name dropdown menu and click the Create index pattern button.

A page with the index pattern details appears. Open the menu, then go to Kibana > Discover to view incoming logs.

The Discover page provides a real-time view of logs as they are ingested into Elasticsearch from the Kubernetes cluster. The histogram shows log volume over time, which by default spans the last 15 minutes. The sidebar on the left side of the user interface lists the fields parsed from the JSON documents that Filebeat sends to Elasticsearch.

Use the Filters box to search only for logs arriving from Kibana Pods by filtering for kubernetes.container.name : "kibana". Click the Update button to apply the search filter.

Note: When searching in the Filters box, field names and values are auto-completed. Check the query language in use; it must be Lucene.
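
For reference, the same Lucene filter can also be run directly against Elasticsearch, which is handy when debugging (example External-IP again):

curl 'http://10.10.95.203:9200/filebeat-*/_search?q=kubernetes.container.name:kibana&size=1&pretty'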

In order to expand a log event, click the arrow next to an event in the user interface.

Scroll down to view the entire log document in Kibana. Observe the fields provided by Filebeat, including the message field, which contains standard out and standard error messages from the container, as well as the Kubernetes node and Pod name in fields prefixed with kubernetes.

Look closely at the message field in the log representation and note that its content is itself formatted as JSON. While the terms in this field can be searched with free-text search terms in Kibana, parsing the field generally yields better results.

Next let’s search for controller and expand one of the rows to look at all of its fields. Notice that we see not only the log message from the Pod, but also all of the Kubernetes metadata parsed out into separate fields.

Feel free to continue to explore and play with the Kibana interface to do searches and build visualizations and dashboards. We won’t go into the details of how to use Kibana here but there are plenty of resources available for that if you would like to explore more.

Capturing Logs Not Using stdout

So far we have captured logs from containers that were designed to send everything to stdout. However, if an application writes to log files inside its container, we can use the following approach to collect them: a Pod with three containers, where the main container writes logs to two different files and the other two containers tail those files and stream them to their own stdout.
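
A simplified sketch of this pattern is shown below; the file names, images and commands here are illustrative, and the actual manifest lives in logging/streaming-log/streaming-log.yaml, which you will view and apply next:

apiVersion: v1
kind: Pod
metadata:
  name: streaming-log
spec:
  containers:
  # the main container only writes to files on a shared volume
  - name: app
    image: busybox
    command: ["/bin/sh", "-c"]
    args:
    - |
      i=0
      while true; do
        echo "$(date) INFO  $i" >> /var/log/app/1.log
        echo "$(date) DEBUG $i" >> /var/log/app/2.log
        i=$((i+1))
        sleep 1
      done
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  # each sidecar tails one file and streams it to its own stdout,
  # where the kubelet (and therefore Filebeat) picks it up
  - name: stream-1
    image: busybox
    args: ["/bin/sh", "-c", "tail -n+1 -f /var/log/app/1.log"]
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  - name: stream-2
    image: busybox
    args: ["/bin/sh", "-c", "tail -n+1 -f /var/log/app/2.log"]
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  volumes:
  - name: logs
    emptyDir: {}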

cat logging/streaming-log/streaming-log.yaml
kubectl apply -f logging/streaming-log/streaming-log.yaml
pod/streaming-log created

Next, head back to the Kibana interface and search for streaming-log; the logs read from the files inside the container now appear.

From Available fields you can select kubernetes.pod.name and filter on the streaming-log Pod.

When we are done exploring this scenario where a Pod logs to places other than stdout/stderr, let’s clean up the streaming sidecar Pod:

kubectl delete -f logging/streaming-log/streaming-log.yaml
pod "streaming-log" deleted