Logging
Set Up Logging Infrastructure
In this lab we are going to explore and poke around so that we know how logs from pods are handled in Kubernetes. Then we will set up an example centralized logging deployment that collects all the logs from our cluster and makes them searchable. Finally, we will simulate a pod that does not log to stdout and configure it so that those logs still end up in our logging setup.
Understanding Logging
Move to the master01 node:
ssh master01
Let’s view some of the logs from the kubelet:
sudo journalctl -xe -u kubelet
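If you want to follow the kubelet log live, or limit the output to recent entries, journalctl also supports options such as the following (assuming a systemd-based node, as used in this lab):
sudo journalctl -u kubelet -f
sudo journalctl -u kubelet --since "10 minutes ago"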
Now let’s see where Kubernetes stores the logs from all the pods/containers started by the kubelet:
cd /var/log/containers
ls -lah
<output-omitted>
We can see that this directory contains symlinks to each active log file holding the stdout from each running pod. There is also a naming convention at work here. We can look at one file specifically. For example:
sudo tail -f kube-proxy-48ft2_kube-system_kube-proxy-7ff5363303ae1641f1eee234efe452a718630bf365978c07a568d02b5a8e5b24.log
{"log":"W0920 07:28:14.174911 1 server_others.go:579] Unknown proxy mode \"\", assuming iptables proxy\n","stream":"stderr","time":"2020-09-20T07:28:14.174999925Z"} {"log":"I0920 07:28:14.175245 1 server_others.go:186] Using iptables Proxier.\n","stream":"stderr","time":"2020-09-20T07:28:14.175290955Z"} {"log":"I0920 07:28:14.175715 1 server.go:650] Version: v1.19.1\n","stream":"stderr","time":"2020-09-20T07:28:14.175763165Z"} {"log":"I0920 07:28:14.176419 1 conntrack.go:52] Setting nf_conntrack_max to 131072\n","stream":"stderr","time":"2020-09-20T07:28:14.176466453Z"} {"log":"I0920 07:28:14.181555 1 config.go:315] Starting service config controller\n","stream":"stderr","time":"2020-09-20T07:28:14.18161157Z"} {"log":"I0920 07:28:14.181674 1 shared_informer.go:240] Waiting for caches to sync for service config\n","stream":"stderr","time":"2020-09-20T07:28:14.181758233Z"} {"log":"I0920 07:28:14.185181 1 config.go:224] Starting endpoint slice config controller\n","stream":"stderr","time":"2020-09-20T07:28:14.185228732Z"} {"log":"I0920 07:28:14.185273 1 shared_informer.go:240] Waiting for caches to sync for endpoint slice config\n","stream":"stderr","time":"2020-09-20T07:28:14.185369025Z"} {"log":"I0920 07:28:14.281916 1 shared_informer.go:247] Caches are synced for service config \n","stream":"stderr","time":"2020-09-20T07:28:14.282111567Z"} {"log":"I0920 07:28:14.285566 1 shared_informer.go:247] Caches are synced for endpoint slice config \n","stream":"stderr","time":"2020-09-20T07:28:14.287431675Z"}
You may optionally run the commands above on worker2 to check the results there.
Now that we have poked around and are familiar with how things are logged behind the scenes, let’s put together a centralized logging solution to collect and search all these pod logs, so we don’t have to log into machines manually on a regular basis.
Setting Up Logging Infrastructure
Architecture / Background
We are going to build the following setup as an example of something you may use yourself for a centralized logging solution.
In the above setup we will deploy Filebeat as a DaemonSet on each node, essentially tailing each of the logs in /var/log/containers and then shipping that data off to centralized storage in Elasticsearch. Finally, there is a UI for Elasticsearch called Kibana, which we will expose externally (in this lab through a LoadBalancer Service) and access from our laptop.
Another very important part of this is that we place all of these Services, Deployments, DaemonSets, etc. into a new, separate namespace called logging. It is important that we don’t pollute the reserved kube-system namespace.
The idea here is that we could delete or remove the entire logging namespace and it would not affect any of the actual
functioning of the cluster.
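For reference, the namespace manifest applied below (logging/prerequisites/logging-ns.yaml) only needs to declare a Namespace object. A minimal sketch of such a manifest (the file in the repository may carry additional labels) is:
apiVersion: v1
kind: Namespace
metadata:
  name: logging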
Before proceeding to the next step of actually deploying and accessing the setup described above, please take some time to look over the supplied configuration to familiarize yourself with it. Ask your instructor or peers if anything is unclear. The deployment YAML for this setup is located below:
Clone the repository from GitHub. If you have already cloned it, there is no need to do it again. This repository contains all the files we will use during our courses:
Return to the student desktop:
ssh student
Then download the materials:
cd ~
git clone https://github.com/desotech-it/DSK201-public.git
cd DSK201-public
Cloning into 'DSK201-public'...
Apply the manifests:
kubectl apply -f logging/prerequisites/logging-ns.yaml
namespace/logging created
Change the default namespace of your current context:
kubectl config set-context --current --namespace logging
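You can verify that the namespace has been set on your current context with, for example:
kubectl config view --minify | grep namespace: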
Check your StorageClass:
kubectl get sc
NAME                         PROVISIONER   RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
nfs-storageclass (default)   nfs-test      Delete          Immediate           false                  13h
Install Elasticsearch
kubectl apply -f logging/elasticsearch/
poddisruptionbudget.policy/elasticsearch-master-pdb created
service/elasticsearch-master created
service/elasticsearch-master-headless created
statefulset.apps/elasticsearch-master created
service/elasticsearch-master-loadbalancer created
The Elasticsearch cluster may need a few minutes to become Ready:
kubectl get pods -w
NAME                     READY   STATUS    RESTARTS   AGE
elasticsearch-master-0   1/1     Running   0          118s
elasticsearch-master-1   1/1     Running   0          118s
elasticsearch-master-2   1/1     Running   0          118s
A three-node Elasticsearch cluster is now configured and available locally to the Kubernetes cluster.
To confirm this, we will send a request to the Elasticsearch service through the External-IP of its LoadBalancer Service. Startup may take a couple of minutes.
kubectl get svc
NAME                            TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)                         AGE
elasticsearch-master            ClusterIP      10.103.216.36   <none>         9200/TCP,9300/TCP               19m
elasticsearch-master-headless   ClusterIP      None            <none>         9200/TCP,9300/TCP               19m
elasticsearch-master-lb         LoadBalancer   10.97.98.144    10.10.95.203   9200:30813/TCP,9300:30426/TCP   9m48s
Use the External-IP address assigned to the Service called elasticsearch-master-lb. In this example, it is:
curl http://10.10.95.203:9200/
Output similar to the following will appear:
{
"name" : "elasticsearch-master-2",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "izmGtmzzSnmejirkSHt1ww",
"version" : {
"number" : "7.9.1",
"build_flavor" : "default",
"build_type" : "docker",
"build_hash" : "083627f112ba94dffc1232e8b42b73492789ef91",
"build_date" : "2020-09-01T21:22:21.964974Z",
"build_snapshot" : false,
"lucene_version" : "8.6.2",
"minimum_wire_compatibility_version" : "6.8.0",
"minimum_index_compatibility_version" : "6.0.0-beta1"
},
"tagline" : "You Know, for Search"
}
Note The specific version numbers and dates may be different in this JSON response.
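Optionally, you can also query the cluster health endpoint (using the same External-IP as above); with all three nodes up and no unassigned shards, the status field should report green:
curl http://10.10.95.203:9200/_cluster/health?pretty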
Elasticsearch is operational, but not receiving or serving any data.
Install Filebeat
In order to start processing data, deploy the Filebeat YAML to the Kubernetes cluster. This collects all Pod logs and stores them in Elasticsearch, after which they can be searched and used in visualizations within Kibana.
kubectl apply -f logging/filebeat/
serviceaccount/filebeat-filebeat created
configmap/filebeat-filebeat-config created
clusterrole.rbac.authorization.k8s.io/filebeat-filebeat-cluster-role created
clusterrolebinding.rbac.authorization.k8s.io/filebeat-filebeat-cluster-role-binding created
daemonset.apps/filebeat-filebeat created
Then wait for the Filebeat pods to become Ready:
kubectl get pods --namespace=logging -l app=filebeat-filebeat -w
NAME                      READY   STATUS    RESTARTS   AGE
filebeat-filebeat-8gqq8   1/1     Running   0          17s
filebeat-filebeat-9nkhp   1/1     Running   0          17s
filebeat-filebeat-mmn6v   1/1     Running   0          18s
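Because Filebeat runs as a DaemonSet, there should be one pod per schedulable node. You can confirm the desired and ready counts with:
kubectl get daemonset filebeat-filebeat --namespace=logging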
Confirm that Filebeat has started to index documents into Elasticsearch by sending a request, in a different terminal, to the External-IP of the Elasticsearch service:
curl http://10.10.95.203:9200/_cat/indices
At least one Filebeat index should be present, and the output should be similar to the following:
green open filebeat-7.9.1-2020.09.25-000001 JJJlWvLvQOGXagS8w7AmUQ 1 1 4690 0 3.1mb 1.6mb
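You can also pull back a single sample document to confirm that real log data is flowing (this assumes the same External-IP used earlier):
curl "http://10.10.95.203:9200/filebeat-*/_search?size=1&pretty"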
Install Kibana
Kibana provides a frontend to Elasticsearch and the data collected by Filebeat.
kubectl apply -f logging/kibana/
service/kibana-kibana created
deployment.apps/kibana-kibana created
kubectl get svc --namespace=logging -l app=kibana
NAME            TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)          AGE
kibana-kibana   LoadBalancer   10.102.184.77   10.10.99.124   5601:32614/TCP   118s
Configure Kibana
Before visualizing Pod logs, Kibana must be configured with an index pattern for Filebeat’s indices.
Open a browser and connect to your assigned External-IP, for example:
http://10.10.95.204:5601/
A welcome page similar to the following appears in the browser. Click the Explore on my own button.
Open the menu, then go to Stack Management > Kibana > Index Patterns to create a new index pattern.
The Index patterns page appears. Click the Create index pattern button to begin.
In the Define index pattern window, type filebeat-* in the Index pattern text box and click the Next step button.
In the Configure settings window, select @timestamp from the Time Filter field name dropdown menu and click the Create index pattern button.
A page with the index pattern details appears. Open the menu, then go to Kibana > Discover to view incoming logs.
The Discover page provides a real-time view of logs as they are ingested by Elasticsearch from the Kubernetes cluster. The histogram provides a view of log volume over time, which by default spans the last 15 minutes. The sidebar on the left side of the user interface displays various fields parsed from the JSON documents sent by Filebeat to Elasticsearch.
Use the Filters box to search only for logs arriving from Kibana Pods by filtering for kubernetes.container.name : "kibana". Click the Update button to apply the search filter.
Note: When searching in the filters box, field names and values are auto-populated. Check the query language in use; it must be Lucene.
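With Lucene selected, filters can also be combined. For example (field names as produced by Filebeat’s Kubernetes metadata; adjust them if your indices differ):
kubernetes.container.name:"kibana" AND stream:"stderr"
kubernetes.namespace:"logging" AND message:error*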
In order to expand a log event, click the arrow next to an event in the user interface.
Scroll down to view the entire log document in Kibana. Observe the fields provided by Filebeat, including the message field, which contains standard out and standard error messages from the container, as well as the Kubernetes node and Pod name in fields prefixed with kubernetes.
Look closely at the message field in the log representation and note that the text is formatted as JSON. While the terms in this field can be searched with free-text search terms in Kibana, parsing the field generally yields better results.
Next, let’s search for controller and expand a row to look at all the fields. Notice that we see not only the logs from our pods, but also all the Kubernetes metadata parsed out into separate fields:
Feel free to continue to explore and play with the Kibana interface to do searches and build visualizations and dashboards. We won’t go into the details of how to use Kibana here but there are plenty of resources available for that if you would like to explore more.
Capturing Logs Not Using stdout
So far we have captured logs from images/containers that were designed to send all their logs to stdout. However, if we have an application that writes to log files within the container, we can use the following approach to collect them. To do this we are going to use a pod with three containers, as sketched below. The main container writes logs to two different files inside the container. The other two containers tail those logs and write them to stdout.
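The manifest applied below implements this streaming-sidecar pattern. The actual file in the repository may differ, but a minimal sketch of the idea (container names, image, and file paths here are illustrative) looks like this:

apiVersion: v1
kind: Pod
metadata:
  name: streaming-log
spec:
  containers:
  - name: main
    image: busybox
    # The application writes to files instead of stdout
    command: ["/bin/sh", "-c"]
    args:
    - >
      i=0;
      while true; do
        echo "$i: $(date) first log" >> /var/log/app/1.log;
        echo "$i: $(date) second log" >> /var/log/app/2.log;
        i=$((i+1));
        sleep 1;
      done
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  - name: stream-1
    image: busybox
    # Tail the first file so its content reaches stdout and /var/log/containers
    command: ["/bin/sh", "-c", "tail -n+1 -F /var/log/app/1.log"]
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  - name: stream-2
    image: busybox
    # Same for the second file
    command: ["/bin/sh", "-c", "tail -n+1 -F /var/log/app/2.log"]
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  volumes:
  - name: logs
    emptyDir: {}

Because the sidecars write to stdout, the kubelet picks their output up like any other container log, and Filebeat ships it to Elasticsearch automatically.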
cat logging/streaming-log/streaming-log.yaml
kubectl apply -f logging/streaming-log/streaming-log.yaml
pod/streaming-log created
Next, head back over to our Kibana interface and do a search for streaming-log, and we now see the logs from the files:
From Available fields you can select kubernetes.pod.name and filter on the streaming-log pod.
When we are done exploring scenarios where pods log to places other than stdout/stderr, let’s clean up the streaming sidecar pod.
kubectl delete -f logging/streaming-log/streaming-log.yaml
pod "streaming-log" deleted