EmptyDir

Kubernetes volumes address the problem of storage being coupled to the ephemeral nature of a container: the persistence and lifecycle of a volume are decoupled from the container’s lifecycle, so data in the volume survives container restarts. How long a volume, and by extension the data within it, lives depends on the volume type selected. Additionally, in current Kubernetes versions, cluster administrators can expose a variety of external storage options to meet the storage (e.g. storage tiering) and security requirements of an application and the data it handles. In the following exercises, we will examine the emptyDir volume.

The main consideration when using an emptyDir volume is that its lifecycle is tied to the pod’s lifecycle: if the pod is deleted, evicted, or otherwise recreated, any data its containers wrote to the volume is removed and cannot be recovered. The primary use case for an emptyDir volume is therefore an application that needs its data (e.g. scratch space, checkpoints of a long computation) to survive container restarts, but not necessarily pod restarts.
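
For reference, an emptyDir volume accepts two optional fields in the core v1 API: medium, which can back the volume with memory (tmpfs) instead of node-local disk, and sizeLimit, which caps how much the volume may grow. The fragment below is a minimal sketch; the volume name scratch and the 128Mi limit are placeholder values chosen for illustration:

volumes:
- name: scratch
  emptyDir:
    medium: Memory     # back the volume with tmpfs rather than node-local disk
    sizeLimit: 128Mi   # exceeding this limit causes the kubelet to evict the pod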

Creating an emptyDir volume

Step 1: Define the following multi-container pod manifest in the file emptydir-pod.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: mc1
spec:
  volumes:
  - name: html
    emptyDir: {}
  containers:
  - name: 1st
    image: r.deso.tech/library/nginx
    volumeMounts:
    - name: html
      mountPath: /usr/share/nginx/html
  - name: 2nd
    image: r.deso.tech/library/alpine
    volumeMounts:
    - name: html
      mountPath: /html
    command: ["/bin/sh", "-c"]
    args:
    - while true; do
        date >> /html/index.html;
        sleep 10;
      done

Take a moment to review this pod manifest. It defines a volume named html of type emptyDir, which means the volume is created when the pod is assigned to a node and exists as long as that pod is running on that node. As the name suggests, the volume is initially empty. The 1st container runs an NGINX server and mounts the shared volume at /usr/share/nginx/html. The 2nd container uses the alpine image (a lightweight Linux distribution) and mounts the same volume at /html. Every ten seconds, the 2nd container appends the current date and time to index.html in the shared volume. When a user makes an HTTP request to the pod, the NGINX server reads this file and returns its contents in the response.
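
If you want to double-check the fields available on this volume type before applying the manifest, kubectl’s built-in API documentation can help (this assumes kubectl is configured against a reachable cluster):

kubectl explain pod.spec.volumes.emptyDir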

Step 2: Apply the YAML file and create the mc1 pod:

kubectl create -f emptydir-pod.yaml
pod/mc1 created

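As an aside, kubectl apply -f emptydir-pod.yaml works here as well, and on reasonably recent kubectl versions you can block until the pod reports Ready before moving on. A minimal sketch:

kubectl wait --for=condition=Ready pod/mc1 --timeout=60s
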
Step 3: With our pod created, we want to ensure that both containers have access to the data in the shared volume:

kubectl exec mc1 -c 1st -- cat /usr/share/nginx/html/index.html
....
Thu Jul 23 20:47:41 UTC 2020
Thu Jul 23 20:47:51 UTC 2020
....
kubectl exec mc1 -c 2nd -- cat /html/index.html
....
Thu Jul 23 20:47:41 UTC 2020
Thu Jul 23 20:47:51 UTC 2020
....

From this output, we can conclude that both containers have access to the index.html file in the shared volume, which is being generated by the 2nd container. We can also check that index.html is being served correctly by accessing the pod directly; a quick way to do that is sketched below. Afterwards, we will determine which node the pod is scheduled on.
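
One way to reach the NGINX container directly (not shown in the original lab output) is to forward a local port to the pod from the machine where kubectl is configured, then request the page; 8080 here is an arbitrary local port:

kubectl port-forward pod/mc1 8080:80
curl http://localhost:8080/

Run the port-forward in a separate terminal (or background it), since it blocks; the curl response should contain the same timestamps seen above.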

Step 4: Determine which node is running the mc1 pod:

kubectl get pod mc1 -o wide
NAME   READY   STATUS    ...       NODE    ...
mc1    2/2     Running   ...       node2   ...

From the output, we can see that the mc1 pod is running on node2, so we will continue on that node. If your mc1 pod is scheduled on a different node (for example node1), substitute that node name in the steps that follow.
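
If you only need the node name, a jsonpath query avoids the wide output entirely; a minimal sketch:

kubectl get pod mc1 -o jsonpath='{.spec.nodeName}'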

Step 5: SSH to the node running the mc1 pod and determine the names of its containers (shown in the NAMES column):

sudo docker ps
CONTAINER ID ... NAMES
a0a5e81e4b39     k8s_2nd_mc1_default_b38bb..._0
3f97aa51b15a     k8s_1st_mc1_default_b38bb..._0

Although the output is abbreviated here, we can see the naming convention Kubernetes uses for containers (i.e. k8s_nameOfContainer_nameOfPod_namespace_uniquePodID…). With the container names identified, we are ready to move on to the next step and delete these containers. When we do so, the kubelet on the node will recreate them while preserving the pod’s state, including our pod volume; it restarts the containers in the same pod because the pod’s restartPolicy is set to the default, Always.
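
On a busy node the docker ps output can be long; filtering on the pod name and trimming the columns makes the two containers easier to spot. This is a sketch using standard docker ps options:

sudo docker ps --filter name=mc1 --format '{{.ID}}  {{.Names}}'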

Step 6: On the node running the mc1 pod, delete the mc1 containers:

sudo docker rm -f k8s_1st_mc1_default_..._0 k8s_2nd_mc1_default_..._0
k8s_1st_mc1_default_..._0
k8s_2nd_mc1_default_..._0

After the containers in the mc1 pod are deleted, the kubelet recreates them. We can confirm this by re-running the docker ps command from step 5 and looking for similarly named containers (e.g. k8s_1st_mc1_default_b38bb..._0). When ready, go back to the terminal tab on the master node and move on to the next step.
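
The restart is also visible from the Kubernetes side: the RESTARTS column of kubectl get pod mc1 should increase, and the per-container restart counters can be read with a jsonpath query. A minimal sketch; after one forced deletion each counter would typically read 1:

kubectl get pod mc1 -o jsonpath='{.status.containerStatuses[*].restartCount}'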

Step 7: Confirm that the data in the index.html file remains intact after deleting the initial containers:

kubectl exec mc1 -c 2nd -- cat /html/index.html

The output should still contain the timestamps written before the containers were deleted: the data in the emptyDir volume survived the container restarts.
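
For a quicker signal than reading the whole file, you can also count the lines before and after deleting the containers and confirm that the count keeps growing from where it left off (this assumes wc is present in the nginx image, which is typical for Debian-based builds):

kubectl exec mc1 -c 1st -- wc -l /usr/share/nginx/html/index.html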

Step 8: Delete the mc1 pod:

kubectl delete pod mc1
pod "mc1" deleted
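
Deleting the pod also deletes the emptyDir volume and everything in it. If you want to see this for yourself (not part of the original lab steps), recreate the pod and inspect index.html again; only timestamps written by the new pod should appear:

kubectl create -f emptydir-pod.yaml
kubectl exec mc1 -c 2nd -- cat /html/index.html

Give the new pod a few seconds to start before running the exec; the entries from the first mc1 pod will be gone, confirming that emptyDir data does not outlive the pod.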