Kubernetes and Volumes: Introducing Host and Persistent Volumes (Kubernetes)


In my last blog post I discussed how to create a pod in Kubernetes; in this post I hope to take that a little further by adding a volume. There are many reasons why you might want to add a volume in Kubernetes, such as sharing data between containers in a pod or backing up data.

I am going to be using the same setup as my previous blog post, so if you would like to follow along please read that post first and install Minikube. My original test_pod.yml looked like this:

A Kubernetes pod manifest to create nginx

The test_pod.yml contained entries for apiVersion, kind, metadata (name), and then spec details for the container image, container name and container ports.

During my reading about Kubernetes and volumes I started down a few paths before realising they were not the solution I was looking for; I have noted each here so I can reference them later on. While writing this blog post I realised it was spiralling into a massive post, so I'm splitting it into multiple parts. If you want to learn about PersistentVolumeClaims (PVCs), which I touch on a little in this post, then look out for part 2.
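In case the image above does not render, the original manifest can be sketched roughly as follows (the pod name nginx matches the describe command used later; the exact image tag shown here is my assumption):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
    - name: nginx
      image: nginx
      ports:
        - containerPort: 80
```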

Generally, when a container crashes or is destroyed and recreated it loses its file system, so any files created within the container by a process are lost. Also, when launching multiple containers within a pod they do not (by default) share a volume, so they cannot share files between themselves.

hostPath Volume

The hostPath volume was the first type of Kubernetes volume I found out about; it has options for a path and a type.

My revised test_pod.yml contains new entries:

  • volumes:
    Located under spec. This contains the volume's name and the hostPath path to where the volume is stored on the Kubernetes host node's filesystem.
  • volumeMounts:
    Within the containers section are now instructions on where to mount the volume within the container. In this example the volume mounts as /usr/share/nginx/html, which is where Nginx stores its default web pages.
test_pod.yml now has volumes and mount paths
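For readers who cannot see the image, the revised manifest looks roughly like this; the volume name geektechstuffnginx and the mountPath come from the post, but the hostPath path and the type setting are my assumptions:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  volumes:
    # hostPath points at a directory on the Kubernetes node;
    # the exact path here is my assumption
    - name: geektechstuffnginx
      hostPath:
        path: /geektechstuff/nginx
        # DirectoryOrCreate creates the directory if it is missing,
        # which avoids the pod failing to start
        type: DirectoryOrCreate
  containers:
    - name: nginx
      image: nginx
      ports:
        - containerPort: 80
      volumeMounts:
        - name: geektechstuffnginx
          mountPath: /usr/share/nginx/html
```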

Applying test_pod.yml and then running kubectl describe pod nginx shows some of the new volume settings in effect.

describe pods showing the mount

Mounts shows the geektechstuffnginx mount I added, which has read/write (rw) permissions, and also shows a secrets mount, which has read-only (ro) permissions.

describe pod shows the volume and volume path

If a container in the pod stops or is replaced, then the replacement container should have access to the same volume.

If the same pod is launched on different nodes, the files in the hostPath directory may differ between nodes, so the pod may act differently. Also note: if the hostPath does not exist on the node (and the type does not create it), the pod will fail to start.

Secret Volume

One of the mounts above is for a secret. The secret volume is used to pass sensitive information into the pod. I will hopefully be looking at this in the near future.
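As a quick taster (a minimal sketch of my own, not from my cluster; the secret name mysecret is a placeholder), a secret can be mounted read-only like this:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: secret-demo
spec:
  volumes:
    - name: secret-vol
      secret:
        secretName: mysecret   # placeholder; must already exist in the namespace
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: secret-vol
          mountPath: /etc/secret
          readOnly: true       # secrets are typically mounted read-only
```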

Persistent Volumes

The persistent volume is a little different to the hostPath volume and has its own API. Well, actually it has two API resources:

  • PersistentVolume
  • PersistentVolumeClaim

The PersistentVolume (PV) is classed as a cluster resource, just like a cluster node is. It is separate from the pod/container and as such is provisioned separately.

The PersistentVolumeClaim (PVC) is similar to a pod in that, just as pods consume node resources, PVCs consume PV resources. The pod uses the PVC to create the volume within the pod.

The way I see it is that:

a) A PersistentVolume (PV) needs to be created. This creates the volume details within the Kubernetes cluster.

b) A PersistentVolumeClaim (PVC) needs to then be created. This tells the cluster what should be accessible on the PV.

c) The Pod manifest then needs to be updated so that it references the PVC.
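Part 2 will cover PVCs properly, but as a rough sketch of steps b) and c) (the names and the 1Gi request here are placeholders of mine):

```yaml
# Step b) - claim storage from a PV with a matching storageClassName
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: geektechstuffpvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: slow
  resources:
    requests:
      storage: 1Gi
---
# Step c) - the pod's volume references the claim by name
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  volumes:
    - name: geektechstuffnginx
      persistentVolumeClaim:
        claimName: geektechstuffpvc
  containers:
    - name: nginx
      image: nginx
      volumeMounts:
        - name: geektechstuffnginx
          mountPath: /usr/share/nginx/html
```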

Note: Capacity numbers in Kubernetes can use either SI suffixes (e.g. G) or binary/IEC suffixes (e.g. Gi). Over the years various ways to describe the large numbers of bits computers use have arisen; for me to go into them here would take several paragraphs, which the kind editors at Wikipedia have already done at https://en.wikipedia.org/wiki/Units_of_information, and resource units in Kubernetes are discussed on this GitHub page.
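To illustrate the difference between the two suffix styles, here is a small Python sketch (my own simplified parser, not Kubernetes' actual resource.Quantity logic, which handles more formats):

```python
# Difference between SI (G) and binary/IEC (Gi) storage suffixes,
# as used in Kubernetes capacity fields.
def to_bytes(quantity: str) -> int:
    """Convert a simple Kubernetes-style quantity (e.g. '8Gi', '8G') to bytes."""
    si = {"K": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}
    iec = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
    # Check two-character IEC suffixes before single-character SI ones,
    # so '8Gi' is not mistaken for '8G' plus a stray character.
    for suffix, factor in {**iec, **si}.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # bare number = bytes

print(to_bytes("8G"))   # 8000000000 bytes (SI)
print(to_bytes("8Gi"))  # 8589934592 bytes (binary)
```

So an 8Gi PV is roughly 7% larger than an 8G one, which matters when sizing claims.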

Creating The Persistent Volume

The details for a PersistentVolume (PV) are written in a YAML (yml) file. I have called my YAML file test_pv.yml.

test Persistent Volume yaml

apiVersion: v1
kind: PersistentVolume
metadata:
  name: geektechstuff_pv
  labels:
    type: local
spec:
  capacity:
    # Gi is gibi (binary giga)
    storage: 8Gi
  # can be Filesystem or Block
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  hostPath:
    # where the data is stored in the cluster
    path: "/mnt/data"


On my first attempt at applying my PV I hit an error saying that the PersistentVolume is invalid and that metadata.name had an invalid value.

The metadata name has to be a valid DNS subdomain name, so it cannot contain an underscore (_); I needed to edit the metadata name (i.e. remove the underscore).


test Persistent Volume yaml

My final test_pv.yml ended up looking like this:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: geektechstuffpv
  labels:
    type: local
spec:
  capacity:
    # Gi is gibi (binary giga)
    storage: 8Gi
  # can be Filesystem or Block
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: slow
  hostPath:
    # where the data is stored in the cluster
    path: "/mnt/data"


To apply the PV use the command:

kubectl apply -f PVFILENAME

So for me this was:

kubectl apply -f test_pv.yml


After correcting my metadata misstep, the PV applied. To check that the PV applied okay, use the command:

kubectl get pv PVMETADATANAME

For me this was:

kubectl get pv geektechstuffpv


Alternatively, kubectl get pv on its own will also work, but it returns details of all applied PVs. If you have access to the Kubernetes web dashboard, the PV should be viewable in there as well.

Persistent Volumes in Kubernetes Dashboard

Next up the PersistentVolumeClaim (PVC)!

