With the introduction of the snapshot-controller in Kubernetes, it is now possible to create snapshots for CSI drivers and cloud providers that support this feature.
The API is universal and vendor-independent, which is typical for Kubernetes, so we can explore it without getting into the specifics of a particular implementation. Let’s take a closer look at snapshots and see how they can benefit Kubernetes users.
Introduction
First, let’s clarify what snapshots are. A snapshot is the state of a file system at a particular point in time. You can save it and use it later to restore that specific state. The process of creating a snapshot is almost instantaneous. Once a snapshot is created, all changes to the original file system are written to different blocks.
Since snapshot data is stored in the same place as the original data, snapshots are no substitute for a backup. At the same time, backups based on a snapshot rather than live data are more consistent. This is because a snapshot captures all the data as it was at a single point in time, so nothing changes while the backup is being made.
A snapshot-controller (a universal component for all CSI drivers) must be installed, and the following CRDs must be defined in the Kubernetes cluster for the snapshot feature to work:
- VolumeSnapshotClass – the equivalent of StorageClass for snapshots;
- VolumeSnapshotContent – the equivalent of PV for snapshots;
- VolumeSnapshot – the equivalent of PVC for snapshots.
On top of that, the CSI driver must support snapshot creation and have a relevant csi-snapshotter controller.
How do snapshots work in Kubernetes?
The logic behind their operation is simple. There are several entities involved. The VolumeSnapshotClass describes the parameters for creating snapshots, such as the CSI driver. You can also specify additional settings there, for example, whether the snapshots should be incremental and where they should be stored.
When creating a VolumeSnapshot, you must specify the PersistentVolumeClaim for which the snapshot will be created.
When a snapshot is taken, a VolumeSnapshotContent resource is created in the cluster and filled in with the snapshot’s parameters (usually the ID of the snapshot in the storage system).
Next, the snapshot-controller binds the VolumeSnapshot to the VolumeSnapshotContent (just like with PV and PVC).
When creating a new PersistentVolumeClaim, you can set the previously created VolumeSnapshot as its dataSource to populate the new volume with the snapshot’s data.
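To make the binding step more concrete, here is a rough sketch of what a bound VolumeSnapshot might look like when read back from the cluster. All object names, the timestamp, and the size are hypothetical placeholders; only the field names come from the snapshot.storage.k8s.io/v1 API.

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: example-snapshot                    # hypothetical name
spec:
  volumeSnapshotClassName: example-class    # hypothetical class
  source:
    persistentVolumeClaimName: example-pvc  # hypothetical PVC
status:
  # Populated by the cluster after the snapshot has been taken and bound
  boundVolumeSnapshotContentName: snapcontent-<uid>  # placeholder for the generated name
  readyToUse: true                          # the snapshot can be used as a dataSource
  creationTime: "2024-01-01T00:00:00Z"      # placeholder timestamp
  restoreSize: 10Gi                         # placeholder size
```

Once readyToUse is true, the snapshot can be referenced from a new PVC.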
Configuration
VolumeSnapshotClass allows you to specify various VolumeSnapshot attributes, such as the CSI driver name and additional cloud provider/data storage-related parameters. Provided below are the links to several examples of VolumeSnapshotClass resource definitions:
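As a minimal sketch, a VolumeSnapshotClass can be as short as the definition below. It assumes the LINSTOR CSI driver and the class name linstor that appear in the examples later in this article; the deletionPolicy value is an illustrative choice, and your driver may require extra parameters.

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: linstor                   # referenced later via volumeSnapshotClassName
driver: linstor.csi.linbit.com    # CSI driver that will take the snapshots
deletionPolicy: Delete            # or Retain, to keep the underlying snapshot
                                  # when the VolumeSnapshot object is deleted
```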
Once the VolumeSnapshotClass has been created, you can start taking snapshots. Let’s take a look at some typical use cases.
Case No. 1: PVC templates
Suppose we want a PVC template that contains data and can be cloned whenever we need it. This might come in handy for:
- quickly creating development environments with data;
- processing data simultaneously with multiple Pods on different nodes.
The trick is to create a standard PVC, fill it with the data you want, and then create another PVC with the original one set as its source:
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-worker1
spec:
  storageClassName: linstor-ssd-lvmthin-r2
  dataSource:
    name: pvc-template
    kind: PersistentVolumeClaim
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
You will get a complete clone of the original PVC with all the data, which you can use right away. The snapshot mechanism is completely transparent here, so we didn’t even have to use any of the resources described above.
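For completeness, the template itself is an ordinary PVC. Here is a minimal sketch, assuming it lives in the same namespace as the clone (Kubernetes only clones PVCs within one namespace) and uses the same storage class and size as the clone above:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-template              # the name the clone refers to in dataSource
spec:
  storageClassName: linstor-ssd-lvmthin-r2
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi               # a clone must request at least this much
```

Fill this volume with data once (for example, from a one-off Pod), and then create as many clones as you need.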
Case No. 2: Snapshots for testing
This case shows how you can safely model a database migration on live data without interfering with production.
We have to clone the existing PVC that our application uses (just like in the example above) and then deploy the new app version with the cloned PVC to test the upgrade. If you encounter a problem, you can create a new clone and try again.
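The test setup might look roughly like the sketch below. It assumes the production PVC is called mypvc (the name used in the examples in this section); the clone name, Pod name, image, and mount path are hypothetical and only serve as an illustration.

```yaml
---
# Clone of the production volume, used only for the test run
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypvc-upgrade-test        # hypothetical name
spec:
  storageClassName: linstor-ssd-lvmthin-r2
  dataSource:
    name: mypvc                   # the PVC used by the production application
    kind: PersistentVolumeClaim
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
# New application version running against the clone
apiVersion: v1
kind: Pod
metadata:
  name: myapp-upgrade-test        # hypothetical name
spec:
  containers:
    - name: app
      image: registry.example.com/myapp:v2   # hypothetical image
      volumeMounts:
        - name: data
          mountPath: /data        # hypothetical mount path
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: mypvc-upgrade-test
```

If the migration fails, delete both objects and recreate them to start again from a fresh copy of the production data.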
When testing is complete, the new version of the application can be deployed to production. But first, create a mypvc-before-upgrade snapshot so you can always revert to the pre-upgrade state. Snapshots are created using the VolumeSnapshot resource. In it, specify the PVC you want to take a snapshot of:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: mypvc-before-upgrade
spec:
  volumeSnapshotClassName: linstor
  source:
    persistentVolumeClaimName: mypvc
After switching to the new version, you can always revert to the pre-upgrade state by recreating the PVC with the mypvc-before-upgrade snapshot as its source:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypvc
spec:
  storageClassName: linstor-ssd-lvmthin-r2
  dataSource:
    name: mypvc-before-upgrade
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
Case No. 3: Using snapshots to do consistent backups
Snapshots are integral to creating consistent backups in a running environment. Without them, there is no way to back up a PVC consistently without first pausing the application.
If you attempt to copy the entire volume while the application is running, there is a high probability that some parts of it will be overwritten during the copy, leaving the backup inconsistent. To avoid this, you can take a snapshot and back up from it instead.
There are various backup tools for Kubernetes that respect the logic of your application and/or use the snapshot mechanism. One of them, Velero, allows you to automate snapshot usage, run additional hooks to flush data to disk, and suspend/resume the application for better backup consistency.
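As an illustration, the sketch below combines a Velero Schedule with backup hooks set through Pod annotations. It assumes Velero is installed in the velero namespace and that the application runs in the default namespace; the Pod name, image, mount path, cron expression, and retention period are hypothetical, and freezing the file system requires a privileged container.

```yaml
---
# Take a backup (including volume snapshots) every night at 02:00
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: nightly-db-backup         # hypothetical name
  namespace: velero               # assumes Velero runs in this namespace
spec:
  schedule: "0 2 * * *"
  template:
    includedNamespaces:
      - default                   # assumed application namespace
    snapshotVolumes: true
    ttl: 168h0m0s                 # keep each backup for 7 days
---
# Hooks that freeze the file system before the backup and unfreeze it after
apiVersion: v1
kind: Pod
metadata:
  name: mydb                      # hypothetical name
  annotations:
    pre.hook.backup.velero.io/container: db
    pre.hook.backup.velero.io/command: '["/sbin/fsfreeze", "--freeze", "/var/lib/data"]'
    post.hook.backup.velero.io/container: db
    post.hook.backup.velero.io/command: '["/sbin/fsfreeze", "--unfreeze", "/var/lib/data"]'
spec:
  containers:
    - name: db
      image: registry.example.com/mydb:latest   # hypothetical image
      securityContext:
        privileged: true          # fsfreeze needs elevated privileges
      volumeMounts:
        - name: data
          mountPath: /var/lib/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: db-data
```

In practice, the hook command would more likely be an application-specific flush or lock command than a raw fsfreeze.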
At the same time, some vendors provide built-in backup functionality. For example, LINSTOR allows you to upload snapshots to a remote S3 server automatically and supports both full and incremental backups.
To benefit from this feature, you will need to create a dedicated VolumeSnapshotClass containing all the necessary parameters for accessing the remote S3 server:
---
kind: VolumeSnapshotClass
apiVersion: snapshot.storage.k8s.io/v1
metadata:
  name: linstor-minio
driver: linstor.csi.linbit.com
deletionPolicy: Retain
parameters:
  snap.linstor.csi.linbit.com/type: S3
  snap.linstor.csi.linbit.com/remote-name: minio
  snap.linstor.csi.linbit.com/allow-incremental: "false"
  snap.linstor.csi.linbit.com/s3-bucket: foo
  snap.linstor.csi.linbit.com/s3-endpoint: XX.XXX.XX.XXX.nip.io
  snap.linstor.csi.linbit.com/s3-signing-region: minio
  snap.linstor.csi.linbit.com/s3-use-path-style: "true"
  csi.storage.k8s.io/snapshotter-secret-name: linstor-minio
  csi.storage.k8s.io/snapshotter-secret-namespace: minio
---
kind: Secret
apiVersion: v1
metadata:
  name: linstor-minio
  namespace: minio
immutable: true
type: linstor.csi.linbit.com/s3-credentials.v1
stringData:
  access-key: minio
  secret-key: minio123
The newly created snapshots will now be pushed to the remote S3 server:
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: mydb-backup1
spec:
  volumeSnapshotClassName: linstor-minio
  source:
    persistentVolumeClaimName: db-data
The interesting thing is that you can use them in a different Kubernetes cluster. To do so, you will have to define VolumeSnapshotContent and VolumeSnapshot in addition to VolumeSnapshotClass:
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotContent
metadata:
  name: example-backup-from-s3
spec:
  deletionPolicy: Delete
  driver: linstor.csi.linbit.com
  source:
    snapshotHandle: snapshot-0a829b3f-9e4a-4c4e-849b-2a22c4a3449a
  volumeSnapshotClassName: linstor-minio
  volumeSnapshotRef:
    apiVersion: snapshot.storage.k8s.io/v1
    kind: VolumeSnapshot
    name: example-backup-from-s3
    namespace: new-cluster
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: example-backup-from-s3
  namespace: new-cluster  # must match volumeSnapshotRef.namespace above
spec:
  source:
    volumeSnapshotContentName: example-backup-from-s3
  volumeSnapshotClassName: linstor-minio
Note that you will have to specify the storage system’s snapshot ID in VolumeSnapshotContent by passing it via the snapshotHandle parameter.
Now you can create a new PVC using the backup snapshot as the data source:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restored-data
  namespace: new-cluster
spec:
  storageClassName: linstor-ssd-lvmthin-r2
  dataSource:
    name: example-backup-from-s3
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
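To check the restored data, you could mount the new PVC into a throwaway Pod. The sketch below is only an illustration: the Pod name, image, and mount path are hypothetical, while the namespace and claim name match the PVC above.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: restore-check             # hypothetical name
  namespace: new-cluster
spec:
  containers:
    - name: shell
      image: busybox              # any image with a shell will do
      command: ["sleep", "3600"]  # keep the Pod alive for inspection
      volumeMounts:
        - name: data
          mountPath: /data        # hypothetical mount path
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: restored-data
```

Once the Pod is running, the restored files should be available under /data.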
Conclusion
With snapshots, you can make more efficient use of your storage solution by creating consistent backups and cloning volumes. They also allow you to avoid having to duplicate your data when this is not necessary. Here’s to snapshots making your life easier and better!