Optional Backups
Info
FEATURE AVAILABILITY: Available in v5.7.0 and above
Because Crunchy Postgres for Kubernetes (CPK) was originally designed for production use, disaster recovery was built in from day one. This was achieved largely by requiring backups.
However, there are use-cases where you may not want backups. For instance, you might want to start up a temporary PostgresCluster for testing purposes and not want to dedicate resources to backups.
For this use-case and others, CPK v5.7+ allows backups to be turned on or off for each PostgresCluster.
Running without backups: a few considerations
Running a PostgresCluster without backups means some features are no longer available.
First, and most importantly: without backups, there is no practical recovery mechanism. If you run a cluster with backups and accidentally drop an important table, you can restore an older backup and recover that table. If you don't have backups, you don't have that recovery option. For this reason, we really do not recommend running a cluster without backups outside of a few use-cases (temporary test clusters, etc.).
Second, for replicas, a PostgresCluster without backups will use pg_basebackup to initially create the replica and stream additional changes from the primary. Because of this, when starting a replica, it may speed up the process to run checkpoint on the primary first.
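As a rough sketch of that suggestion, the commands below issue a CHECKPOINT on the primary before a replica is added. They assume a cluster named hippo in the current namespace, the operator's cluster and role=master pod labels, and a Postgres container named database; treat all of these as assumptions to adjust for your environment.

```shell
# Assumption: the cluster is named "hippo" and lives in the current namespace.
CLUSTER_NAME=hippo

# Look up the primary pod through the operator's pod labels (assumed label names).
PRIMARY_POD=$(kubectl get pods \
  --selector="postgres-operator.crunchydata.com/cluster=${CLUSTER_NAME},postgres-operator.crunchydata.com/role=master" \
  --output=name)

# Run a checkpoint now, so the checkpoint that pg_basebackup requests
# at startup has less work left to do.
kubectl exec "${PRIMARY_POD}" -c database -- psql -c 'CHECKPOINT;'
```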
Third, you cannot clone a cluster with no backups, since cloning relies on backups. But you can still delete a cluster and retain the pgdata volume and re-use that volume as described in our Data Migration guide.
Fourth, when setting up a standby cluster, you cannot use any repo-based streaming, but you can stream from the primary as described in our streaming tutorial.
Fifth, when monitoring a PostgresCluster without backups, the pgbackrest-related metrics will be blank, as expected.
Optional Backups: a user guide
Starting a PostgresCluster without backups
With CPK v5.7+, nothing has changed about starting a cluster with backups: you need to have a defined spec.backups section in your cluster spec.
In order to start a cluster without backups, you can simply remove the spec.backups section.
The spec.backups section used to be required. If you are running CPK v5.6 or older and omit it, you will get an error from the Kubernetes API saying that the spec is invalid.
However, if you are running CPK v5.7+, a PostgresCluster without a spec.backups field is valid, and will result in a PostgresCluster being created without backups.
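As an illustration, the sketch below writes a minimal manifest that has no spec.backups section. The cluster name hippo, the file name, the Postgres version, and the storage size are placeholder assumptions, not recommended values.

```shell
# Write a minimal PostgresCluster manifest with no spec.backups section.
# All names and sizes here are placeholders for illustration.
cat > hippo-no-backups.yaml <<'EOF'
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: hippo
spec:
  postgresVersion: 16
  instances:
    - name: instance1
      dataVolumeClaimSpec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
EOF
```

Applying this file with kubectl apply -f hippo-no-backups.yaml on CPK v5.7+ should create the cluster without backups; on v5.6 or older the same file would be rejected as invalid.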
Turning on backups
In order to turn on backups when a cluster doesn't have them, you simply need to fill in the spec.backups section with your requirements.
To learn more about backup options, see our tutorial on configuring backups for your Postgres cluster.
Once the spec.backups section is filled in, CPK will start reconciling the required Kubernetes objects for regular backups.
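For example, the sketch below writes a manifest for the same kind of cluster with a spec.backups section filled in. It assumes a cluster named hippo and a single volume-backed pgBackRest repository named repo1; your own requirements (cloud repositories, schedules, retention) will likely differ.

```shell
# Write a PostgresCluster manifest that includes a spec.backups section.
# The cluster name, repo name, and storage sizes are placeholders.
cat > hippo-with-backups.yaml <<'EOF'
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: hippo
spec:
  postgresVersion: 16
  instances:
    - name: instance1
      dataVolumeClaimSpec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
  backups:
    pgbackrest:
      repos:
        - name: repo1
          volume:
            volumeClaimSpec:
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 1Gi
EOF
```

Re-applying the manifest with kubectl apply -f hippo-with-backups.yaml is one way to hand the new section to CPK.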
Turning off backups
Starting a cluster without backups only requires that you remove or leave blank the spec.backups section. But turning off backups requires an additional annotation be added to the PostgresCluster.
Why? Because turning off backups means removing that backup data, and actions that remove data require additional confirmation.
In this case, to confirm that you want your backups removed, add this annotation to your cluster:
postgres-operator.crunchydata.com/authorizeBackupRemoval="true"
A sample command to add this annotation is:
kubectl annotate postgrescluster <CLUSTER_NAME> postgres-operator.crunchydata.com/authorizeBackupRemoval="true"
Adding that annotation to your cluster will remove the backups and all associated Kubernetes objects: the PersistentVolume that held the data, the StatefulSet that represented the repo-host, the RBAC Kubernetes objects that allowed the expected access, etc.
Note: CPK will only remove Kubernetes-local data. If you are using cloud-based backups for a PostgresCluster and you turn off backups for that cluster, CPK will stop backing up to the cloud, but it will not remove the existing cloud-based backups. If you want those removed, you are responsible for cleaning up the storage yourself (for example, the S3 bucket).
If you remove the spec.backups section from a cluster that previously had backups BUT have not yet added the annotation, CPK will pause reconciling that cluster. You can check for this in the cluster status, which will have a message saying that CPK has paused progress on that cluster because the annotation is missing. At this point, you can either add the annotation to remove backups or re-add the spec.backups section.
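One way to look at that status is to describe the cluster, as sketched below. The cluster name hippo is a placeholder, and the exact wording and location of the paused message in the output are not shown here.

```shell
# Assumption: the cluster is named "hippo" and lives in the current namespace.
CLUSTER_NAME=hippo

# Print the PostgresCluster, including its status section and recent events.
kubectl describe postgrescluster "${CLUSTER_NAME}"
```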
Note: After the backups are removed, it is a best practice to remove the annotation. That way, if you turn on and then off backups at a later date, you will have the opportunity to confirm that you want the backups removed.
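A sample command to remove the annotation is sketched below. It relies on the standard kubectl convention that an annotation key followed by a trailing dash removes that annotation; the cluster name hippo is a placeholder.

```shell
# Assumption: the cluster is named "hippo" and lives in the current namespace.
CLUSTER_NAME=hippo

# The trailing "-" after the key tells kubectl to remove the annotation.
kubectl annotate postgrescluster "${CLUSTER_NAME}" \
  postgres-operator.crunchydata.com/authorizeBackupRemoval-
```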
How we achieve this
In order to make backups optional, we made two changes to the operator and the PostgresCluster CRD:
- We made the spec.backups section optional in the CRD.
- CPK now manages the archive_command depending on whether the spec.backups section is present.
By making spec.backups optional, CPK can now add or remove the Kubernetes objects related to backups, just like CPK does with monitoring or other features. (That said, see Turning off backups above for the case where CPK requires additional confirmation to reconcile and remove Kubernetes objects.)
If spec.backups is present, CPK sets the archive_command to the usual pgbackrest command that we use to archive WAL. But if spec.backups is not present, CPK sets the archive_command to a command that automatically returns true. Postgres still attempts to archive each WAL segment as usual, and because the command always reports success, Postgres treats the segment as archived and is free to recycle or remove it. In effect, WAL is discarded instead of being kept for backups.
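To illustrate the mechanism, the sketch below mimics locally how Postgres interprets the exit status of an archive_command: a zero exit status means the segment counts as archived, and anything else means Postgres keeps the segment and retries. The literal value true is a stand-in for the always-succeeding command described above, not necessarily the exact string CPK sets.

```shell
# Stand-in for an archive_command that always succeeds (assumed value).
archive_command='true'

# Postgres runs archive_command through the shell and checks its exit status.
if sh -c "$archive_command"; then
  outcome="archived"   # exit status 0: the segment can be recycled or removed
else
  outcome="retry"      # non-zero: the segment is kept and archiving is retried
fi

echo "Postgres would treat this segment as: $outcome"
```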
We made the decision to change archiving behavior through setting the archive_command since this setting can be changed without restarting the Postgres process. For more on archive_command, see the Postgres docs.