Additional Volumes

Info

FEATURE AVAILABILITY: Available in v6.0.0 and above

The volumes.additional field lets you attach existing Persistent Volume Claims (PVCs) to any workload PGO deploys. We will go into detail below about adding PVCs to the different workloads (Postgres, pgBackRest, etc.). The volumes.additional field is a list, so multiple PVCs can be attached, each mounted at its own directory. Each item in the list has the following fields:

  • the claimName, which must be the exact name of an existing Persistent Volume Claim and is used to mount the volume;
  • the name, which determines the mount path for the volume under the /volumes path and must be unique within the list;
  • the containers array, which is an optional list of containers to mount the volume to. The volume can be mounted to any container or initContainer, including containers defined as custom sidecars. If the containers field is omitted, the volume is mounted to all containers in the pod. If the containers field is an empty list, the volume is not mounted to any container, but is still available on the pod. See the example below for a demonstration of these options;
  • the readOnly optional field, which, if set to true, mounts the volume read-only.

For instance, take the following spec:

volumes:
  additional:
    - claimName: pv-claim-1
      name: logging
      containers: [database]
    - claimName: pv-claim-2
      name: secrets
      readOnly: true
    - claimName: pv-claim-3
      name: manual
      containers: []

This spec assumes that the three PVCs (pv-claim-1, pv-claim-2, and pv-claim-3) already exist. This spec requests that the operator

  • mount pv-claim-1 to the path /volumes/logging on the database container;
  • mount pv-claim-2 to the path /volumes/secrets as a read-only volume on all containers in the pod;
  • and mount pv-claim-3 to the pod, but not to any containers.

That general form is the same for adding Persistent Volume Claims to Postgres, PgBouncer, and pgBackRest pods. Below we detail where the field goes for each of these workloads.

Configuring Your PostgresCluster Spec

Here is the general structure of the PostgresCluster spec with all of the various places that the volumes.additional field can be added:

kind: PostgresCluster
spec:
  backups:
    pgbackrest:
      jobs:
        volumes:
          additional: []
      repoHost:
        volumes:
          additional: []
      restore:
        volumes:
          additional: []
  dataSource:
    pgbackrest:
      volumes:
        additional: []
    postgresCluster:
      volumes:
        additional: []
  instances:
    - volumes:
        additional: []
  proxy:
    pgBouncer:
      volumes:
        additional: []

Let's break it down and go through each section one by one.

spec.backups.pgbackrest

This section covers most of the configuration for your cluster's disaster recovery, including backups, repo hosts, and in-place restores.

To mount additional volumes to all backup jobs, manual and automated, add the volumes.additional field to spec.backups.pgbackrest.jobs.
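
For example, a minimal sketch that mounts a PVC to every backup job (the PVC name backup-job-logs is an assumption for illustration):

```yaml
spec:
  backups:
    pgbackrest:
      jobs:
        volumes:
          additional:
            # "backup-job-logs" is a hypothetical pre-existing PVC;
            # it will be mounted at /volumes/job-logs in each backup job
            - name: job-logs
              claimName: backup-job-logs
```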

To add additional volumes to your repo host, which will be present if you are using a Kubernetes Volume for at least one pgBackRest repository, you would add the volumes.additional field to spec.backups.pgbackrest.repoHost.
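
A sketch for the repo host, assuming a hypothetical pre-existing PVC named repo-host-extra:

```yaml
spec:
  backups:
    pgbackrest:
      repoHost:
        volumes:
          additional:
            # "repo-host-extra" is a hypothetical PVC,
            # mounted at /volumes/extra on the repo host pod
            - name: extra
              claimName: repo-host-extra
```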

To add additional volumes to restore jobs that are doing in-place restores, you would add the volumes.additional field to spec.backups.pgbackrest.restore. Non-in-place restores are discussed in the next section.
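
A sketch for an in-place restore, assuming a hypothetical PVC named restore-logs (the surrounding restore fields shown here are illustrative):

```yaml
spec:
  backups:
    pgbackrest:
      restore:
        enabled: true
        repoName: repo1
        volumes:
          additional:
            # "restore-logs" is a hypothetical pre-existing PVC
            - name: logging
              claimName: restore-logs
```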

spec.dataSource

This section is used when you want to clone a PostgresCluster or perform a point-in-time recovery. In either case, a new PostgresCluster is created and a restore job is spun up to perform the clone or recovery, and a data source is required to provide the new cluster with the appropriate data. You must choose between using a backup in a cloud-based repo (S3, GCS, or Azure) or an existing PostgresCluster as the data source, which is why this section has two separate fields: spec.dataSource.pgbackrest and spec.dataSource.postgresCluster.

If you are using a backup in a cloud-based repo as the data source, and wish to add additional volumes to the restore job, you would add the volumes.additional field to spec.dataSource.pgbackrest.
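
A sketch of this case, assuming a hypothetical PVC named restore-logs (the cloud repo configuration itself is elided):

```yaml
spec:
  dataSource:
    pgbackrest:
      stanza: db
      repo:
        name: repo1
        # cloud repo configuration (s3/gcs/azure) elided
      volumes:
        additional:
          # "restore-logs" is a hypothetical pre-existing PVC
          - name: logging
            claimName: restore-logs
```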

If you are using an existing PostgresCluster as the data source, and wish to add additional volumes to the restore job, you would add the volumes.additional field to spec.dataSource.postgresCluster.
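
A sketch of this case, assuming a hypothetical source cluster named hippo and a hypothetical PVC named clone-logs:

```yaml
spec:
  dataSource:
    postgresCluster:
      clusterName: hippo   # existing source cluster (assumed name)
      repoName: repo1
      volumes:
        additional:
          # "clone-logs" is a hypothetical pre-existing PVC
          - name: logging
            claimName: clone-logs
```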

spec.instances

This section of the spec is used to configure the Postgres pods that hold the database, along with their various sidecar containers. To add additional volumes to a particular Postgres instance, you would add the volumes.additional field to the corresponding item in the spec.instances list. If the instance has multiple replicas, the volumes will be added to each Postgres pod in that instance.
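
For example, a sketch that mounts a hypothetical PVC named instance-logs to only the database container of one instance:

```yaml
spec:
  instances:
    - name: instance1
      replicas: 2
      volumes:
        additional:
          # "instance-logs" is a hypothetical pre-existing PVC, mounted
          # at /volumes/logging on the database container of each replica
          - name: logging
            claimName: instance-logs
            containers: [database]
```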

spec.proxy.pgBouncer

This section covers configuration for pgBouncer, the key component in our connection pooling feature. To add additional volumes to your pgBouncer pods, you would add the volumes.additional field to spec.proxy.pgBouncer. If pgBouncer has multiple replicas, the volumes will be added to each replica pod.
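
A sketch for pgBouncer, assuming a hypothetical pre-existing PVC named pgbouncer-extra:

```yaml
spec:
  proxy:
    pgBouncer:
      replicas: 2
      volumes:
        additional:
          # "pgbouncer-extra" is a hypothetical PVC, mounted read-only
          # at /volumes/extras on every pgBouncer replica pod
          - name: extras
            claimName: pgbouncer-extra
            readOnly: true
```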

Configuring Your pgAdmin Spec

Here is the general structure of the PGAdmin spec with the volumes.additional field added:

kind: PGAdmin
spec:
  volumes:
    additional: []

spec

To add additional volumes to your pgAdmin pods, you would add the volumes.additional field to spec.
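
A concrete sketch, assuming a hypothetical pre-existing PVC named pgadmin-shared:

```yaml
kind: PGAdmin
spec:
  volumes:
    additional:
      # "pgadmin-shared" is a hypothetical PVC,
      # mounted at /volumes/shared on the pgAdmin pods
      - name: shared
        claimName: pgadmin-shared
```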

Transitioning from the Cloud Log Volume Annotation

In v5.8.3 and v5.7.7 we introduced a method for persisting the pgBackRest logs from backup jobs that push to cloud-based repos (S3, GCS, or Azure). If you are currently using this method to persist cloud backup logs, but would like to move to the Additional Volumes method for consistency with your other logging volumes, follow the rest of this guide to do so.

Updating the Spec

You will need to add an additional volume entry and a log path under your spec.backups.pgbackrest.jobs section; however, the way in which you set the additional volume's name and claimName and the log.path will be determined by how you want to transition.

If you wish to use the same PVC and log path as before, you should use the name of your PVC as the name and claimName of the additional volume and set the log.path to "/volumes/your-pvc-name". For example, if the PVC used by the annotation method was named "cloud-backup-logs", your spec would look like:

spec:
  backups:
    pgbackrest:
      jobs:
        log:
          path: "/volumes/cloud-backup-logs"
        volumes:
          additional:
          - name: cloud-backup-logs
            claimName: cloud-backup-logs

If you want to use the same PVC, but change the log path, you have two options. If you want to keep the same parent directory, but move the logs to a new subdirectory, you can use the same setup as above, but add a subdirectory to the path:

spec:
  backups:
    pgbackrest:
      jobs:
        log:
          path: "/volumes/cloud-backup-logs/new-log-dir"
        volumes:
          additional:
          - name: cloud-backup-logs
            claimName: cloud-backup-logs

However, if you want to change the parent directory, you will need to use a different name and path. For example:

spec:
  backups:
    pgbackrest:
      jobs:
        log:
          path: "/volumes/new-volume-name/log"
        volumes:
          additional:
          - name: new-volume-name
            claimName: cloud-backup-logs

Note that the path must always start with "/volumes/" followed by the name of the desired additional volume. If you take either of these routes, keeping the same PVC that was used with the annotation method but changing the log path, the change only affects newly created logs. Any preexisting logs will not be moved to the new log directory.

Lastly, if you want to switch to a different PVC than what you were using with the annotation method, you should set the claimName to the name of the new PVC and set name and path as you wish. For example:

spec:
  backups:
    pgbackrest:
      jobs:
        log:
          path: "/volumes/my-new-volume/log"
        volumes:
          additional:
          - name: my-new-volume
            claimName: my-new-pvc

Again, this change to using a new PVC only affects newly created logs. Logs on the old PVC will not be moved to the new PVC.

Removing the Annotation

Once you have applied the updated manifest, any new cloud backup jobs will mount volumes and create pgBackRest logs according to the spec; however, we recommend removing the annotation to avoid any confusion around volumes and logging. You can do this with the following command:

kubectl annotate -n postgres-operator postgrescluster hippo postgres-operator.crunchydata.com/pgbackrest-cloud-log-volume-

What happens if you leave the annotation in place depends on how you named the additional volume:

  • If the additional volume name is the same as the old PVC name and you do not remove the annotation, you will start seeing "DuplicateCloudBackupVolume" warning events on your PostgresCluster; only the additional volume will be mounted into the backup jobs.
  • If the additional volume name differs from the old PVC name and you do not remove the annotation, both the additional volume and the PVC specified by the annotation will be mounted into the backup jobs.
  • If for some reason you do not set log.path while the annotation is still present, the annotation method's log path will be used rather than defaulting to turning off logging.