Migrate Data Volumes to New Clusters
There are certain cases where you may want to migrate existing volumes to a new cluster. If so, read on for an in depth look at the steps required.
Configure your PostgresCluster CRD
In order to use existing pgData, pg_wal or pgBackRest repo volumes in a new PostgresCluster, you will need to configure the
spec.dataSource.volumes section of your PostgresCluster CRD. As shown below, there are three possible volumes you may configure,
pgBackRestVolume. Under each, you must define the PVC name to use in the new cluster. A directory may also be defined, as needed, for cases where the existing directory name does not match the v5 directory.
To help explain how these fields are used, we will consider a
pgcluster from PGO v4,
oldhippo. We will assume that the
pgcluster has been deleted and only the PVCs have been left in place.
Please note that any differences in configuration or other datasources will alter this procedure significantly and that certain storage options require additional steps (see Considerations below)!
In a standard PGO v4.7 cluster, a primary database pod with a separate pg_wal PVC will mount its pgData PVC, named “oldhippo”, at
/pgdata and its pg_wal PVC, named “oldhippo-wal”, at
/pgwal within the pod’s file system. In this pod, the standard pgData directory will be
/pgdata/oldhippo and the standard pg_wal directory will be
/pgwal/oldhippo-wal. The pgBackRest repo pod will mount its PVC at
/backrestrepo and the repo directory will be
With the above in mind, we need to reference the three PVCs we wish to migrate in the
dataSource.volumes portion of the PostgresCluster spec. Additionally, to accommodate the PGO v5 file structure, we must also reference the pgData and pgBackRest repo directories. Note that the pg_wal directory does not need to be moved when migrating from v4 to v5!
Now, we just need to populate our CRD with the information described above:
spec: dataSource: volumes: pgDataVolume: pvcName: oldhippo directory: oldhippo pgWALVolume: pvcName: oldhippo-wal pgBackRestVolume: pvcName: oldhippo-pgbr-repo directory: oldhippo-backrest-shared-repo
Lastly, it is very important that the PostgreSQL version and storage configuration in your PostgresCluster match exactly the existing volumes being used.
If the volumes were used with PostgreSQL 13, the
spec.postgresVersion value should be
13 and the associated
spec.image value should refer to a PostgreSQL 13 image.
Similarly, the configured data volume definitions in your PostgresCluster spec should match your existing volumes. For example, if the existing pgData PVC has a RWO access mode and is 1 Gigabyte, the relevant
dataVolumeClaimSpec should be configured as
dataVolumeClaimSpec: accessModes: - "ReadWriteOnce" resources: requests: storage: 1G
With the above configuration in place, your existing PVC will be used when creating your PostgresCluster. They will be given appropriate Labels and ownership references, and the necessary directory updates will be made so that your cluster is able to find the existing directories.
Removing PGO v4 labels
When migrating data volumes from v4 to v5, PGO relabels all volumes for PGO v5, but will not remove existing PGO v4 labels. This results in PVCs that are labeled for both PGO v4 and v5, which can lead to unintended behavior.
To avoid that, you must manually remove the
vendor labels, which you can do with a
kubectl command. For instance, given a cluster named
hippo with a dedicated pgBackRest repo, the PVC will be
hippo-pgbr-repo, and the PGO v4 labels can be removed with the below command:
kubectl label pvc hippo-pgbr-repo \ pg-cluster- \ vendor-
Proper file permissions for certain storage options
Additional steps are required to set proper file permissions when using certain storage options, such as NFS and HostPath storage due to a known issue with how fsGroups are applied.
When migrating from PGO v4, this will require the user to manually set the group value of the pgBackRest repo directory, and all subdirectories, to
26 to match the
postgres group used in PGO v5. Please see here for more information.
- An existing pg_wal volume is not required when the pg_wal directory is located on the same PVC as the pgData directory.
- When using existing pg_wal volumes, an existing pgData volume must also be defined to ensure consistent naming and proper bootstrapping.
- When migrating from PGO v4 volumes, it is recommended to use the most recently available version of PGO v4.
- As there are many factors that may impact this procedure, it is strongly recommended that a test run be completed beforehand to ensure successful operation.
Putting it all together
Now that we’ve identified all of our volumes and required directories, we’re ready to create our new cluster!
Below is a complete PostgresCluster that includes everything we’ve talked about. After your
PostgresCluster is created, you should remove the
apiVersion: postgres-operator.crunchydata.com/v1beta1 kind: PostgresCluster metadata: name: oldhippo spec: image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-14.5-1 postgresVersion: 14 dataSource: volumes: pgDataVolume: pvcName: oldhippo directory: oldhippo pgWALVolume: pvcName: oldhippo-wal pgBackRestVolume: pvcName: oldhippo-pgbr-repo directory: oldhippo-backrest-shared-repo instances: - name: instance1 dataVolumeClaimSpec: accessModes: - "ReadWriteOnce" resources: requests: storage: 1G walVolumeClaimSpec: accessModes: - "ReadWriteOnce" resources: requests: storage: 1G backups: pgbackrest: image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.40-1 repos: - name: repo1 volume: volumeClaimSpec: accessModes: - "ReadWriteOnce" resources: requests: storage: 1G