Migrate Data Volumes to New Clusters
There are certain cases where you may want to migrate existing volumes to a new cluster. If so, read on for an in depth look at the steps required.
Configure your PostgresCluster CRD
In order to use existing pgData, pg_wal or pgBackRest repo volumes in a new PostgresCluster, you will need to configure the spec.dataSource.volumes
section of your PostgresCluster CRD. As shown below, there are three possible volumes you may configure, pgDataVolume
, pgWALVolume
and pgBackRestVolume
. Under each, you must define the PVC name to use in the new cluster. A directory may also be defined, as needed, for cases where the existing directory name does not match the v5 directory.
To help explain how these fields are used, we will consider a pgcluster
from PGO v4, oldhippo
. We will assume that the pgcluster
has been deleted and only the PVCs have been left in place.
Please note that any differences in configuration or other datasources will alter this procedure significantly and that certain storage options require additional steps (see Considerations below)!
In a standard PGO v4.7 cluster, a primary database pod with a separate pg_wal PVC will mount its pgData PVC, named “oldhippo”, at /pgdata
and its pg_wal PVC, named “oldhippo-wal”, at /pgwal
within the pod’s file system. In this pod, the standard pgData directory will be /pgdata/oldhippo
and the standard pg_wal directory will be /pgwal/oldhippo-wal
. The pgBackRest repo pod will mount its PVC at /backrestrepo
and the repo directory will be /backrestrepo/oldhippo-backrest-shared-repo
.
With the above in mind, we need to reference the three PVCs we wish to migrate in the dataSource.volumes
portion of the PostgresCluster spec. Additionally, to accommodate the PGO v5 file structure, we must also reference the pgData and pgBackRest repo directories. Note that the pg_wal directory does not need to be moved when migrating from v4 to v5!
Now, we just need to populate our CRD with the information described above:
spec:
dataSource:
volumes:
pgDataVolume:
pvcName: oldhippo
directory: oldhippo
pgWALVolume:
pvcName: oldhippo-wal
pgBackRestVolume:
pvcName: oldhippo-pgbr-repo
directory: oldhippo-backrest-shared-repo
Lastly, it is very important that the PostgreSQL version and storage configuration in your PostgresCluster match exactly the existing volumes being used.
If the volumes were used with PostgreSQL 13, the spec.postgresVersion
value should be 13
and the associated spec.image
value should refer to a PostgreSQL 13 image.
Similarly, the configured data volume definitions in your PostgresCluster spec should match your existing volumes. For example, if the existing pgData PVC has a RWO access mode and is 1 Gigabyte, the relevant dataVolumeClaimSpec
should be configured as
dataVolumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 1G
With the above configuration in place, your existing PVC will be used when creating your PostgresCluster. They will be given appropriate Labels and ownership references, and the necessary directory updates will be made so that your cluster is able to find the existing directories.
Considerations
Removing PGO v4 labels
When migrating data volumes from v4 to v5, PGO relabels all volumes for PGO v5, but will not remove existing PGO v4 labels. This results in PVCs that are labeled for both PGO v4 and v5, which can lead to unintended behavior.
To avoid that, you must manually remove the pg-cluster
and vendor
labels, which you can do with a kubectl
command. For instance, given a cluster named hippo
with a dedicated pgBackRest repo, the PVC will be hippo-pgbr-repo
, and the PGO v4 labels can be removed with the below command:
kubectl label pvc hippo-pgbr-repo \
pg-cluster- \
vendor-
Proper file permissions for certain storage options
Additional steps are required to set proper file permissions when using certain storage options, such as NFS and HostPath storage due to a known issue with how fsGroups are applied.
When migrating from PGO v4, this will require the user to manually set the group value of the pgBackRest repo directory, and all subdirectories, to 26
to match the postgres
group used in PGO v5. Please see here for more information.
Additional Considerations
- An existing pg_wal volume is not required when the pg_wal directory is located on the same PVC as the pgData directory.
- When using existing pg_wal volumes, an existing pgData volume must also be defined to ensure consistent naming and proper bootstrapping.
- When migrating from PGO v4 volumes, it is recommended to use the most recently available version of PGO v4.
- As there are many factors that may impact this procedure, it is strongly recommended that a test run be completed beforehand to ensure successful operation.
Putting it all together
Now that we’ve identified all of our volumes and required directories, we’re ready to create our new cluster!
Below is a complete PostgresCluster that includes everything we’ve talked about. After your PostgresCluster
is created, you should remove the spec.dataSource.volumes
section.
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
name: oldhippo
spec:
image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-14.9-0
postgresVersion: 14
dataSource:
volumes:
pgDataVolume:
pvcName: oldhippo
directory: oldhippo
pgWALVolume:
pvcName: oldhippo-wal
pgBackRestVolume:
pvcName: oldhippo-pgbr-repo
directory: oldhippo-backrest-shared-repo
instances:
- name: instance1
dataVolumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 1G
walVolumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 1G
backups:
pgbackrest:
image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.47-0
repos:
- name: repo1
volume:
volumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 1G