Administrative Tasks
Manually Restarting PostgreSQL
There are times when you might need to manually restart PostgreSQL. This can be done by adding or updating a custom annotation to the cluster's spec.metadata.annotations section. PGO will notice the change and perform a rolling restart.
For example, if you have a cluster named hippo in the namespace postgres-operator, all you need to do is patch the hippo PostgresCluster. In Bash, you can use the following:
kubectl patch postgrescluster/hippo -n postgres-operator --type merge --patch '{"spec":{"metadata":{"annotations":{"restarted":"'"$(date)"'"}}}}'
In PowerShell (where single-quoted strings do not expand subexpressions, so the date is concatenated in), you would use:
kubectl patch postgrescluster/hippo -n postgres-operator --type merge --patch ('{\"spec\":{\"metadata\":{\"annotations\":{\"restarted\":\"' + (Get-Date) + '\"}}}}')
Watch your hippo cluster: you will see the rolling update has been triggered and the restart has begun.
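One way to watch the Pods as the rolling restart proceeds, using the cluster label selector shown elsewhere in this guide:
kubectl get pods -n postgres-operator --selector=postgres-operator.crunchydata.com/cluster=hippo --watch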
Shutdown
You can shut down a Postgres cluster by setting the spec.shutdown attribute to true. You can do this by editing the manifest or, in the case of the hippo cluster, by executing a command like the one below:
kubectl patch postgrescluster/hippo -n postgres-operator --type merge --patch '{"spec":{"shutdown": true}}'
In PowerShell, you would execute:
kubectl patch postgrescluster/hippo -n postgres-operator --type merge --patch '{\"spec\":{\"shutdown\": true}}'
The effect of this is that all the Kubernetes workloads for this cluster are scaled to 0. You can verify this with the following command:
kubectl get deploy,sts,cronjob --selector=postgres-operator.crunchydata.com/cluster=hippo
NAME                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/hippo-pgbouncer   0/0     0            0           1h

NAME                             READY   AGE
statefulset.apps/hippo-00-lwgx   0/0     1h

NAME                               SCHEDULE   SUSPEND   ACTIVE
cronjob.batch/hippo-repo1-full     @daily     True      0
To turn a Postgres cluster that is shut down back on, you can set spec.shutdown to false.
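For example, mirroring the shutdown patch above:
kubectl patch postgrescluster/hippo -n postgres-operator --type merge --patch '{"spec":{"shutdown": false}}'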
Pausing Reconciliation and Rollout
You can pause the Postgres cluster reconciliation process by setting the spec.paused attribute to true. You can do this by editing the manifest or, in the case of the hippo cluster, by executing a command like the one below:
kubectl patch postgrescluster/hippo -n postgres-operator --type merge --patch '{"spec":{"paused": true}}'
In PowerShell, you would execute:
kubectl patch postgrescluster/hippo -n postgres-operator --type merge --patch '{\"spec\":{\"paused\": true}}'
Pausing a cluster will suspend any changes to the cluster’s current state until reconciliation is resumed. This allows you to fully control when changes to the PostgresCluster spec are rolled out to the Postgres cluster. While paused, no statuses are updated other than the "Progressing" condition.
To resume reconciliation of a Postgres cluster, you can either set spec.paused to false or remove the setting from your manifest.
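For example, mirroring the pause patch above:
kubectl patch postgrescluster/hippo -n postgres-operator --type merge --patch '{"spec":{"paused": false}}'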
Rotating TLS Certificates
Credentials should be invalidated and replaced (rotated) as often as possible to minimize the risk of their misuse. Unlike passwords, every TLS certificate has an expiration, so replacing them is inevitable.
In fact, PGO automatically rotates the client certificates that it manages before their expiration dates. A new client certificate is generated after two-thirds of the certificate's lifetime has elapsed; for instance, a PGO-created certificate that expires 12 months after issuance will be replaced by PGO around the eight-month mark. This is done so that you do not have to worry about problems or interruptions of service from an expired certificate.
Triggering a Certificate Rotation
If you want to rotate a single client certificate, you can regenerate the certificate of an existing cluster by deleting the tls.key field from its certificate Secret.
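As a sketch, assuming the cluster's certificate Secret is named hippo-cluster-cert (the name of the Secret PGO manages for your cluster may differ), a JSON patch can remove that field:
kubectl patch secret hippo-cluster-cert -n postgres-operator --type json --patch '[{"op":"remove","path":"/data/tls.key"}]'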
Is it time to rotate your PGO root certificate? All you need to do is delete the pgo-root-cacert secret. PGO will regenerate it and roll it out seamlessly, ensuring your apps continue communicating with the Postgres cluster without having to update any configuration or deal with any downtime.
kubectl delete secret pgo-root-cacert
Info
PGO only updates secrets containing the generated root certificate. It does not touch custom certificates.
Rotating Custom TLS Certificates
When you use your own TLS certificates with PGO, you are responsible for replacing them appropriately. Here's how.
PGO automatically detects and loads changes to the contents of PostgreSQL server and replication Secrets without downtime. You or your certificate manager need only replace the values in the Secret referenced by spec.customTLSSecret.
If instead you change spec.customTLSSecret to refer to a new Secret or new fields, PGO will perform a rolling restart.
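As a minimal sketch of replacing the contents in place, assuming the Secret referenced by spec.customTLSSecret is named hippo.tls and the renewed ca.crt, server.crt, and server.key files are in the current directory (all names illustrative):
kubectl create secret generic hippo.tls -n postgres-operator --from-file=ca.crt=ca.crt --from-file=tls.crt=server.crt --from-file=tls.key=server.key --dry-run=client -o yaml | kubectl apply -f -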
Info
When changing the PostgreSQL certificate authority, make sure to update customReplicationTLSSecret as well.
PGO automatically notifies PgBouncer when there are changes to the contents of PgBouncer certificate Secrets. Recent PgBouncer versions load those changes without downtime, but versions prior to 1.16.0 need to be restarted manually. There are a few ways to restart an older version of PgBouncer to reload Secrets:
- Store the new certificates in a new Secret. Edit the PostgresCluster object to refer to the new Secret, and PGO will perform a rolling restart of PgBouncer.
spec:
  proxy:
    pgBouncer:
      customTLSSecret:
        name: hippo.pgbouncer.new.tls
or
- Replace the old certificates in the current Secret. PGO does not perform a rolling restart when the contents of the Secret change, so you need to trigger a rolling restart of PgBouncer yourself. Edit the PostgresCluster object to add a unique annotation. The name and value are up to you, so long as the value differs from the previous value.
spec:
  proxy:
    pgBouncer:
      metadata:
        annotations:
          restarted: Q1-certs
This kubectl patch command uses your local date and time. In Bash:
kubectl patch postgrescluster/hippo -n postgres-operator --type merge --patch '{"spec":{"proxy":{"pgBouncer":{"metadata":{"annotations":{"restarted":"'"$(date)"'"}}}}}}'
In PowerShell:
kubectl patch postgrescluster/hippo -n postgres-operator --type merge --patch ('{\"spec\":{\"proxy\":{\"pgBouncer\":{\"metadata\":{\"annotations\":{\"restarted\":\"' + (Get-Date) + '\"}}}}}}')
Changing the Primary
There may be times when you want to change the primary in your HA cluster. This can be done using the patroni.switchover section of the PostgresCluster spec. It allows you to enable switchovers in your PostgresClusters, target a specific instance as the new primary, and run a failover if your PostgresCluster has entered a bad state.
Let's go through the process of performing a switchover!
First you need to update your spec to prepare your cluster to change the primary. Edit your spec to have the following fields:
spec:
  patroni:
    switchover:
      enabled: true
After you apply this change, PGO will be looking for the trigger to perform a switchover in your cluster. You will trigger the switchover by adding the postgres-operator.crunchydata.com/trigger-switchover annotation to your custom resource. The best way to set this annotation is with a timestamp, so you know when you initiated the change.
For example, for our hippo cluster, we can run the following command to trigger the switchover:
kubectl annotate -n postgres-operator postgrescluster hippo postgres-operator.crunchydata.com/trigger-switchover="$(date)"
Hint
If you want to perform another switchover you can re-run the annotation command and add the --overwrite flag:
kubectl annotate -n postgres-operator postgrescluster hippo --overwrite postgres-operator.crunchydata.com/trigger-switchover="$(date)"
PGO will detect this annotation and use the Patroni API to request a change to the current primary!
The roles on your database instance Pods will start changing as Patroni works. The new primary will have the master role label, and the old primary will be updated to replica.
The status of the switchover will be tracked using the status.patroni.switchover field. This will be set to the value defined in your trigger annotation. If you used a timestamp as the annotation value, this is another way to determine when the switchover was requested.
After the instance Pod labels have been updated and status.patroni.switchover has been set, the primary has been changed on your cluster!
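As a quick check, a jsonpath query against the status field named above prints the recorded value:
kubectl get postgrescluster hippo -n postgres-operator -o jsonpath='{.status.patroni.switchover}'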
Info
After changing the primary, we recommend that you disable switchovers by setting
spec.patroni.switchover.enabled to false or remove the field from your spec entirely.
If the field is removed the corresponding status will also be removed from the
PostgresCluster.
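For example, to disable switchovers with the same patch pattern used above:
kubectl patch postgrescluster/hippo -n postgres-operator --type merge --patch '{"spec":{"patroni":{"switchover":{"enabled":false}}}}'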
Targeting an instance
Another option you have when switching the primary is providing a target instance as the new primary. This target instance will be used as the candidate when performing the switchover. The spec.patroni.switchover.targetInstance field takes the name of the instance that you are switching to.
This name can be found in a couple of different places: one is as the name of the StatefulSet and another is on the database Pod as the postgres-operator.crunchydata.com/instance label. The following commands can help you determine which instance is the current primary and what name to use as the targetInstance:
kubectl get pods -l postgres-operator.crunchydata.com/cluster=hippo -L postgres-operator.crunchydata.com/instance -L postgres-operator.crunchydata.com/role
NAME                     READY   STATUS    RESTARTS   AGE     INSTANCE               ROLE
hippo-instance1-jdb5-0   3/3     Running   0          2m47s   hippo-instance1-jdb5   master
hippo-instance1-wm5p-0   3/3     Running   0          2m47s   hippo-instance1-wm5p   replica
In our example cluster, hippo-instance1-jdb5 is currently the primary, meaning we want to target hippo-instance1-wm5p in the switchover. Now that you know which instance is currently the primary and how to find your targetInstance, let's update your cluster spec:
spec:
  patroni:
    switchover:
      enabled: true
      targetInstance: hippo-instance1-wm5p
After applying this change you will once again need to trigger the switchover by annotating the PostgresCluster (see above commands). You can verify the switchover has completed by checking the Pod role labels and status.patroni.switchover.
Failover
Finally, you have the option to fail over when your cluster has entered an unhealthy state. The only spec change necessary to accomplish this is updating the spec.patroni.switchover.type field to the Failover type. Note that a targetInstance is required when performing a failover. Based on the example cluster above, assuming hippo-instance1-wm5p is still a replica, we can update the spec:
spec:
  patroni:
    switchover:
      enabled: true
      targetInstance: hippo-instance1-wm5p
      type: Failover
Apply this spec change and your PostgresCluster will be prepared to perform the failover. Again, you will need to trigger it by annotating the PostgresCluster (see the commands above) and verify that the Pod role labels and status.patroni.switchover are updated accordingly.
Warning
Errors encountered in the switchover process can leave your cluster in a bad state. If you encounter issues, which can be found in the operator logs, you can update the spec to fix the issues and apply the change. Once the change has been applied, PGO will attempt to perform the switchover again.
Configuring Logging Destination
If you need to customize your logging destination for your PostgresCluster's different components, this section will help guide you in what to consider and how to set that configuration.
Note: You can both customize your logging destination and enable CPK's OpenTelemetry Logging solution.
Postgres
default: /pgdata/logs/postgres
To set the Postgres logging destination, you'll want to set the log_directory
value on spec.config.parameters. (This field is where all Postgres config
parameters should be set. For more about custom configuration, check out our
guide to customizing Postgres instances.)
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
spec:
  config:
    parameters:
      log_directory: path/to/directory
The parameters set in spec.config.parameters will be applied to all Postgres instance sets. So when setting a log_directory parameter, be sure that each Postgres instance has writable access to the directory specified.
Info
When using PostgresCluster v1, the API validates that the log_directory
setting exists for each Postgres instance and rejects specs where that
is not the case.
The v1 API also rejects some log_directory values that could disrupt Postgres.
The v1beta1 API accepts values for log_directory that could disrupt Postgres,
but substitutes a safe alternative before Postgres starts. This is reported as
an InvalidParameter event on the PostgresCluster.
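A hedged way to look for those events, filtering on the event reason named above:
kubectl get events -n postgres-operator --field-selector reason=InvalidParameter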
As an example, if you wanted to send your Postgres logs to an additional attached volume, your spec might look something like this:
kind: PostgresCluster
spec:
  config:
    parameters:
      log_directory: /volumes/logging/postgres
  instances:
  - name: instance1
    volumes:
      additional:
      - name: logging
        ...
  - name: instance2
    volumes:
      additional:
      - name: logging
        ...

If one of those instances was missing the volume, that spec would be rejected by the PostgresCluster v1 API.
PgBouncer
default: does not log to file
As discussed in our guide to customizing PgBouncer,
PgBouncer settings can be set through the spec.proxy.pgBouncer.config.global field;
or included as a file in the spec.proxy.pgBouncer.config.files field.
To set the logging destination for PgBouncer, you want to set the logfile
parameter. Unlike Postgres, PgBouncer's log destination parameter doesn't take
a directory, but rather a full file path. For instance, the following is a
valid spec:
kind: PostgresCluster
spec:
  proxy:
    pgBouncer:
      config:
        global:
          logfile: /tmp/logs/pgbouncer/pgbouncer.log

The logfile value must end in .log. In addition, when using
the PostgresCluster v1 API, the logfile destination is validated on admission.
PgBouncer will not start if it cannot write to a log, so having a valid log
destination is crucial to healthy performance.
Since PgBouncer does not have any persistent volumes attached by default, the
only valid values are /tmp/logs/pgbouncer/<YOUR FILENAME>.log or a path
that includes an additional volume that
is attached to PgBouncer. For instance, the following is a valid spec because
the logfile refers to an attached additional volume:
kind: PostgresCluster
spec:
  proxy:
    pgBouncer:
      config:
        global:
          logfile: /volumes/logging/pgbouncer.log
      volumes:
        additional:
        - name: logging
          ...

Info
If OpenTelemetry is enabled, we default to setting PgBouncer to log to /tmp/pgbouncer.log
as noted in the OTel logging guide.
You may bypass CPK API validation by setting the logfile parameter in a config file referenced in spec.proxy.pgBouncer.config.files. In that case, OTel and logrotate may not function correctly, regardless of whether PgBouncer is able to write to the destination specified in the file.
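As a sketch of that configuration, assuming a ConfigMap named pgbouncer-extra containing an INI fragment that sets logfile (both the ConfigMap and its contents are hypothetical):
spec:
  proxy:
    pgBouncer:
      config:
        files:
        - configMap:
            name: pgbouncer-extra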
pgBackRest
A PostgresCluster can run pgBackRest in several different components, including the repo host Pod, the pgBackRest sidecar in the Postgres instance Pod, and the backup Job Pods.
Each component has its own log.path field that controls where pgBackRest logs.
For example, here it is specified for every component:
kind: PostgresCluster
spec:
  backups:
    pgbackrest:
      jobs:
        log:
          path: /job/log/path ①
      log:
        path: /instance-sidecar/log/path ②
      repoHost:
        log:
          path: /repoHost/log/path ③

① Setting the log path for cloud-based backups: spec.backups.pgbackrest.jobs.log.path
default: does not log to file
② Setting the log path for pgBackRest running on the Postgres instance: spec.backups.pgbackrest.log.path
default: /pgdata/pgbackrest/log
③ Setting the log path for the repoHost: spec.backups.pgbackrest.repoHost.log.path
default: /pgbackrest/repoN/log where N is the number of the first volume repo in the repo array
Note: For local volume-based backups (as opposed to cloud-based backups), the backup process runs on the repo host, so the location of the logs will be dictated by the value at ③.
For instance, if you are using an additional volume for your backup jobs and want to log to that volume, you can do so with the following spec, which has both an additional volume specified and a log.path pointing to a directory in that volume:
kind: PostgresCluster
spec:
  backups:
    pgbackrest:
      jobs:
        volumes:
          additional:
          - name: logging
            ...
        log:
          path: /volumes/logging/backup

Next Steps
We've covered a lot in terms of building, maintaining, scaling, customizing, and restarting our Postgres cluster. However, there may come a time when we need to resize our Postgres cluster. How do we do that?