Crunchy Data announces the release of the PostgreSQL Operator 4.6.0 on January 22, 2021. You can get started with the PostgreSQL Operator with the following commands:
kubectl create namespace pgo
kubectl apply -f https://raw.githubusercontent.com/CrunchyData/postgres-operator/v4.6.0/installers/kubectl/postgres-operator.yml
The PostgreSQL Operator is released in conjunction with the Crunchy Container Suite.
The PostgreSQL Operator 4.6.0 release includes the following software version upgrades:
- pgBackRest is now at version 2.31
- pgnodemx is now at version 1.0.3
- Patroni is now at version 2.0.1
- pgBadger is now at version 11.4
The monitoring stack for the PostgreSQL Operator uses upstream components as opposed to repackaging them. These are specified as part of the PostgreSQL Operator Installer. We have tested this release with the following versions of each component:
- Prometheus: 2.24.0
- Grafana: 6.7.5
- Alertmanager: 0.21.0
This release of the PostgreSQL Operator drops support for PostgreSQL 9.5, which goes EOL in February 2021.
PostgreSQL Operator is tested against Kubernetes 1.17 - 1.20, OpenShift 3.11, OpenShift 4.4+, Google Kubernetes Engine (GKE), Amazon EKS, Microsoft AKS, and VMware Enterprise PKS 1.3+, and works on other Kubernetes distributions as well.
Rolling Updates
During the lifecycle of a PostgreSQL cluster, there are certain events that may require a planned restart, such as an update to a “restart required” PostgreSQL configuration setting (e.g. shared_buffers) or a change to a Kubernetes Deployment template (e.g. changing the memory request). Restarts can be disruptive in a high availability deployment, which is why many setups employ a “rolling update” strategy (aka a “rolling restart”) to minimize or eliminate downtime during a planned restart.
Because PostgreSQL is a stateful application, a simple rolling restart strategy will not work: PostgreSQL needs to ensure that there is a primary available that can accept reads and writes. This requires following a method that will minimize the amount of downtime when the primary is taken offline for a restart.
This release introduces a mechanism for the PostgreSQL Operator to perform rolling updates implicitly on certain operations that change the Deployment templates and explicitly through the pgo restart command with the --rolling flag. Some of the operations that will trigger a rolling update include:
- Memory resource adjustments
- CPU resource adjustments
- Custom annotation changes
- Tablespace additions
- Adding/removing the metrics sidecar to a PostgreSQL cluster
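As an illustration only, one common way to order a rolling restart for a primary/replica database is to restart the replicas one at a time, hand the primary role to an updated replica, and restart the former primary last. The sketch below prints such a plan. The ordering is an assumption about a typical approach rather than a confirmed description of the Operator's internals, and the function name rolling_restart_plan and the instance names are hypothetical.

```shell
# Hypothetical helper: prints the order in which instances would be restarted.
# Usage: rolling_restart_plan <primary> <replica>...
rolling_restart_plan() {
  local primary="$1" replica
  shift
  # Replicas go first, one at a time, so the primary keeps accepting reads and writes.
  for replica in "$@"; do
    echo "restart replica ${replica}"
  done
  # Once updated replicas are available, move the primary role to one of them.
  if [ "$#" -gt 0 ]; then
    echo "switch over from ${primary} to $1"
  fi
  # The former primary is restarted last.
  echo "restart former primary ${primary}"
}

rolling_restart_plan hippo hippo-abcd hippo-efgh
```

With the PostgreSQL Operator itself, the explicit trigger is the pgo restart command with the --rolling flag, for example pgo restart hippo --rolling.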
Pod Tolerations
Kubernetes Tolerations can help with the scheduling of Pods to appropriate Nodes based upon the taint values of said Nodes. For example, a Kubernetes administrator may set taints on Nodes to restrict scheduling to just the database workload, and as such, tolerations must be assigned to Pods to ensure they can actually be scheduled on those Nodes.
This release introduces the ability to assign tolerations to PostgreSQL clusters managed by the PostgreSQL Operator. Tolerations can be assigned to every instance in the cluster via the tolerations attribute on a pgclusters.crunchydata.com custom resource, or to individual instances using the tolerations attribute on a pgreplicas.crunchydata.com custom resource.
The pgo create cluster and pgo scale commands support the --toleration flag, which can be used to add one or more tolerations to a cluster. Values accepted by the --toleration flag use the following format:
rule can represent existence (e.g. key) or equality (
Effect is one of
pgo create cluster hippo \
  --toleration=ssd:NoSchedule \
  --toleration=zone=east:NoSchedule
Tolerations can also be added to and removed from an existing cluster using the pgo update cluster command, e.g.:
pgo update cluster hippo \
  --toleration=zone=west:NoSchedule \
  --toleration=zone=east:NoSchedule-
or by modifying the pgclusters.crunchydata.com custom resource directly.
For more information on how tolerations work, please refer to the Kubernetes documentation.
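To make the shape of a --toleration value concrete, here is a small shell sketch that splits one value into its parts. It is not part of pgo: the function name parse_toleration is hypothetical, and the accepted shapes are inferred from the examples in this section (key:Effect, key=value:Effect, and a trailing "-" for removal). The operator names (Exists, Equal) and the three effects are the standard Kubernetes toleration values.

```shell
# Hypothetical helper, not a pgo command: splits one --toleration value into fields.
# Shapes assumed from the examples in this section:
#   key:Effect          existence rule
#   key=value:Effect    equality rule
#   <either form>-      trailing "-" asks pgo update cluster to remove it
parse_toleration() {
  local spec="$1" action="add" rule effect key value operator
  # A trailing "-" marks a toleration to remove.
  case "$spec" in
    *-) action="remove"; spec="${spec%-}" ;;
  esac
  # Require something on both sides of a ":".
  case "$spec" in
    ?*:?*) ;;
    *) echo "invalid toleration: $1" >&2; return 1 ;;
  esac
  # Everything after the last ":" is the effect; the rest is the rule.
  effect="${spec##*:}"
  rule="${spec%:*}"
  case "$effect" in
    NoSchedule|PreferNoSchedule|NoExecute) ;;
    *) echo "unknown effect: ${effect}" >&2; return 1 ;;
  esac
  # A rule containing "=" is an equality rule; otherwise it is an existence rule.
  case "$rule" in
    *=*) operator="Equal";  key="${rule%%=*}"; value="${rule#*=}" ;;
    *)   operator="Exists"; key="$rule";       value="" ;;
  esac
  echo "action=${action} key=${key} operator=${operator} value=${value} effect=${effect}"
}

parse_toleration "ssd:NoSchedule"
parse_toleration "zone=east:NoSchedule-"
```

Whether pgo itself validates effects this strictly is not stated in these notes, so treat the checks above as illustrative.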
Node Affinity Enhancements
Node affinity has been a feature of the PostgreSQL Operator for a long time but has received some significant improvements in this release.
It is now possible to control the node affinity across an entire PostgreSQL cluster as well as individual PostgreSQL instances from a custom resource attribute on the pgclusters.crunchydata.com and pgreplicas.crunchydata.com CRDs. These attributes use the standard Kubernetes specifications for node affinity and should be familiar to users who have had to set this in applications.
Additionally, this release adds support for both “preferred” and “required” node affinity definitions. Previously, one could achieve required node affinity by modifying a template in the pgo-config ConfigMap, but this release makes this process more straightforward.
This release introduces the --node-affinity-type flag for the pgo create cluster, pgo scale, and pgo restore commands that allows one to specify the node affinity type for PostgreSQL clusters and instances. The --node-affinity-type flag accepts values of preferred (default) and required. Each instance in a PostgreSQL cluster will inherit its node affinity type from the cluster (pgo create cluster) itself, but the type of an individual instance (pgo scale) will supersede that value.
--node-affinity-type must be combined with the
TLS for pgBouncer
Since 4.3.0, the PostgreSQL Operator has had support for TLS connections to PostgreSQL clusters and an improved integration with pgBouncer, used for connection pooling and state management. However, the integration with pgBouncer did not support TLS directly: it could be achieved through modifying the pgBouncer Deployment template.
This release brings TLS support for pgBouncer to the PostgreSQL Operator, allowing for communication over TLS between a client and pgBouncer, and pgBouncer and a PostgreSQL server. That is, the following is now supported:
Client <= TLS => pgBouncer <= TLS => PostgreSQL
In other words, to use TLS with pgBouncer, all connections from a client to pgBouncer and from pgBouncer to PostgreSQL must be over TLS. Effectively, this is “TLS only” mode if connecting via pgBouncer.
In order to deploy pgBouncer with TLS, the following preconditions must be met:
- TLS MUST be enabled within the PostgreSQL cluster.
- pgBouncer and the PostgreSQL MUST share the same certificate authority (CA) bundle.
- You must have a Kubernetes TLS Secret containing the TLS keypair you would like to use for pgBouncer.
You can enable TLS for pgBouncer using the following commands:
- pgo create pgbouncer --tls-secret, where --tls-secret specifies the location of the TLS keypair to use for pgBouncer. You must already have TLS enabled in your PostgreSQL cluster.
- pgo create cluster --pgbouncer --pgbouncer-tls-secret, where --pgbouncer-tls-secret specifies the location of the TLS keypair to use for pgBouncer. You must also specify
This adds an attribute to the pgclusters.crunchydata.com Custom Resource Definition in the pgBouncer section called tlsSecret, which will store the name of the TLS secret to use for pgBouncer.
By default, connections coming into pgBouncer have a PostgreSQL SSL mode of require and connections going into PostgreSQL using
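The two steps described above (create a Kubernetes TLS Secret, then add pgBouncer with --tls-secret) can be sketched as a small shell function. The function name enable_pgbouncer_tls, the Secret name hippo-pgbouncer-tls, the file names, and the pgo namespace default are assumptions for illustration. The KUBECTL and PGO variables are a convenience of this sketch so the commands can be printed instead of executed.

```shell
# Sketch: store a keypair as a Kubernetes TLS Secret, then add pgBouncer with TLS.
# KUBECTL and PGO default to the real binaries; set them to "echo kubectl" and
# "echo pgo" to print the commands instead of running them.
# Usage: enable_pgbouncer_tls <cluster> <secret> <cert-file> <key-file> [namespace]
enable_pgbouncer_tls() {
  local cluster="$1" secret="$2" cert="$3" key="$4" namespace="${5:-pgo}"
  # Step 1: create the TLS Secret that holds the pgBouncer keypair.
  ${KUBECTL:-kubectl} create secret tls "$secret" \
    --cert="$cert" --key="$key" --namespace="$namespace" || return 1
  # Step 2: add pgBouncer to the cluster, pointing it at that Secret.
  ${PGO:-pgo} create pgbouncer "$cluster" \
    --tls-secret="$secret" --namespace="$namespace"
}

# Dry run: prints the two commands without contacting a cluster.
KUBECTL="echo kubectl" PGO="echo pgo" \
  enable_pgbouncer_tls hippo hippo-pgbouncer-tls tls.crt tls.key
```

This assumes TLS is already enabled on the PostgreSQL cluster and that the keypair is signed by the certificate authority the cluster trusts, as the preconditions above require.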
Enable/Disable Metrics Collection for PostgreSQL Cluster
A common case is that one creates a PostgreSQL cluster with the Postgres Operator and forgets to enable it for monitoring with the --metrics flag. Prior to this release, adding the crunchy-postgres-exporter to an already running PostgreSQL cluster presented challenges.
This release introduces the --enable-metrics and --disable-metrics flags for pgo update cluster, which allow monitoring to be enabled or disabled on an already running PostgreSQL cluster. As this involves modifying Deployment templates, this action triggers a rolling update, as described earlier, to limit downtime.
Metrics can also be enabled/disabled using the exporter attribute on the pgclusters.crunchydata.com custom resource.
This release also changes the management of the PostgreSQL user that is used to collect the metrics. Similar to pgBouncer, the PostgreSQL Operator fully manages the credentials for the metrics collection user. The --exporter-rotate-password flag on pgo update cluster can be used to rotate the metric collection user’s credentials.
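As a minimal sketch of how these flags fit together, the shell functions below wrap the pgo update cluster calls named in this section. The function names set_metrics and rotate_exporter_password and the cluster name hippo are hypothetical. The PGO variable is a convenience of this sketch that allows a dry run.

```shell
# PGO defaults to the real binary; set PGO="echo pgo" to print commands instead.

# Usage: set_metrics <cluster> on|off
set_metrics() {
  local cluster="$1" state="$2" flag
  case "$state" in
    on)  flag="--enable-metrics" ;;
    off) flag="--disable-metrics" ;;
    *)   echo "usage: set_metrics <cluster> on|off" >&2; return 2 ;;
  esac
  # Changing the exporter sidecar modifies Deployment templates,
  # so expect a rolling update of the cluster.
  ${PGO:-pgo} update cluster "$cluster" "$flag"
}

# Usage: rotate_exporter_password <cluster>
rotate_exporter_password() {
  ${PGO:-pgo} update cluster "$1" --exporter-rotate-password
}

# Dry run: prints the commands without contacting the Operator.
PGO="echo pgo" set_metrics hippo on
PGO="echo pgo" rotate_exporter_password hippo
```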
Container Image Reduction & Reorganization
Advances in Postgres Operator functionality have allowed for a culling of the number of required container images. For example, functionality that had been broken out into individual container images (e.g. crunchy-pgdump) is now consolidated within the
Renamed container images include:
Removed container images include:
These changes also include overall organization and build performance optimizations around the container suite.
Breaking Changes
- Metrics collection can now be enabled/disabled using the exporter attribute on pgclusters.crunchydata.com. The previous method to do so, involving a label buried within a custom resource, no longer works.
- pgBadger can now be enabled/disabled using the
pgclusters.crunchydata.com. The previous method to do so, involving a label buried within a custom resource, no longer works.
- Several additional labels on the pgclusters.crunchydata.com CRD that had driven behavior have been moved to attributes. These include:
autofail, which is now represented by the
service-type, which is now represented by the
NodeLabelValue, which is now replaced by the
backrest-storage-type, which is now represented with the
pgo create cluster is removed and replaced with the --label flag, which can be specified multiple times. The API endpoint for pgo create cluster is also modified: labels must now be passed in as a set of key-value pairs. Please see the “Features” section for more details.
- The API endpoints for
pgo delete label is modified to accept a set of key/value pairs for the values of the --label flag. The API parameter for this is now called
The pgo upgrade command will properly move any data you have in these labels into the correct attributes. You can read more about how to use the various CRD attributes in the Custom Resources section of the documentation.
usersecretname attributes on the pgclusters.crunchydata.com CRD have been removed. Each of these represented managed Secrets. Additionally, if the managed Secrets are not created at cluster creation time, the Operator will now generate these Secrets.
pgclusters.crunchydata.com has been removed. The Secret for the metrics collection user is now fully managed by the PostgreSQL Operator.
- There are changes to the cluster-deployment.json templates that reside within the pgo-config ConfigMap that could be breaking to those who have customized those templates. This includes removing the opening comma in the exporter.json and removing unneeded match labels on the PostgreSQL cluster Deployment. This is resolved by following the standard upgrade procedure (https://access.crunchydata.com/documentation/postgres-operator/latest/upgrade/), and only affects new clusters and existing clusters that wish to use the enable/disable metric collection feature. The affinity.json entry in the pgo-config ConfigMap has been removed in favor of the updated node affinity support.
- Failovers can no longer be controlled by creating a
- Remove the
pgo-deployer. The metric collection user password is managed by the PostgreSQL Operator.
- Policy creation only supports the method of creating the policy from a file/ConfigMap.
- Any pgBackRest variables of the format PGBACKREST_REPO_ now follow the format PGBACKREST_REPO1_ to be consistent with what pgBackRest expects.
Features
- Monitoring can now be enabled/disabled during the lifetime of a PostgreSQL cluster using the pgo update cluster --enable-metrics and pgo update cluster --disable-metrics flags. This can also be modified directly on a custom resource.
- The Service Type of a PostgreSQL cluster can now be updated during the lifetime of a cluster with pgo update cluster --service-type. This can also be modified directly on a custom resource.
- The Service Type of pgBouncer can now be independently controlled and set with the pgo create pgbouncer and pgo update pgbouncer commands. This can also be modified directly on a custom resource.
- pgBackRest delta restores, which can efficiently restore data as it determines which specific files need to be restored from backup, can now be used as part of the cluster creation method with pgo create cluster --restore-from. For example, if a cluster is deleted as such:
pgo delete cluster hippo --keep-data --keep-backups
It can subsequently be recreated using the delta restore method as such:
pgo create cluster hippo --restore-from=hippo
Passing in the --process-max option to --restore-opts can help speed up the restore process based upon the amount of CPU you have available. If the delta restore fails, the PostgreSQL Operator will attempt to perform a full restore.
- pgo restore will now first attempt a pgBackRest delta restore, which can significantly speed up the restore time for large databases. Passing in the --process-max option to --backup-opts can help speed up the restore process based upon the amount of CPU you have available.
- A pgBackRest backup can now be deleted with pgo delete backup. A backup name must be specified with the --target flag. Please refer to the documentation for how to use this command.
- pgo create cluster now accepts a --label flag that can be used to specify one or more custom labels for a PostgreSQL cluster. This replaces the
pgo delete label can accept a --label flag specified multiple times.
- pgBadger can now be enabled/disabled during the lifetime of a PostgreSQL cluster using the pgo update cluster --enable-pgbadger and pgo update cluster --disable-pgbadger flags. This can also be modified directly on a custom resource.
- Managed PostgreSQL system accounts can now have their credentials set and rotated with pgo update user by including the --set-system-account-password flag. Suggested by (@srinathganesh).
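The delta restore workflow from the list above (delete a cluster while keeping its data and backups, then recreate it with --restore-from) can be sketched as a shell function for the recreate step. The function name recreate_from_backup and the process-max value of 4 are hypothetical. The PGO variable is a convenience of this sketch that allows a dry run.

```shell
# PGO defaults to the real binary; set PGO="echo pgo" to print commands instead.
# Assumes the cluster was previously removed with:
#   pgo delete cluster <cluster> --keep-data --keep-backups
# Usage: recreate_from_backup <cluster> [process-max]
recreate_from_backup() {
  local cluster="$1" process_max="${2:-}"
  if [ -n "$process_max" ]; then
    # Pass --process-max through to pgBackRest to parallelize the restore.
    ${PGO:-pgo} create cluster "$cluster" --restore-from="$cluster" \
      --restore-opts="--process-max=${process_max}"
  else
    ${PGO:-pgo} create cluster "$cluster" --restore-from="$cluster"
  fi
}

# Dry run: prints the command without contacting the Operator.
PGO="echo pgo" recreate_from_backup hippo 4
```

A sensible process-max value depends on the CPU available to the restore job, so the number here is only a placeholder.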
Changes
- If not provided at installation time, the Operator will now generate its own
- The local storage type option for pgBackRest is deprecated in favor of posix, which matches the pgBackRest term. local will still continue to work for backwards compatibility purposes.
- PostgreSQL clusters using multi-repository (e.g.
s3 at the same time) archiving will now, by default, take backups to both repositories when pgo backup is used without additional options.
- If not provided at cluster creation time, the Operator will now generate the PostgreSQL user Secrets required for bootstrap, including the superuser (postgres), the replication user (primaryuser), and the standard user.
- crunchy-postgres-exporter now exposes several pgMonitor metrics related to
- When using the
pgo create cluster to create a new PostgreSQL cluster, the cluster bootstrap Job is now automatically removed if it completes successfully.
- The pgo failover command now works without specifying a target: the candidate to fail over to will be automatically selected.
- For clusters that have no healthy instances, pgo failover can now force a promotion using the
--target flag must also be specified when using
- If a predefined custom ConfigMap for a PostgreSQL cluster (-pgha-config) is detected at bootstrap time, the Operator will ensure it properly initializes the cluster.
- Deleting a pgclusters.crunchydata.com custom resource will now properly delete a PostgreSQL cluster. If the pgclusters.crunchydata.com custom resource has the annotations
keep-data, it will keep the backups or keep the PostgreSQL data directory respectively. Reported by Leo Khomenko (@lkhomenk).
- PostgreSQL JIT compilation is explicitly disabled on new cluster creation. This prevents a memory leak that has been observed on queries coming from the metrics exporter.
- The credentials for the metrics collection user are now available with pgo show user --show-system-accounts.
- The default user for executing scheduled SQL policies is now the Postgres superuser, instead of the replication user.
- Add the
pgo upgrade. The mechanism to disable the prompt verification was already in place, but the flag was not exposed. Reported by (@devopsevd).
- Remove certain characters that cause issues in shell environments from consideration when using the random password generator, which is used to create default passwords or with
- Allow for the --link-map attribute for a pgBackRest option, which can help with the restore of an existing cluster to a new cluster that adds an external WAL volume.
- Remove the long deprecated archivestorage attribute from the pgclusters.crunchydata.com custom resource definition. As this attribute is not used at all, this should have no effect.
- The ArchiveMode parameter is now removed from the configuration. This had been fully deprecated for a while.
- Add an explicit size limit of
pgBadger ephemeral storage mount. Additionally, remove the ephemeral storage mount for the /recover mount point as that is not used. Reported by Pierre-Marie Petit (@pmpetit).
- New PostgreSQL Operator deployments will now generate ECDSA keys (P-256, SHA384) for use by the API server.
Fixes
- Ensure custom annotations are applied if the annotations are supposed to be applied globally but the cluster does not have a pgBouncer Deployment.
- Fix issue with UBI 8 / CentOS 8 when running a pgBackRest bootstrap or restore job, where duplicate “repo types” could be set. Specifically, this ensures the name of the repo type is set via the PGBACKREST_REPO1_TYPE environment variable. Reported by Alec Rooney (@alrooney).
- Fix issue where pgo test would indicate every Service was a replica if the cluster name contained the word replica in it. Reported by Jose Joye (@jose-joye).
- Do not consider Evicted Pods as part of pgo test. This eliminates a behavior where faux primaries are considered as part of pgo test. Reported by Dennis Jacobfeuerborn (@dennisjac).
- Fix pgo df so that it does not fail in the event it tries to execute a command within a dangling container from the bootstrap process when pgo create cluster --restore-from is used. Reported by Ignacio J.Ortega (@IJOL).
- pgo df will now only attempt to execute in running Pods, i.e. it does not attempt to run in evicted Pods. Reported by (@kseswar).
- Ensure the sync replication ConfigMap is removed when a cluster is deleted.
- Fix crash in shutdown logic when attempting to shut down a cluster where no primaries exist. Reported by Jeffrey den Drijver (@JeffreyDD).
- Fix syntax in recovery check command which could lead to failures when manually promoting a standby cluster. Reported by (@SockenSalat).
- Fix potential race condition that could lead to a crash in the Operator boot when an error is issued around loading the pgo-config ConfigMap. Reported by Aleksander Roszig (@AleksanderRoszig).
- Do not trigger a backup if a standby cluster fails over. Reported by (@aprilito1965).
- Ensure pgBouncer Secret is created when adding it to a standby cluster.
- General improvements to initialization of a standby cluster.
- Remove legacy defaultMode setting on the volume instructions for the pgBackRest repo Secret as the readOnly setting is used on the mount itself. Reported by (@szhang1).
- Ensure proper label parsing based on Kubernetes rules and that it is consistently applied across all functionality that uses labels. Reported by José Joye (@jose-joye).
- The logger no longer defaults to using a log level of
- Autofailover is no longer disabled when an rmdata Job is run, enabling a clean database shutdown process when deleting a PostgreSQL cluster.
- Allow for Restart API server permission to be explicitly set. Reported by Aleksander Roszig (@AleksanderRoszig).
pgo-target permissions to match expectations for modern Kubernetes versions.
- Major upgrade container now includes references for
- During a major upgrade, ensure permissions are correct on the old data directory before running
- The metrics stack installer is fixed to work in environments that may not have connectivity to the Internet (“air gapped”). Reported by (@eliranw).