4.6.0
Crunchy Data announces the release of the PostgreSQL Operator 4.6.0 on January 22, 2021. You can get started with the PostgreSQL Operator with the following commands:
kubectl create namespace pgo
kubectl apply -f https://raw.githubusercontent.com/CrunchyData/postgres-operator/v4.6.0/installers/kubectl/postgres-operator.yml
The PostgreSQL Operator is released in conjunction with the Crunchy Container Suite.
The PostgreSQL Operator 4.6.0 release includes the following software version upgrades:
- pgBackRest is now at version 2.31
- pgnodemx is now at version 1.0.3
- Patroni is now at version 2.0.1
- pgBadger is now at 11.4
The monitoring stack for the PostgreSQL Operator uses upstream components as opposed to repackaging them. These are specified as part of the PostgreSQL Operator Installer. We have tested this release with the following versions of each component:
- Prometheus: 2.24.0
- Grafana: 6.7.5
- Alertmanager: 0.21.0
This release of the PostgreSQL Operator drops support for PostgreSQL 9.5, which goes EOL in February 2021.
PostgreSQL Operator is tested against Kubernetes 1.17 - 1.20, OpenShift 3.11, OpenShift 4.4+, Google Kubernetes Engine (GKE), Amazon EKS, Microsoft AKS, and VMware Enterprise PKS 1.3+, and works on other Kubernetes distributions as well.
Major Features
Rolling Updates
During the lifecycle of a PostgreSQL cluster, there are certain events that may require a planned restart, such as an update to a “restart required” PostgreSQL configuration setting (e.g. shared_buffers) or a change to a Kubernetes Deployment template (e.g. changing the memory request). Restarts can be disruptive in a high availability deployment, which is why many setups employ a “rolling update” strategy (aka a “rolling restart”) to minimize or eliminate downtime during a planned restart.
Because PostgreSQL is a stateful application, a simple rolling restart strategy will not work: PostgreSQL needs to ensure that there is a primary available that can accept reads and writes. This requires following a method that will minimize the amount of downtime when the primary is taken offline for a restart.
This release introduces a mechanism for the PostgreSQL Operator to perform rolling updates implicitly on certain operations that change the Deployment templates and explicitly through the pgo restart command with the --rolling flag. Some of the operations that will trigger a rolling update include:
- Memory resource adjustments
- CPU resource adjustments
- Custom annotation changes
- Tablespace additions
- Adding/removing the metrics sidecar to a PostgreSQL cluster
Please reference the documentation for more details on rolling updates.
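As a rough illustration of why ordering matters for a stateful application, the following shell sketch prints one possible rolling restart plan: replicas first, then a switchover, then the former primary. This is an assumption about a typical sequence and not a description of the Operator's exact implementation; the function name rolling_update_plan and the instance names are hypothetical.

```shell
# Hypothetical sketch: print a rolling restart plan for one primary and
# zero or more replicas. Not the Operator's actual implementation.
# Usage: rolling_update_plan PRIMARY [REPLICA...]
rolling_update_plan() {
  plan_primary=$1
  shift
  # Restart replicas one at a time while the primary keeps serving traffic.
  for plan_replica in "$@"; do
    echo "restart $plan_replica"
  done
  # With at least one updated replica, hand over the primary role first so
  # the old primary can restart without a long write outage.
  if [ "$#" -gt 0 ]; then
    echo "switchover $plan_primary -> $1"
  fi
  echo "restart $plan_primary"
}

rolling_update_plan hippo-a hippo-b hippo-c
```

With no replicas the plan degrades to a single restart of the primary, which is the case where some downtime cannot be avoided.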
Pod Tolerations
Kubernetes Tolerations can help with the scheduling of Pods to appropriate Nodes based upon the taint values of said Nodes. For example, a Kubernetes administrator may set taints on Nodes to restrict scheduling to just the database workload, and as such, tolerations must be assigned to Pods to ensure they can actually be scheduled on those Nodes.
This release introduces the ability to assign tolerations to PostgreSQL clusters managed by the PostgreSQL Operator. Tolerations can be assigned to every instance in the cluster via the tolerations attribute on a pgclusters.crunchydata.com custom resource, or to individual instances using the tolerations attribute on a pgreplicas.crunchydata.com custom resource.
Both the pgo create cluster and pgo scale commands support the --toleration flag, which can be used to add one or more tolerations to a cluster. Values accepted by the --toleration flag use the following format:
rule:Effect
where a rule can represent existence (e.g. key) or equality (key=value) and Effect is one of NoSchedule, PreferNoSchedule, or NoExecute, e.g.:
pgo create cluster hippo \
--toleration=ssd:NoSchedule \
--toleration=zone=east:NoSchedule
Tolerations can also be added to and removed from an existing cluster using the pgo update cluster command, where a trailing - on a value removes that toleration, e.g.:
pgo update cluster hippo \
--toleration=zone=west:NoSchedule \
--toleration=zone=east:NoSchedule-
or by modifying the pgclusters.crunchydata.com custom resource directly.
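To make the rule:Effect format concrete, here is a minimal shell sketch that splits a --toleration value into the standard fields of a Kubernetes toleration (key, operator, value, effect). The helper name parse_toleration and its output format are hypothetical and not part of pgo; how the Operator parses the flag internally is not covered here.

```shell
# Hypothetical helper (not part of pgo): split a --toleration value of the
# form rule:Effect into the fields of a Kubernetes toleration.
# Variables are global because POSIX sh has no "local".
parse_toleration() {
  tol_spec=$1
  tol_action=add
  # A trailing "-" marks a toleration to remove (pgo update cluster).
  case $tol_spec in
    *-) tol_action=remove; tol_spec=${tol_spec%-} ;;
  esac
  # The value must contain a ":" separating the rule from the effect.
  case $tol_spec in
    *:*) ;;
    *) echo "expected rule:Effect, got: $1" >&2; return 1 ;;
  esac
  tol_rule=${tol_spec%:*}
  tol_effect=${tol_spec##*:}
  case $tol_effect in
    NoSchedule|PreferNoSchedule|NoExecute) ;;
    *) echo "unknown effect: $tol_effect" >&2; return 1 ;;
  esac
  # key=value is an equality rule; a bare key is an existence rule.
  case $tol_rule in
    *=*) echo "action=$tol_action key=${tol_rule%%=*} operator=Equal value=${tol_rule#*=} effect=$tol_effect" ;;
    *)   echo "action=$tol_action key=$tol_rule operator=Exists effect=$tol_effect" ;;
  esac
}

parse_toleration ssd:NoSchedule
parse_toleration zone=east:NoSchedule-
```

The existence form maps to the Exists operator and the equality form to the Equal operator, which are the two operators Kubernetes defines for tolerations.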
For more information on how tolerations work, please refer to the Kubernetes documentation.
Node Affinity Enhancements
Node affinity has been a feature of the PostgreSQL Operator for a long time but has received some significant improvements in this release.
It is now possible to control the node affinity across an entire PostgreSQL cluster as well as individual PostgreSQL instances from a custom resource attribute on the pgclusters.crunchydata.com and pgreplicas.crunchydata.com CRDs. These attributes use the standard Kubernetes specifications for node affinity and should be familiar to users who have had to set this in applications.
Additionally, this release adds support for both “preferred” and “required” node affinity definitions. Previously, one could achieve required node affinity by modifying a template in the pgo-config ConfigMap, but this release makes this process more straightforward.
This release introduces the --node-affinity-type flag for the pgo create cluster, pgo scale, and pgo restore commands that allows one to specify the node affinity type for PostgreSQL clusters and instances. The --node-affinity-type flag accepts values of preferred (default) and required. Each instance in a PostgreSQL cluster will inherit its node affinity type from the cluster (pgo create cluster) itself, but the type of an individual instance (pgo scale) will supersede that value.
The --node-affinity-type flag must be combined with the --node-label flag.
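The precedence described above (instance value over cluster value, with preferred as the default) can be sketched in a few lines of shell. The function name effective_affinity_type is hypothetical and only illustrates the inheritance rule; it is not how the Operator is implemented.

```shell
# Hypothetical sketch of the inheritance rule for the node affinity type.
# $1: type set on the cluster (may be empty)
# $2: type set on an individual instance (may be empty)
effective_affinity_type() {
  if [ -n "${2:-}" ]; then
    # An instance-level value supersedes the cluster-level value.
    echo "$2"
  elif [ -n "${1:-}" ]; then
    # Otherwise the instance inherits the cluster-level value.
    echo "$1"
  else
    # With nothing set, the documented default applies.
    echo preferred
  fi
}

effective_affinity_type required ""
effective_affinity_type required preferred
```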
TLS for pgBouncer
Since 4.3.0, the PostgreSQL Operator has had support for TLS connections to PostgreSQL clusters and an improved integration with pgBouncer, used for connection pooling and state management. However, the integration with pgBouncer did not support TLS directly: it could only be achieved by modifying the pgBouncer Deployment template.
This release brings TLS support for pgBouncer to the PostgreSQL Operator, allowing for communication over TLS between a client and pgBouncer, and between pgBouncer and a PostgreSQL server. In other words, the following is now supported:
Client <= TLS => pgBouncer <= TLS => PostgreSQL
To use TLS with pgBouncer, all connections from a client to pgBouncer and from pgBouncer to PostgreSQL must be over TLS. Effectively, this is “TLS only” mode if connecting via pgBouncer.
In order to deploy pgBouncer with TLS, the following preconditions must be met:
- TLS MUST be enabled within the PostgreSQL cluster.
- pgBouncer and the PostgreSQL cluster MUST share the same certificate authority (CA) bundle.
You must have a Kubernetes TLS Secret containing the TLS keypair you would like to use for pgBouncer.
You can enable TLS for pgBouncer using the following commands:
- pgo create pgbouncer --tls-secret, where --tls-secret specifies the location of the TLS keypair to use for pgBouncer. You must already have TLS enabled in your PostgreSQL cluster.
- pgo create cluster --pgbouncer --pgbouncer-tls-secret, where --pgbouncer-tls-secret specifies the location of the TLS keypair to use for pgBouncer. You must also specify --server-tls-secret and --server-ca-secret.
This adds an attribute to the pgclusters.crunchydata.com Custom Resource Definition in the pgBouncer section called tlsSecret, which will store the name of the TLS secret to use for pgBouncer.
By default, connections coming into pgBouncer have a PostgreSQL SSL mode of require and connections going into PostgreSQL use verify-ca.
Enable/Disable Metrics Collection for PostgreSQL Cluster
A common case is that one creates a PostgreSQL cluster with the Postgres Operator and forgets to enable it for monitoring with the --metrics flag. Prior to this release, adding the crunchy-postgres-exporter to an already running PostgreSQL cluster presented challenges.
This release introduces the --enable-metrics and --disable-metrics flags to the pgo update cluster command, which allow monitoring to be enabled or disabled on an already running PostgreSQL cluster. As this involves modifying Deployment templates, this action triggers a rolling update, as described in the Rolling Updates section above, to limit downtime.
Metrics can also be enabled/disabled using the exporter attribute on the pgclusters.crunchydata.com custom resource.
This release also changes the management of the PostgreSQL user that is used to collect the metrics. Similar to pgBouncer, the PostgreSQL Operator fully manages the credentials for the metrics collection user. The --exporter-rotate-password flag on pgo update cluster can be used to rotate the metric collection user’s credentials.
Container Image Reduction & Reorganization
Advances in Postgres Operator functionality have allowed for a culling of the number of required container images. For example, functionality that had been broken out into individual container images (e.g. crunchy-pgdump) is now consolidated within the crunchy-postgres and crunchy-postgres-ha containers.
Renamed container images include:
- pgo-backrest => crunchy-pgbackrest
- pgo-backrest-repo => crunchy-pgbackrest-repo
Removed container images include:
- crunchy-admin
- crunchy-backrest-restore
- crunchy-backup
- crunchy-pgbasebackup-restore
- crunchy-pgbench
- crunchy-pgdump
- crunchy-pgrestore
- pgo-sqlrunner
- pgo-backrest-repo-sync
- pgo-backrest-restore
These changes also include overall organization and build performance optimizations around the container suite.
Breaking Changes
- Metrics collection can now be enabled/disabled using the exporter attribute on pgclusters.crunchydata.com. The previous method to do so, involving a label buried within a custom resource, no longer works.
- pgBadger can now be enabled/disabled using the pgBadger attribute on pgclusters.crunchydata.com. The previous method to do so, involving a label buried within a custom resource, no longer works.
- Several additional labels on the pgclusters.crunchydata.com CRD that had driven behavior have been moved to attributes. These include:
  - autofail, which is now represented by the disableAutofail attribute.
  - service-type, which is now represented by the serviceType attribute.
  - NodeLabelKey/NodeLabelValue, which is now replaced by the nodeAffinity attribute.
  - backrest-storage-type, which is now represented with the backrestStorageTypes attribute.
- The --labels flag on pgo create cluster is removed and replaced with the --label flag, which can be specified multiple times. The API endpoint for pgo create cluster is also modified: labels must now be passed in as a set of key-value pairs. Please see the “Features” section for more details.
- The API endpoints for pgo label and pgo delete label are modified to accept a set of key/value pairs for the values of the --label flag. The API parameter for this is now called Labels.
  The pgo upgrade command will properly move any data you have in these labels into the correct attributes. You can read more about how to use the various CRD attributes in the Custom Resources section of the documentation.
- The rootsecretname, primarysecretname, and usersecretname attributes on the pgclusters.crunchydata.com CRD have been removed. Each of these represented managed Secrets. Additionally, if the managed Secrets are not created at cluster creation time, the Operator will now generate these Secrets.
- The collectSecretName attribute on pgclusters.crunchydata.com has been removed. The Secret for the metrics collection user is now fully managed by the PostgreSQL Operator.
- There are changes to the exporter.json and cluster-deployment.json templates that reside within the pgo-config ConfigMap that could be breaking to those who have customized those templates. This includes removing the opening comma in the exporter.json and removing unneeded match labels on the PostgreSQL cluster Deployment. This is resolved by following the standard upgrade procedure (https://access.crunchydata.com/documentation/postgres-operator/latest/upgrade/), and only affects new clusters and existing clusters that wish to use the enable/disable metric collection feature. The affinity.json entry in the pgo-config ConfigMap has been removed in favor of the updated node affinity support.
- Failovers can no longer be controlled by creating a pgtasks.crunchydata.com custom resource.
- Remove the PgMonitorPassword attribute from pgo-deployer. The metric collection user password is managed by the PostgreSQL Operator.
- Policy creation only supports the method of creating the policy from a file/ConfigMap.
- Any pgBackRest variables of the format PGBACKREST_REPO_ now follow the format PGBACKREST_REPO1_ to be consistent with what pgBackRest expects.
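For customized configuration that still uses the old pgBackRest variable prefix, a rename along the following lines may help. This is a minimal sketch that assumes the variables live in a plain NAME=value listing read from standard input; the function name rename_repo_vars and the sample path are hypothetical.

```shell
# Hypothetical helper: rewrite NAME=value lines read from stdin so that
# variables starting with PGBACKREST_REPO_ use the PGBACKREST_REPO1_ prefix.
# Names that already use PGBACKREST_REPO1_ do not match and pass through.
rename_repo_vars() {
  sed 's/^PGBACKREST_REPO_/PGBACKREST_REPO1_/'
}

printf 'PGBACKREST_REPO_TYPE=s3\nPGBACKREST_REPO1_PATH=/backrestrepo\n' | rename_repo_vars
```

Because the pattern is anchored to the start of the line and requires an underscore directly after REPO, running the rename twice leaves already converted names unchanged.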
Features
- Monitoring can now be enabled/disabled during the lifetime of a PostgreSQL cluster using the pgo update cluster --enable-metrics and pgo update cluster --disable-metrics flags. This can also be modified directly on a custom resource.
- The Service Type of a PostgreSQL cluster can now be updated during the lifetime of a cluster with pgo update cluster --service-type. This can also be modified directly on a custom resource.
- The Service Type of pgBouncer can now be independently controlled and set with the --service-type flag on pgo create pgbouncer and pgo update pgbouncer. This can also be modified directly on a custom resource.
- pgBackRest delta restores, which can efficiently restore data by determining which specific files need to be restored from backup, can now be used as part of the cluster creation method with pgo create cluster --restore-from. For example, if a cluster is deleted as such:
  pgo delete cluster hippo --keep-data --keep-backups
  It can subsequently be recreated using the delta restore method as such:
  pgo create cluster hippo --restore-from=hippo
  Passing in the --process-max option to --restore-opts can help speed up the restore process based upon the amount of CPU you have available. If the delta restore fails, the PostgreSQL Operator will attempt to perform a full restore.
- pgo restore will now first attempt a pgBackRest delta restore, which can significantly speed up the restore time for large databases. Passing in the --process-max option to --backup-opts can help speed up the restore process based upon the amount of CPU you have available.
- A pgBackRest backup can now be deleted with pgo delete backup. A backup name must be specified with the --target flag. Please refer to the documentation for how to use this command.
- pgo create cluster now accepts a --label flag that can be used to specify one or more custom labels for a PostgreSQL cluster. This replaces the --labels flag.
- pgo label and pgo delete label can accept a --label flag specified multiple times.
- pgBadger can now be enabled/disabled during the lifetime of a PostgreSQL cluster using the pgo update cluster --enable-pgbadger and pgo update cluster --disable-pgbadger flags. This can also be modified directly on a custom resource.
- Managed PostgreSQL system accounts can now have their credentials set and rotated with pgo update user by including the --set-system-account-password flag. Suggested by (@srinathganesh).
Changes
- If not provided at installation time, the Operator will now generate its own pgo-backrest-repo-config Secret.
- The local storage type option for pgBackRest is deprecated in favor of posix, which matches the pgBackRest term. local will still continue to work for backwards compatibility purposes.
- PostgreSQL clusters using multi-repository (e.g. posix + s3 at the same time) archiving will now, by default, take backups to both repositories when pgo backup is used without additional options.
- If not provided at cluster creation time, the Operator will now generate the PostgreSQL user Secrets required for bootstrap, including the superuser (postgres), the replication user (primaryuser), and the standard user.
- crunchy-postgres-exporter now exposes several pgMonitor metrics related to pg_stat_statements.
- When using the --restore-from option on pgo create cluster to create a new PostgreSQL cluster, the cluster bootstrap Job is now automatically removed if it completes successfully.
- The pgo failover command now works without specifying a target: the candidate to fail over to will be automatically selected.
- For clusters that have no healthy instances, pgo failover can now force a promotion using the --force flag. A --target flag must also be specified when using --force.
- If a predefined custom ConfigMap for a PostgreSQL cluster (-pgha-config) is detected at bootstrap time, the Operator will ensure it properly initializes the cluster.
- Deleting a pgclusters.crunchydata.com custom resource will now properly delete a PostgreSQL cluster. If the pgclusters.crunchydata.com custom resource has the annotations keep-backups or keep-data, it will keep the backups or keep the PostgreSQL data directory respectively. Reported by Leo Khomenko (@lkhomenk).
- PostgreSQL JIT compilation is explicitly disabled on new cluster creation. This prevents a memory leak that has been observed on queries coming from the metrics exporter.
- The credentials for the metrics collection user are now available with pgo show user --show-system-accounts.
- The default user for executing scheduled SQL policies is now the Postgres superuser, instead of the replication user.
- Add the --no-prompt flag to pgo upgrade. The mechanism to disable the prompt verification was already in place, but the flag was not exposed. Reported by (@devopsevd).
- Remove certain characters that cause issues in shell environments from consideration when using the random password generator, which is used to create default passwords or with --rotate-password.
- Allow for the --link-map attribute for a pgBackRest option, which can help with the restore of an existing cluster to a new cluster that adds an external WAL volume.
- Remove the long deprecated archivestorage attribute from the pgclusters.crunchydata.com custom resource definition. As this attribute is not used at all, this should have no effect.
- The ArchiveMode parameter is now removed from the configuration. This had been fully deprecated for a while.
- Add an explicit size limit of 64Mi for the pgBadger ephemeral storage mount. Additionally, remove the ephemeral storage mount for the /recover mount point as that is not used. Reported by Pierre-Marie Petit (@pmpetit).
- New PostgreSQL Operator deployments will now generate ECDSA keys (P-256, SHA384) for use by the API server.
Fixes
- Ensure custom annotations are applied if the annotations are supposed to be applied globally but the cluster does not have a pgBouncer Deployment.
- Fix issue with UBI 8 / CentOS 8 when running a pgBackRest bootstrap or restore job, where duplicate “repo types” could be set. Specifically, this ensures the name of the repo type is set via the PGBACKREST_REPO1_TYPE environmental variable. Reported by Alec Rooney (@alrooney).
- Fix issue where pgo test would indicate every Service was a replica if the cluster name contained the word replica in it. Reported by Jose Joye (@jose-joye).
- Do not consider Evicted Pods as part of pgo test. This eliminates a behavior where faux primaries are considered as part of pgo test. Reported by Dennis Jacobfeuerborn (@dennisjac).
- Fix pgo df to not fail in the event it tries to execute a command within a dangling container from the bootstrap process when pgo create cluster --restore-from is used. Reported by Ignacio J.Ortega (@IJOL).
- pgo df will now only attempt to execute in running Pods, i.e. it does not attempt to run in evicted Pods. Reported by (@kseswar).
- Ensure the sync replication ConfigMap is removed when a cluster is deleted.
- Fix crash in shutdown logic when attempting to shut down a cluster where no primaries exist. Reported by Jeffrey den Drijver (@JeffreyDD).
- Fix syntax in recovery check command which could lead to failures when manually promoting a standby cluster. Reported by (@SockenSalat).
- Fix potential race condition that could lead to a crash in the Operator boot when an error is issued around loading the pgo-config ConfigMap. Reported by Aleksander Roszig (@AleksanderRoszig).
- Do not trigger a backup if a standby cluster fails over. Reported by (@aprilito1965).
- Ensure pgBouncer Secret is created when adding it to a standby cluster.
- General improvements to initialization of a standby cluster.
- Remove legacy defaultMode setting on the volume instructions for the pgBackRest repo Secret as the readOnly setting is used on the mount itself. Reported by (@szhang1).
- Ensure proper label parsing based on Kubernetes rules and that it is consistently applied across all functionality that uses labels. Reported by José Joye (@jose-joye).
- The logger no longer defaults to using a log level of DEBUG.
- Autofailover is no longer disabled when an rmdata Job is run, enabling a clean database shutdown process when deleting a PostgreSQL cluster.
- Allow for Restart API server permission to be explicitly set. Reported by Aleksander Roszig (@AleksanderRoszig).
- Update pgo-target permissions to match expectations for modern Kubernetes versions.
- Major upgrade container now includes references for pgnodemx.
- During a major upgrade, ensure permissions are correct on the old data directory before running pg_upgrade.
- The metrics stack installer is fixed to work in environments that may not have connectivity to the Internet (“air gapped”). Reported by (@eliranw).