Customize a Postgres Cluster
Postgres is known for its ease of customization; PGO helps you to roll out changes efficiently and without disruption. Let's see how we can easily tweak our Postgres configuration.
Custom Postgres Configuration
Part of the trick of managing multiple instances in a Postgres cluster is ensuring all of the configuration changes are propagated to each of them. This is where PGO helps: when you make a Postgres configuration change for a cluster, PGO will apply it to all of the Postgres instances.
For example, let's say we wanted to tweak the Postgres settings max_parallel_workers
, max_worker_processes
, shared_buffers
, and work_mem
while also requiring SCRAM-SHA-256 as the authentication method for most connections. We can do this in the spec.patroni.dynamicConfiguration
section and the changes will be applied to all instances. Here is an example updated manifest that tweaks those settings:
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
name: hippo
spec:
postgresVersion: 15
instances:
- name: instance1
replicas: 2
dataVolumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 1Gi
backups:
pgbackrest:
repos:
- name: repo1
volume:
volumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 1Gi
patroni:
dynamicConfiguration:
postgresql:
parameters:
max_parallel_workers: 2
max_worker_processes: 2
shared_buffers: 1GB
work_mem: 2MB
pg_hba:
- "hostssl all all all scram-sha-256"
In particular, we added the following to spec
:
patroni:
dynamicConfiguration:
postgresql:
parameters:
max_parallel_workers: 2
max_worker_processes: 2
shared_buffers: 1GB
work_mem: 2MB
Apply these updates to your Postgres cluster with the following command:
kubectl apply -k kustomize/postgres
PGO will go and apply these settings, restarting each Postgres instance when necessary. You can verify that the changes are present using the Postgres SHOW
command, e.g.
SHOW work_mem;
should yield something similar to:
work_mem
----------
2MB
Customize TLS
All connections in PGO use TLS to encrypt communication between components. PGO sets up a PKI and certificate authority (CA) that allow you create verifiable endpoints. However, you may want to bring a different TLS infrastructure based upon your organizational requirements. The good news: PGO lets you do this!
If you want to use the TLS infrastructure that PGO provides, you can skip the rest of this section and move on to learning how to add custom labels.
How to Customize TLS
There are a few different TLS endpoints that can be customized for PGO, including those of the Postgres cluster and controlling how Postgres instances authenticate with each other. Let's look at how we can customize TLS by defining
- a
spec.customTLSSecret
, used to both identify the cluster and encrypt communications - a
spec.customReplicationTLSSecret
, used for replication authentication
(For more information on the spec.customTLSSecret
and spec.customReplicationTLSSecret
fields, see the PostgresCluster CRD
.)
To customize the TLS for a Postgres cluster, you will need to create two Secrets in the Namespace of your Postgres cluster. One of these Secrets will be the customTLSSecret
and the other will be the customReplicationTLSSecret
. Both secrets contain a TLS key (tls.key
), TLS certificate (tls.crt
) and CA certificate (ca.crt
) to use.
Note: If spec.customTLSSecret
is provided you must also provide spec.customReplicationTLSSecret
and both must contain the same ca.crt
.
The custom TLS and custom replication TLS Secrets should contain the following fields (though see below for a workaround if you cannot control the field names of the Secret's data
):
data:
ca.crt: $VALUE
tls.crt: $VALUE
tls.key: $VALUE
For example, if you have files named ca.crt
, hippo.key
, and hippo.crt
stored on your local machine, you could run the following command to create a Secret from those files:
kubectl create secret generic -n postgres-operator hippo-cluster.tls --from-file=ca.crt=ca.crt --from-file=tls.key=hippo.key --from-file=tls.crt=hippo.crt
After you create the Secrets, you can specify the custom TLS Secret in your postgrescluster.postgres-operator.crunchydata.com
custom resource. For example, if you created a hippo-cluster.tls
Secret and a hippo-replication.tls
Secret, you would add them to your Postgres cluster:
spec:
customTLSSecret:
name: hippo-cluster.tls
customReplicationTLSSecret:
name: hippo-replication.tls
If you're unable to control the key-value pairs in the Secret, you can create a mapping to tell the Postgres Operator what key holds the expected value. That would look similar to this:
spec:
customTLSSecret:
name: hippo.tls
items:
- key: <tls.crt key in the referenced hippo.tls Secret>
path: tls.crt
- key: <tls.key key in the referenced hippo.tls Secret>
path: tls.key
- key: <ca.crt key in the referenced hippo.tls Secret>
path: ca.crt
For instance, if the hippo.tls
Secret had the tls.crt
in a key named hippo-tls.crt
, the tls.key
in a key named hippo-tls.key
, and the ca.crt
in a key named hippo-ca.crt
, then your mapping would look like:
spec:
customTLSSecret:
name: hippo.tls
items:
- key: hippo-tls.crt
path: tls.crt
- key: hippo-tls.key
path: tls.key
- key: hippo-ca.crt
path: ca.crt
Note: Although the custom TLS and custom replication TLS Secrets share the same ca.crt
, they do not share the same tls.crt
:
- Your
spec.customTLSSecret
TLS certificate should have a Common Name (CN) setting that matches the primary Service name. This is the name of the cluster suffixed with-primary
. For example, for ourhippo
cluster this would behippo-primary
. - Your
spec.customReplicationTLSSecret
TLS certificate should have a Common Name (CN) setting that matches_crunchyrepl
, which is the preset replication user.
As with the other changes, you can roll out the TLS customizations with kubectl apply
.
Labels
There are several ways to add your own custom Kubernetes Labels to your Postgres cluster.
- Cluster: You can apply labels to any PGO managed object in a cluster by editing the
spec.metadata.labels
section of the custom resource. - Postgres: You can apply labels to a Postgres instance set and its objects by editing
spec.instances.metadata.labels
. - pgBackRest: You can apply labels to pgBackRest and its objects by editing
postgresclusters.spec.backups.pgbackrest.metadata.labels
. - PgBouncer: You can apply labels to PgBouncer connection pooling instances by editing
spec.proxy.pgBouncer.metadata.labels
.
Annotations
There are several ways to add your own custom Kubernetes Annotations to your Postgres cluster.
- Cluster: You can apply annotations to any PGO managed object in a cluster by editing the
spec.metadata.annotations
section of the custom resource. - Postgres: You can apply annotations to a Postgres instance set and its objects by editing
spec.instances.metadata.annotations
. - pgBackRest: You can apply annotations to pgBackRest and its objects by editing
spec.backups.pgbackrest.metadata.annotations
. - PgBouncer: You can apply annotations to PgBouncer connection pooling instances by editing
spec.proxy.pgBouncer.metadata.annotations
.
Pod Priority Classes
PGO allows you to use pod priority classes to indicate the relative importance of a pod by setting a priorityClassName
field on your Postgres cluster. This can be done as follows:
- Instances: Priority is defined per instance set and is applied to all Pods in that instance set by editing the
spec.instances.priorityClassName
section of the custom resource. - Dedicated Repo Host: Priority defined under the repoHost section of the spec is applied to the dedicated repo host by editing the
spec.backups.pgbackrest.repoHost.priorityClassName
section of the custom resource. - PgBouncer: Priority is defined under the pgBouncer section of the spec and will apply to all PgBouncer Pods by editing the
spec.proxy.pgBouncer.priorityClassName
section of the custom resource. - Backup (manual and scheduled): Priority is defined under the
spec.backups.pgbackrest.jobs.priorityClassName
section and applies that priority to all pgBackRest backup Jobs (manual and scheduled). - Restore (data source or in-place): Priority is defined for either a "data source" restore or an in-place restore by editing the
spec.dataSource.postgresCluster.priorityClassName
section of the custom resource. - Data Migration: The priority defined for the first instance set in the spec (array position 0) is used for the PGDATA and WAL migration Jobs. The pgBackRest repo migration Job will use the priority class applied to the repoHost.
Separate WAL PVCs
PostgreSQL commits transactions by storing changes in its Write-Ahead Log (WAL). Because the way WAL files are accessed and utilized often differs from that of data files, and in high-performance situations, it can desirable to put WAL files on separate storage volume. With PGO, this can be done by adding the walVolumeClaimSpec
block to your desired instance in your PostgresCluster spec, either when your cluster is created or anytime thereafter:
spec:
instances:
- name: instance
walVolumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 1Gi
This volume can be removed later by removing the walVolumeClaimSpec
section from the instance. Note that when changing the WAL directory, care is taken so as not to lose any WAL files. PGO only deletes the PVC once there are no longer any WAL files on the previously configured volume.
Custom Sidecar Containers
PGO allows you to configure custom sidecar Containers for your PostgreSQL instance and pgBouncer Pods.
To use the custom sidecar features, you will need to enable them via the PGO feature gate.
PGO feature gates are enabled by setting the PGO_FEATURE_GATES
environment variable on the PGO Deployment. For a feature named 'FeatureName', that would look like
PGO_FEATURE_GATES="FeatureName=true"
Please note that it is possible to enable more than one feature at a time as this variable accepts a comma delimited list, for example:
PGO_FEATURE_GATES="FeatureName=true,FeatureName2=true,FeatureName3=true..."
Warning
Any feature name added to PGO_FEATURE_GATES
must be defined by PGO and must be
set to true or false. Any misconfiguration will prevent PGO from deploying. See
the considerations below for additional guidance.
Custom Sidecar Containers for PostgreSQL Instance Pods
To configure custom sidecar Containers for any of your PostgreSQL instance Pods you will need to enable that feature via the PGO feature gate.
As mentioned above, PGO feature gates are enabled by setting the PGO_FEATURE_GATES
environment variable on the PGO Deployment. For the PostgreSQL instance sidecar container feature, that will be
PGO_FEATURE_GATES="InstanceSidecars=true"
Once this feature is enabled, you can add your custom Containers as an array to spec.instances.containers
. See the custom sidecar example below for more information!
Custom Sidecar Containers for pgBouncer Pods
Similar to your PostgreSQL instance Pods, to configure custom sidecar Containers for your pgBouncer Pods you will need to enable it via the PGO feature gate.
As mentioned above, PGO feature gates are enabled by setting the PGO_FEATURE_GATES
environment variable on the PGO Deployment. For the pgBouncer custom sidecar container feature, that will be
PGO_FEATURE_GATES="PGBouncerSidecars=true"
Once this feature is enabled, you can add your custom Containers as an array to spec.proxy.pgBouncer.containers
. See the custom sidecar example below for more information!
Custom Sidecar Example
As a simple example, consider
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
name: sidecar-hippo
spec:
postgresVersion: 15
instances:
- name: instance1
containers:
- name: testcontainer
image: mycontainer1:latest
- name: testcontainer2
image: mycontainer1:latest
dataVolumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 1Gi
backups:
pgbackrest:
repos:
- name: repo1
volume:
volumeClaimSpec:
accessModes:
- "ReadWriteOnce"
resources:
requests:
storage: 1Gi
proxy:
pgBouncer:
containers:
- name: bouncertestcontainer1
image: mycontainer1:latest
In the above example, we've added two sidecar Containers to the instance1
Pod and one sidecar container to the pgBouncer
Pod. These Containers can be defined in the manifest at any time, but the Containers will not be added to their respective Pods until the feature gate is enabled.
Considerations
- Volume mounts and other Pod details are subject to change between releases.
- The custom sidecar features are currently feature-gated. Any sidecar Containers, as well as any settings included in their configuration, are added and used at your own risk. Improperly configured sidecar Containers could impact the health and/or security of your PostgreSQL cluster!
- When adding a sidecar container, we recommend adding a unique prefix to the container name to avoid potential naming conflicts with the official PGO containers.
Database Initialization SQL
PGO can run SQL for you as part of the cluster creation and initialization process. PGO runs the SQL using the psql client so you can use meta-commands to connect to different databases, change error handling, or set and use variables. Its capabilities are described in the psql documentation.
Initialization SQL ConfigMap
The Postgres cluster spec accepts a reference to a ConfigMap containing your init SQL file. Update your cluster spec to include the ConfigMap name, spec.databaseInitSQL.name
, and the data key, spec.databaseInitSQL.key
, for your SQL file. For example, if you create your ConfigMap with the following command:
kubectl -n postgres-operator create configmap hippo-init-sql --from-file=init.sql=/path/to/init.sql
You would add the following section to your Postgrescluster spec:
spec:
databaseInitSQL:
key: init.sql
name: hippo-init-sql
Info
The ConfigMap must exist in the same namespace as your Postgres cluster.
After you add the ConfigMap reference to your spec, apply the change with kubectl apply -k kustomize/postgres
. PGO will create your hippo
cluster and run your initialization SQL once the cluster has started. You can verify that your SQL has been run by checking the databaseInitSQL
status on your Postgres cluster. While the status is set, your init SQL will not be run again. You can check cluster status with the kubectl describe
command:
kubectl -n postgres-operator describe postgresclusters.postgres-operator.crunchydata.com hippo
Warning
In some cases, due to how Kubernetes treats PostgresCluster status, PGO may run your SQL commands more than once. Please ensure that the commands defined in your init SQL are idempotent.
Now that databaseInitSQL
is defined in your cluster status, verify database objects have been created as expected. After verifying, we recommend removing the spec.databaseInitSQL
field from your spec. Removing the field from the spec will also remove databaseInitSQL
from the cluster status.
PSQL Usage
PGO uses the psql interactive terminal to execute SQL statements in your database. Statements are passed in using standard input and the filename flag (e.g. psql -f -
).
SQL statements are executed as superuser in the default maintenance database. This means you have full control to create database objects, extensions, or run any SQL statements that you might need.
Integration with User and Database Management
If you are creating users or databases, please see the User/Database Management documentation. Databases created through the user management section of the spec can be referenced in your initialization sql. For example, if a database zoo
is defined:
spec:
users:
- name: hippo
databases:
- "zoo"
You can connect to zoo
by adding the following psql
meta-command to your SQL:
\c zoo
create table t_zoo as select s, md5(random()::text) from generate_Series(1,5) s;
Transaction support
By default, psql
commits each SQL command as it completes. To combine multiple commands into a single transaction, use the BEGIN
and COMMIT
commands.
BEGIN;
create table t_random as select s, md5(random()::text) from generate_Series(1,5) s;
COMMIT;
PSQL Exit Code and Database Init SQL Status
The exit code from psql
will determine when the databaseInitSQL
status is set. When psql
returns 0
the status will be set and SQL will not be run again. When psql
returns with an error exit code the status will not be set. PGO will continue attempting to execute the SQL as part of its reconcile loop until psql
returns normally. If psql
exits with a failure, you will need to edit the file in your ConfigMap to ensure your SQL statements will lead to a successful psql
return. The easiest way to make live changes to your ConfigMap is to use the following kubectl edit
command:
kubectl -n postgres-operator edit configmap hippo-init-sql
Be sure to transfer any changes back over to your local file. Another option is to make changes in your local file and use kubectl --dry-run
to create a template and pipe the output into kubectl apply
:
kubectl -n postgres-operator create configmap hippo-init-sql --from-file=init.sql=/path/to/init.sql --dry-run=client -o yaml | kubectl apply -f -
Hint
If you edit your ConfigMap and your changes aren't showing up, you may be waiting
for PGO to reconcile your cluster. After some time, PGO will automatically reconcile
the cluster or you can trigger reconciliation by applying any change to your cluster
(e.g. with kubectl apply -k kustomize/postgres
).
To ensure that psql
returns a failure exit code when your SQL commands fail, set the ON_ERROR_STOP
variable as part of your SQL file:
\set ON_ERROR_STOP
\echo Any error will lead to exit code 3
create table t_random as select s, md5(random()::text) from generate_Series(1,5) s;
Troubleshooting
Changes Not Applied
If your Postgres configuration settings are not present, ensure that you are using the syntax that Postgres expects. You can see this in the Postgres configuration documentation.
Next Steps
You've now seen how you can further customize your Postgres cluster. Let's move on to some administrative tasks you might need to complete while maintaining your Postgres database.