Connection Pooling
Connection pooling can be helpful for scaling and maintaining overall availability between your application and the database. PGO helps facilitate this by supporting the PgBouncer connection pooler and state manager.
Let’s look at how we can a connection pooler and connect it to our application!
Adding a Connection Pooler
Let’s look at how we can add a connection pooler using the kustomize/keycloak
example in the Postgres Operator examples repository.
Connection poolers are added using the spec.proxy
section of the custom resource. Currently, the only connection pooler supported is PgBouncer.
The only required attribute for adding a PgBouncer connection pooler is to set the spec.proxy.pgBouncer.image
attribute. In the kustomize/keycloak/postgres.yaml
file, add the following YAML to the spec:
proxy:
pgBouncer:
image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbouncer:centos8-1.16-0
(You can also find an example of this in the kustomize/examples/high-availability
example).
Save your changes and run:
kubectl apply -k kustomize/keycloak
PGO will detect the change and create a new PgBouncer Deployment!
That was fairly easy to set up, so now let’s look at how we can connect our application to the connection pooler.
Connecting to a Connection Pooler
When a connection pooler is deployed to the cluster, PGO adds additional information to the user Secrets to allow for applications to connect directly to the connection pooler. Recall that in this example, our user Secret is called keycloakdb-pguser-keycloakdb
. Describe the user Secret:
kubectl -n postgres-operator describe secrets keycloakdb-pguser-keycloakdb
You should see that there are several new attributes included in this Secret that allow for you to connect to your Postgres instance via the connection pooler:
pgbouncer-host
: The name of the host of the PgBouncer connection pooler. This references the Service of the PgBouncer connection pooler.pgbouncer-port
: The port that the PgBouncer connection pooler is listening on.pgbouncer-uri
: A PostgreSQL connection URI that provides all the information for logging into the Postgres database via the PgBouncer connection pooler.pgbouncer-jdbc-uri
: A PostgreSQL JDBC connection URI that provides all the information for logging into the Postgres database via the PgBouncer connection pooler using the JDBC driver. Note that by default, the connection string disable JDBC managing prepared transactions for optimal use with PgBouncer.
Open up the file in kustomize/keycloak/keycloak.yaml
. Update the DB_ADDR
and DB_PORT
values to be the following:
- name: DB_ADDR
valueFrom: { secretKeyRef: { name: keycloakdb-pguser-keycloakdb, key: pgbouncer-host } }
- name: DB_PORT
valueFrom: { secretKeyRef: { name: keycloakdb-pguser-keycloakdb, key: pgbouncer-port } }
This changes Keycloak’s configuration so that it will now connect through the connection pooler.
Apply the changes:
kubectl apply -k kustomize/keycloak
Kubernetes will detect the changes and begin to deploy a new Keycloak Pod. When it is completed, Keycloak will now be connected to Postgres via the PgBouncer connection pooler!
TLS
PGO deploys every cluster and component over TLS. This includes the PgBouncer connection pooler. If you are using your own custom TLS setup, you will need to provide a Secret reference for a TLS key / certificate pair for PgBouncer in spec.proxy.pgBouncer.customTLSSecret
.
Your TLS certificate for PgBouncer should have a Common Name (CN) setting that matches the PgBouncer Service name. This is the name of the cluster suffixed with -pgbouncer
. For example, for our hippo
cluster this would be hippo-pgbouncer
. For the keycloakdb
example, it would be keycloakdb-pgbouncer
.
To customize the TLS for PgBouncer, you will need to create a Secret in the Namespace of your Postgres cluster that contains the TLS key (tls.key
), TLS certificate (tls.crt
) and the CA certificate (ca.crt
) to use. The Secret should contain the following values:
data:
ca.crt: <value>
tls.crt: <value>
tls.key: <value>
For example, if you have files named ca.crt
, keycloakdb-pgbouncer.key
, and keycloakdb-pgbouncer.crt
stored on your local machine, you could run the following command:
kubectl create secret generic -n postgres-operator keycloakdb-pgbouncer.tls \
--from-file=ca.crt=ca.crt \
--from-file=tls.key=keycloakdb-pgbouncer.key \
--from-file=tls.crt=keycloakdb-pgbouncer.crt
You can specify the custom TLS Secret in the spec.proxy.pgBouncer.customTLSSecret.name
field in your postgrescluster.postgres-operator.crunchydata.com
custom resource, e.g.:
spec:
proxy:
pgBouncer:
customTLSSecret:
name: keycloakdb-pgbouncer.tls
Customizing
The PgBouncer connection pooler is highly customizable, both from a configuration and Kubernetes deployment standpoint. Let’s explore some of the customizations that you can do!
Configuration
PgBouncer configuration can be customized through spec.proxy.pgBouncer.config
. After making configuration changes, PGO will roll them out to any PgBouncer instance and automatically issue a “reload”.
There are several ways you can customize the configuration:
spec.proxy.pgBouncer.config.global
: Accepts key-value pairs that apply changes globally to PgBouncer.spec.proxy.pgBouncer.config.databases
: Accepts key-value pairs that represent PgBouncer database definitions.spec.proxy.pgBouncer.config.users
: Accepts key-value pairs that represent connection settings applied to specific users.spec.proxy.pgBouncer.config.files
: Accepts a list of files that are mounted in the/etc/pgbouncer
directory and loaded before any other options are considered using PgBouncer’s include directive.
For example, to set the connection pool mode to transaction
, you would set the following configuration:
spec:
proxy:
pgBouncer:
config:
global:
pool_mode: transaction
For a reference on PgBouncer configuration please see:
https://www.pgbouncer.org/config.html
Replicas
PGO deploys one PgBouncer instance by default. You may want to run multiple PgBouncer instances to have some level of redundancy, though you still want to be mindful of how many connections are going to your Postgres database!
You can manage the number of PgBouncer instances that are deployed through the spec.proxy.pgBouncer.replicas
attribute.
Resources
You can manage the CPU and memory resources given to a PgBouncer instance through the spec.proxy.pgBouncer.resources
attribute. The layout of spec.proxy.pgBouncer.resources
should be familiar: it follows the same pattern as the standard Kubernetes structure for setting container resources.
For example, let’s say we want to set some CPU and memory limits on our PgBouncer instances. We could add the following configuration:
spec:
proxy:
pgBouncer:
resources:
limits:
cpu: 200m
memory: 128Mi
As PGO deploys the PgBouncer instances using a Deployment these changes are rolled out using a rolling update to minimize disruption between your application and Postgres instances!
Annotations / Labels
You can apply custom annotations and labels to your PgBouncer instances through the spec.proxy.pgBouncer.metadata.annotations
and spec.proxy.pgBouncer.metadata.labels
attributes respectively. Note that any changes to either of these two attributes take precedence over any other custom labels you have added.
Pod Anti-Affinity / Pod Affinity / Node Affinity
You can control the pod anti-affinity, pod affinity, and node affinity through the spec.proxy.pgBouncer.affinity
attribute, specifically:
spec.proxy.pgBouncer.affinity.nodeAffinity
: controls node affinity for the PgBouncer instances.spec.proxy.pgBouncer.affinity.podAffinity
: controls Pod affinity for the PgBouncer instances.spec.proxy.pgBouncer.affinity.podAntiAffinity
: controls Pod anti-affinity for the PgBouncer instances.
Each of the above follows the standard Kubernetes specification for setting affinity.
For example, to set a preferred Pod anti-affinity rule for the kustomize/keycloak
example, you would want to add the following to your configuration:
spec:
proxy:
pgBouncer:
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
podAffinityTerm:
labelSelector:
matchLabels:
postgres-operator.crunchydata.com/cluster: keycloakdb
postgres-operator.crunchydata.com/role: pgbouncer
topologyKey: kubernetes.io/hostname
Tolerations
You can deploy PgBouncer instances to Nodes with Taints by setting Tolerations through spec.proxy.pgBouncer.tolerations
. This attribute follows the Kubernetes standard tolerations layout.
For example, if there were a set of Nodes with a Taint of role=connection-poolers:NoSchedule
that you want to schedule your PgBouncer instances to, you could apply the following configuration:
spec:
proxy:
pgBouncer:
tolerations:
- effect: NoSchedule
key: role
operator: Equal
value: connection-poolers
Note that setting a toleration does not necessarily mean that the PgBouncer instances will be assigned to Nodes with those taints. Tolerations act as a key: they allow for you to access Nodes. If you want to ensure that your PgBouncer instances are deployed to specific nodes, you need to combine setting tolerations with node affinity.
Pod Spread Constraints
Besides using affinity, anti-affinity and tolerations, you can also set Topology Spread Constraints through spec.proxy.pgBouncer.topologySpreadConstraints
. This attribute follows the Kubernetes standard topology spread contraint layout.
For example, since each of of our pgBouncer Pods will have the standard postgres-operator.crunchydata.com/role: pgbouncer
Label set, we can use this Label when determining the maxSkew
. In the example below, since we have 3 nodes with a maxSkew
of 1 and we’ve set whenUnsatisfiable
to ScheduleAnyway
, we should ideally see 1 Pod on each of the nodes, but our Pods can be distributed less evenly if other constraints keep this from happening.
proxy:
pgBouncer:
replicas: 3
topologySpreadConstraints:
- maxSkew: 1
topologyKey: my-node-label
whenUnsatisfiable: ScheduleAnyway
labelSelector:
matchLabels:
postgres-operator.crunchydata.com/role: pgbouncer
If you want to ensure that your PgBouncer instances are deployed more evenly (or not deployed at all), you need to update whenUnsatisfiable
to DoNotSchedule
.
Next Steps
Now that we can enable connection pooling in a cluster, let’s explore some administrative tasks such as manually restarting PostgreSQL using PGO. How do we do that?