Connection Pooling

Connection pooling can be helpful for scaling and maintaining overall availability between your application and the database. PGO helps facilitate this by supporting the PgBouncer connection pooler and state management.

Let's look at how we can add a connection pooler and connect it to our application!

Adding a Connection Pooler

We will explore adding a connection pooler using the kustomize/keycloak example in the Postgres Operator examples repository.

Connection poolers are added using the spec.proxy section of the custom resource. Currently, the only connection pooler supported is PgBouncer.

The only required attribute for adding a PgBouncer connection pooler is spec.proxy.pgBouncer.image. In the kustomize/keycloak/postgres.yaml file, add the following YAML to the spec:

proxy:
  pgBouncer:
    image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbouncer:ubi8-1.22-4

(You can also find an example of this in the kustomize/examples/high-availability example).

Save your changes and run:

kubectl apply -k kustomize/keycloak

PGO will detect the change and create a new PgBouncer Deployment!
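One way to confirm the rollout is to list Deployments filtered by the Labels PGO applies to its PgBouncer components. The cluster and role Label keys below match the ones used later in this tutorial's affinity example; treat the exact selector as an illustrative assumption rather than the only way to check:

kubectl -n postgres-operator get deployments \
  --selector=postgres-operator.crunchydata.com/cluster=keycloakdb,postgres-operator.crunchydata.com/role=pgbouncer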

That was fairly easy to set up, so now let's look at how we can connect our application to the connection pooler.

Connecting to a Connection Pooler

When a connection pooler is deployed to the cluster, PGO adds additional information to the user Secrets to allow applications to connect directly to the connection pooler. Recall that in this example, our user Secret is called keycloakdb-pguser-keycloakdb. Describe the user Secret:

kubectl -n postgres-operator describe secrets keycloakdb-pguser-keycloakdb

You should see that there are several new attributes included in this Secret that allow you to connect to your Postgres instance via the connection pooler:

  • pgbouncer-host: The name of the host of the PgBouncer connection pooler. This references the Service of the PgBouncer connection pooler.
  • pgbouncer-port: The port that the PgBouncer connection pooler is listening on.
  • pgbouncer-uri: A PostgreSQL connection URI that provides all the information for logging into the Postgres database via the PgBouncer connection pooler.
  • pgbouncer-jdbc-uri: A PostgreSQL JDBC connection URI that provides all the information for logging into the Postgres database via the PgBouncer connection pooler using the JDBC driver. Note that by default, the connection string disables JDBC managing prepared transactions for optimal use with PgBouncer.
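For instance, to pull the pgbouncer-uri value out of the Secret and decode it, you could run the following (any of the other keys listed above can be substituted for pgbouncer-uri):

kubectl -n postgres-operator get secret keycloakdb-pguser-keycloakdb \
  --template='{{index .data "pgbouncer-uri" | base64decode}}'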

Open the kustomize/keycloak/keycloak.yaml file. Update the DB_ADDR and DB_PORT values to the following:

- name: DB_ADDR
  valueFrom: { secretKeyRef: { name: keycloakdb-pguser-keycloakdb, key: pgbouncer-host } }
- name: DB_PORT
  valueFrom: { secretKeyRef: { name: keycloakdb-pguser-keycloakdb, key: pgbouncer-port } }

This changes Keycloak's configuration so that it will now connect through the connection pooler.

Apply the changes:

kubectl apply -k kustomize/keycloak

Kubernetes will detect the changes and begin to deploy a new Keycloak Pod. When it is completed, Keycloak will now be connected to Postgres via the PgBouncer connection pooler!
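If you want to watch the rollout happen, you can keep an eye on the Pods; this assumes Keycloak is deployed into the same postgres-operator Namespace used for the rest of this example:

kubectl -n postgres-operator get pods --watch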

TLS

PGO deploys every cluster and component over TLS. This includes the PgBouncer connection pooler. If you are using your own custom TLS setup, you will need to provide a Secret reference for a TLS key / certificate pair for PgBouncer in spec.proxy.pgBouncer.customTLSSecret.

Your TLS certificate for PgBouncer should have a Common Name (CN) setting that matches the PgBouncer Service name. This is the name of the cluster suffixed with -pgbouncer. For example, for a cluster named hippo this would be hippo-pgbouncer; for the keycloakdb cluster used in this example, it would be keycloakdb-pgbouncer.
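If you already have a certificate file on hand, one way to double-check its Common Name before creating the Secret is with openssl; the keycloakdb-pgbouncer.crt file name here simply matches the example further below:

openssl x509 -in keycloakdb-pgbouncer.crt -noout -subject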

To customize the TLS for PgBouncer, you will need to create a Secret in the Namespace of your Postgres cluster that contains the TLS key (tls.key), TLS certificate (tls.crt) and the CA certificate (ca.crt) to use. The Secret should contain the following values:

data:
  ca.crt: $VALUE
  tls.crt: $VALUE
  tls.key: $VALUE

For example, if you have files named ca.crt, keycloakdb-pgbouncer.key, and keycloakdb-pgbouncer.crt stored on your local machine, you could run the following command:

kubectl create secret generic -n postgres-operator keycloakdb-pgbouncer.tls \
  --from-file=ca.crt=ca.crt \
  --from-file=tls.key=keycloakdb-pgbouncer.key \
  --from-file=tls.crt=keycloakdb-pgbouncer.crt

You can specify the custom TLS Secret in the spec.proxy.pgBouncer.customTLSSecret.name field in your postgrescluster.postgres-operator.crunchydata.com custom resource, e.g.:

spec:
  proxy:
    pgBouncer:
      customTLSSecret:
        name: keycloakdb-pgbouncer.tls

Customizing

The PgBouncer connection pooler is highly customizable, both from a configuration and Kubernetes deployment standpoint. Let's explore some of the customizations that you can do!

Configuration

PgBouncer configuration can be customized through spec.proxy.pgBouncer.config. After making configuration changes, PGO will roll them out to any PgBouncer instance and automatically issue a "reload".

There are several ways you can customize the configuration:

  • spec.proxy.pgBouncer.config.global: Accepts key-value pairs that apply changes globally to PgBouncer.
  • spec.proxy.pgBouncer.config.databases: Accepts key-value pairs that represent PgBouncer database definitions.
  • spec.proxy.pgBouncer.config.users: Accepts key-value pairs that represent connection settings applied to specific users.
  • spec.proxy.pgBouncer.config.files: Accepts a list of files that are mounted in the /etc/pgbouncer directory and loaded via PgBouncer's include directive before any other options are considered.

For example, to set the connection pool mode to transaction, you would set the following configuration:

spec:
  proxy:
    pgBouncer:
      config:
        global:
          pool_mode: transaction
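As a further sketch, per-user settings can be layered on in the same way; the keycloakdb user name comes from this tutorial's cluster, and forcing session pooling for that one user is purely illustrative:

spec:
  proxy:
    pgBouncer:
      config:
        users:
          keycloakdb: pool_mode=session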

For a reference on PgBouncer configuration please see:

https://www.pgbouncer.org/config.html

Replicas

PGO deploys one PgBouncer instance by default. You may want to run multiple PgBouncer instances to have some level of redundancy, though you still want to be mindful of how many connections are going to your Postgres database!

You can manage the number of PgBouncer instances that are deployed through the spec.proxy.pgBouncer.replicas attribute.
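For example, to run two PgBouncer instances (two is just an illustrative count), you could set:

spec:
  proxy:
    pgBouncer:
      replicas: 2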

Resources

You can manage the CPU and memory resources given to a PgBouncer instance through the spec.proxy.pgBouncer.resources attribute. The layout of spec.proxy.pgBouncer.resources should be familiar: it follows the same pattern as the standard Kubernetes structure for setting container resources.

For example, let's say we want to set some CPU and memory limits on our PgBouncer instances. We could add the following configuration:

spec:
  proxy:
    pgBouncer:
      resources:
        limits:
          cpu: 200m
          memory: 128Mi

As PGO deploys the PgBouncer instances using a Deployment, these changes are rolled out via a rolling update to minimize disruption between your application and Postgres instances!

Annotations / Labels

You can apply custom annotations and labels to your PgBouncer instances through the spec.proxy.pgBouncer.metadata.annotations and spec.proxy.pgBouncer.metadata.labels attributes, respectively. Note that annotations and labels set here take precedence over any other custom annotations or labels you have added.
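For example, here is a small sketch that adds one custom annotation and one custom label; the key names and values below are made up purely for illustration:

spec:
  proxy:
    pgBouncer:
      metadata:
        annotations:
          example.com/team: database
        labels:
          app.kubernetes.io/component: connection-pooler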

Pod Anti-Affinity / Pod Affinity / Node Affinity

You can control the pod anti-affinity, pod affinity, and node affinity through the spec.proxy.pgBouncer.affinity attribute, specifically:

  • spec.proxy.pgBouncer.affinity.nodeAffinity: controls node affinity for the PgBouncer instances.
  • spec.proxy.pgBouncer.affinity.podAffinity: controls Pod affinity for the PgBouncer instances.
  • spec.proxy.pgBouncer.affinity.podAntiAffinity: controls Pod anti-affinity for the PgBouncer instances.

Each of the above follows the standard Kubernetes specification for setting affinity.

For example, to set a preferred Pod anti-affinity rule for the kustomize/keycloak example, you would want to add the following to your configuration:

spec:
  proxy:
    pgBouncer:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  postgres-operator.crunchydata.com/cluster: keycloakdb
                  postgres-operator.crunchydata.com/role: pgbouncer
              topologyKey: kubernetes.io/hostname

Tolerations

You can deploy PgBouncer instances to Nodes with Taints by setting Tolerations through spec.proxy.pgBouncer.tolerations. This attribute follows the Kubernetes standard tolerations layout.

For example, if there were a set of Nodes with a Taint of role=connection-poolers:NoSchedule that you want to schedule your PgBouncer instances to, you could apply the following configuration:

spec:
  proxy:
    pgBouncer:
      tolerations:
      - effect: NoSchedule
        key: role
        operator: Equal
        value: connection-poolers

Note that setting a toleration does not necessarily mean that the PgBouncer instances will be assigned to Nodes with those Taints. Tolerations act as a key: they allow you to access Nodes that have matching Taints. If you want to ensure that your PgBouncer instances are deployed to specific Nodes, you need to combine setting tolerations with node affinity, as sketched below.
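For example, here is a sketch that combines the toleration above with a required node affinity rule. It assumes the tainted Nodes also carry a role=connection-poolers label, which the Taint itself does not guarantee:

spec:
  proxy:
    pgBouncer:
      tolerations:
      - effect: NoSchedule
        key: role
        operator: Equal
        value: connection-poolers
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: role
                operator: In
                values:
                - connection-poolers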

Pod Spread Constraints

Besides using affinity, anti-affinity, and tolerations, you can also set Topology Spread Constraints through spec.proxy.pgBouncer.topologySpreadConstraints. This attribute follows the Kubernetes standard topology spread constraint layout.

For example, since each of our PgBouncer Pods will have the standard postgres-operator.crunchydata.com/role: pgbouncer Label set, we can use this Label when determining the maxSkew. In the example below, since we have 3 nodes with a maxSkew of 1 and we've set whenUnsatisfiable to ScheduleAnyway, we should ideally see 1 Pod on each of the nodes, but our Pods can be distributed less evenly if other constraints keep this from happening.

proxy:
  pgBouncer:
    replicas: 3
    topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: my-node-label
      whenUnsatisfiable: ScheduleAnyway
      labelSelector:
        matchLabels:
          postgres-operator.crunchydata.com/role: pgbouncer

If you want to ensure that your PgBouncer instances are deployed more evenly (or not deployed at all), you need to update whenUnsatisfiable to DoNotSchedule.

Administration

PgBouncer provides an admin console that can be used to obtain connection pooler statistics and to control the PgBouncer instance via various process control commands.

To access the admin console, you will need to configure the PgBouncer admin_users setting as follows:

proxy:
  pgBouncer:
    config:
      global:
        admin_users: _crunchypgbouncer

With this setting in place, the _crunchypgbouncer system account will now be allowed to connect to the PgBouncer admin console.

To obtain the password for the _crunchypgbouncer user, you can leverage the <clusterName>-pgbouncer Secret.

$ kubectl -n postgres-operator get secret hippo-pgbouncer --template='{{index .data "pgbouncer-password" | base64decode}}'

For PowerShell environments, you need to escape the double quotes around "pgbouncer-password":

$ kubectl -n postgres-operator get secret hippo-pgbouncer --template='{{index .data \"pgbouncer-password\" | base64decode }}'

From there, you can connect to the pgbouncer database and access the PgBouncer admin console:

$ psql -U _crunchypgbouncer -h hippo-pgbouncer.postgres-operator.svc pgbouncer
Password for user _crunchypgbouncer: 
psql (16.3, server 1.22.1/bouncer)
WARNING: psql major version 16, server major version 1.22.
         Some psql features might not work.
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
Type "help" for help.

pgbouncer=# SHOW HELP;
NOTICE:  Console usage
DETAIL:  
	SHOW HELP|CONFIG|DATABASES|POOLS|CLIENTS|SERVERS|USERS|VERSION
	SHOW PEERS|PEER_POOLS
	SHOW FDS|SOCKETS|ACTIVE_SOCKETS|LISTS|MEM|STATE
	SHOW DNS_HOSTS|DNS_ZONES
	SHOW STATS|STATS_TOTALS|STATS_AVERAGES|TOTALS
	SET key = arg
	RELOAD
	PAUSE [<db>]
	RESUME [<db>]
	DISABLE <db>
	ENABLE <db>
	RECONNECT [<db>]
	KILL <db>
	SUSPEND
	SHUTDOWN
	WAIT_CLOSE [<db>]
SHOW

See the PgBouncer Admin Console documentation for additional details about available statistics and process control commands.

Next Steps

Now that we can enable connection pooling in a cluster, let’s explore some ways that we can manage users and databases in our Postgres cluster using PGO.