Connection Pooling

Connection pooling can be helpful for scaling and maintaining overall availability between your application and the database. PGO helps facilitate this by supporting the PgBouncer connection pooler and state manager.

Let’s look at how we can a connection pooler and connect it to our application!

Adding a Connection Pooler

Let’s look at how we can add a connection pooler using the kustomize/keycloak example in the Postgres Operator examples repository.

Connection poolers are added using the spec.proxy section of the custom resource. Currently, the only connection pooler supported is PgBouncer.

The only required attribute for adding a PgBouncer connection pooler is to set the spec.proxy.pgBouncer.image attribute. In the kustomize/keycloak/postgres.yaml file, add the following YAML to the spec:

proxy:
  pgBouncer:
    image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbouncer:ubi8-1.16-2

(You can also find an example of this in the kustomize/examples/high-availability example).

Save your changes and run:

kubectl apply -k kustomize/keycloak

PGO will detect the change and create a new PgBouncer Deployment!

That was fairly easy to set up, so now let’s look at how we can connect our application to the connection pooler.

Connecting to a Connection Pooler

When a connection pooler is deployed to the cluster, PGO adds additional information to the user Secrets to allow for applications to connect directly to the connection pooler. Recall that in this example, our user Secret is called keycloakdb-pguser-keycloakdb. Describe the user Secret:

kubectl -n postgres-operator describe secrets keycloakdb-pguser-keycloakdb

You should see that there are several new attributes included in this Secret that allow for you to connect to your Postgres instance via the connection pooler:

  • pgbouncer-host: The name of the host of the PgBouncer connection pooler. This references the Service of the PgBouncer connection pooler.
  • pgbouncer-port: The port that the PgBouncer connection pooler is listening on.
  • pgbouncer-uri: A PostgreSQL connection URI that provides all the information for logging into the Postgres database via the PgBouncer connection pooler.
  • pgbouncer-jdbc-uri: A PostgreSQL JDBC connection URI that provides all the information for logging into the Postgres database via the PgBouncer connection pooler using the JDBC driver. Note that by default, the connection string disable JDBC managing prepared transactions for optimal use with PgBouncer.

Open up the file in kustomize/keycloak/keycloak.yaml. Update the DB_ADDR and DB_PORT values to be the following:

- name: DB_ADDR
  valueFrom: { secretKeyRef: { name: keycloakdb-pguser-keycloakdb, key: pgbouncer-host } }
- name: DB_PORT
  valueFrom: { secretKeyRef: { name: keycloakdb-pguser-keycloakdb, key: pgbouncer-port } }

This changes Keycloak’s configuration so that it will now connect through the connection pooler.

Apply the changes:

kubectl apply -k kustomize/keycloak

Kubernetes will detect the changes and begin to deploy a new Keycloak Pod. When it is completed, Keycloak will now be connected to Postgres via the PgBouncer connection pooler!

TLS

PGO deploys every cluster and component over TLS. This includes the PgBouncer connection pooler. If you are using your own custom TLS setup, you will need to provide a Secret reference for a TLS key / certificate pair for PgBouncer in spec.proxy.pgBouncer.customTLSSecret.

Your TLS certificate for PgBouncer should have a Common Name (CN) setting that matches the PgBouncer Service name. This is the name of the cluster suffixed with -pgbouncer. For example, for our hippo cluster this would be hippo-pgbouncer. For the keycloakdb example, it would be keycloakdb-pgbouncer.

To customize the TLS for PgBouncer, you will need to create a Secret in the Namespace of your Postgres cluster that contains the TLS key (tls.key), TLS certificate (tls.crt) and the CA certificate (ca.crt) to use. The Secret should contain the following values:

data:
  ca.crt: <value>
  tls.crt: <value>
  tls.key: <value>

For example, if you have files named ca.crt, keycloakdb-pgbouncer.key, and keycloakdb-pgbouncer.crt stored on your local machine, you could run the following command:

kubectl create secret generic -n postgres-operator keycloakdb-pgbouncer.tls \
  --from-file=ca.crt=ca.crt \
  --from-file=tls.key=keycloakdb-pgbouncer.key \
  --from-file=tls.crt=keycloakdb-pgbouncer.crt

You can specify the custom TLS Secret in the spec.proxy.pgBouncer.customTLSSecret.name field in your postgrescluster.postgres-operator.crunchydata.com custom resource, e.g.:

spec:
  proxy:
    pgBouncer:
      customTLSSecret:
        name: keycloakdb-pgbouncer.tls

Customizing

The PgBouncer connection pooler is highly customizable, both from a configuration and Kubernetes deployment standpoint. Let’s explore some of the customizations that you can do!

Configuration

PgBouncer configuration can be customized through spec.proxy.pgBouncer.config. After making configuration changes, PGO will roll them out to any PgBouncer instance and automatically issue a “reload”.

There are several ways you can customize the configuration:

  • spec.proxy.pgBouncer.config.global: Accepts key-value pairs that apply changes globally to PgBouncer.
  • spec.proxy.pgBouncer.config.databases: Accepts key-value pairs that represent PgBouncer database definitions.
  • spec.proxy.pgBouncer.config.users: Accepts key-value pairs that represent connection settings applied to specific users.
  • spec.proxy.pgBouncer.config.files: Accepts a list of files that are mounted in the /etc/pgbouncer directory and loaded before any other options are considered using PgBouncer’s include directive.

For example, to set the connection pool mode to transaction, you would set the following configuration:

spec:
  proxy:
    pgBouncer:
      config:
        global:
          pool_mode: transaction

For a reference on PgBouncer configuration please see:

https://www.pgbouncer.org/config.html

Replicas

PGO deploys one PgBouncer instance by default. You may want to run multiple PgBouncer instances to have some level of redundancy, though you still want to be mindful of how many connections are going to your Postgres database!

You can manage the number of PgBouncer instances that are deployed through the spec.proxy.pgBouncer.replicas attribute.

Resources

You can manage the CPU and memory resources given to a PgBouncer instance through the spec.proxy.pgBouncer.resources attribute. The layout of spec.proxy.pgBouncer.resources should be familiar: it follows the same pattern as the standard Kubernetes structure for setting container resources.

For example, let’s say we want to set some CPU and memory limits on our PgBouncer instances. We could add the following configuration:

spec:
  proxy:
    pgBouncer:
      resources:
        limits:
          cpu: 200m
          memory: 128Mi

As PGO deploys the PgBouncer instances using a Deployment these changes are rolled out using a rolling update to minimize disruption between your application and Postgres instances!

Annotations / Labels

You can apply custom annotations and labels to your PgBouncer instances through the spec.proxy.pgBouncer.metadata.annotations and spec.proxy.pgBouncer.metadata.labels attributes respectively. Note that any changes to either of these two attributes take precedence over any other custom labels you have added.

Pod Anti-Affinity / Pod Affinity / Node Affinity

You can control the pod anti-affinity, pod affinity, and node affinity through the spec.proxy.pgBouncer.affinity attribute, specifically:

  • spec.proxy.pgBouncer.affinity.nodeAffinity: controls node affinity for the PgBouncer instances.
  • spec.proxy.pgBouncer.affinity.podAffinity: controls Pod affinity for the PgBouncer instances.
  • spec.proxy.pgBouncer.affinity.podAntiAffinity: controls Pod anti-affinity for the PgBouncer instances.

Each of the above follows the standard Kubernetes specification for setting affinity.

For example, to set a preferred Pod anti-affinity rule for the kustomize/keycloak example, you would want to add the following to your configuration:

spec:
  proxy:
    pgBouncer:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  postgres-operator.crunchydata.com/cluster: keycloakdb
                  postgres-operator.crunchydata.com/role: pgbouncer
              topologyKey: kubernetes.io/hostname

Tolerations

You can deploy PgBouncer instances to Nodes with Taints by setting Tolerations through spec.proxy.pgBouncer.tolerations. This attribute follows the Kubernetes standard tolerations layout.

For example, if there were a set of Nodes with a Taint of role=connection-poolers:NoSchedule that you want to schedule your PgBouncer instances to, you could apply the following configuration:

spec:
  proxy:
    pgBouncer:
      tolerations:
      - effect: NoSchedule
        key: role
        operator: Equal
        value: connection-poolers

Note that setting a toleration does not necessarily mean that the PgBouncer instances will be assigned to Nodes with those taints. Tolerations act as a key: they allow for you to access Nodes. If you want to ensure that your PgBouncer instances are deployed to specific nodes, you need to combine setting tolerations with node affinity.

Pod Spread Constraints

Besides using affinity, anti-affinity and tolerations, you can also set Topology Spread Constraints through spec.proxy.pgBouncer.topologySpreadConstraints. This attribute follows the Kubernetes standard topology spread contraint layout.

For example, since each of of our pgBouncer Pods will have the standard postgres-operator.crunchydata.com/role: pgbouncer Label set, we can use this Label when determining the maxSkew. In the example below, since we have 3 nodes with a maxSkew of 1 and we’ve set whenUnsatisfiable to ScheduleAnyway, we should ideally see 1 Pod on each of the nodes, but our Pods can be distributed less evenly if other constraints keep this from happening.

  proxy:
    pgBouncer:
      replicas: 3
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: my-node-label
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              postgres-operator.crunchydata.com/role: pgbouncer

If you want to ensure that your PgBouncer instances are deployed more evenly (or not deployed at all), you need to update whenUnsatisfiable to DoNotSchedule.

Next Steps

Now that we can enable connection pooling in a cluster, let’s explore some administrative tasks such as manually restarting PostgreSQL using PGO. How do we do that?