OpenTelemetry Logging

For an overview of the full observability architecture within CPK, including the architecture for OpenTelemetry logging, see the Database Observability Architecture page. This section describes how to enable OpenTelemetry logging and gives examples for configuring your PostgresClusters to send logging data to a variety of OpenTelemetry-compatible services and backends.

Enabling OpenTelemetry Logging

In order to use OpenTelemetry logging, the OpenTelemetryLogs feature gate must first be enabled in your CPK installation. Please see the Feature Gate Installation Guide for guidance on how to properly enable this feature gate within your installation.

Once the feature gate is enabled, you can create PostgresClusters and PGAdmins with OpenTelemetry logging. To do so, add an instrumentation block to your PostgresCluster or PGAdmin spec. An empty block enables logging with the default configuration:

spec:
  instrumentation: {}

Once applied, you will see OpenTelemetry collector sidecars deployed alongside the various components comprising your PostgresCluster and/or PGAdmin. Additionally, CPK will automatically configure the various components within your PostgresCluster and/or PGAdmin for file-based logging.

Configuration Defaults for OpenTelemetry Logging

When OpenTelemetry logging is enabled, CPK ensures certain logging configurations are set and changes some of the default behavior for the associated components. For instance, to process logs correctly, each component is configured to log to files.

Postgres

When OpenTelemetryLogs is enabled for a PostgresCluster, the following configurations are set by CPK:

logging_collector = 'on'
log_directory = '/pgdata/logs/postgres'
log_destination = 'jsonlog' # Set for Postgres 15 and higher (the first release with jsonlog support), but set to 'csvlog' for 14 and lower
log_rotation_size = '0'
log_truncate_on_rotation = 'on'
log_timezone = 'UTC'

CPK sets those parameters and will override any user attempt to set them.

NOTE: Turning on OpenTelemetry logging moves the Postgres logs from /pgdata/pg##/log to /pgdata/logs/postgres.
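As a rough illustration of the override behavior, the sketch below sets two logging parameters through Patroni's dynamic configuration. It assumes that parameters are supplied via spec.patroni.dynamicConfiguration and that log_min_duration_statement, which is not in the list above, is passed through unchanged; neither assumption comes from this page.

```yaml
spec:
  instrumentation: {}
  patroni:
    dynamicConfiguration:
      postgresql:
        parameters:
          # Not in the managed list above, so assumed to take effect as written.
          log_min_duration_statement: 500
          # In the managed list above, so CPK is expected to replace this value
          # with 'jsonlog' or 'csvlog', depending on the Postgres version.
          log_destination: stderr
```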

pgBackRest

In order to parse the logs correctly, the OpenTelemetry collector expects the default timestamp format for pgBackRest. To keep the timestamp in that form, do not disable or adjust it by setting the no-log-timestamp or log-timestamp=n options.
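A minimal sketch of what this means in practice, assuming pgBackRest options are set through the spec.backups.pgbackrest.global map and that log-level-file is left for you to tune (both are assumptions, not statements from this page):

```yaml
spec:
  instrumentation: {}
  backups:
    pgbackrest:
      global:
        # Illustrative tuning that leaves the timestamp untouched.
        log-level-file: detail
        # Avoid the following option: it removes the timestamp that the
        # collector relies on when parsing pgBackRest logs.
        # log-timestamp: "n"
```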

pgBouncer

When OpenTelemetryLogs is enabled for a PostgresCluster with pgBouncer, CPK sets the logfile:

logfile = "/tmp/pgbouncer.log"

Note that pgBouncer logs to a file in the ephemeral /tmp directory, so any restart of the pgBouncer pod will wipe out previous logs.

pgAdmin4

When OpenTelemetryLogs is enabled for a PGAdmin custom resource with a spec.instrumentation block, CPK makes the following adjustments to the pgAdmin configuration:

DATA_DIR = '/var/lib/pgadmin'
LOG_FILE = '/var/lib/pgadmin/logs/pgadmin.log'
JSON_LOGGER = True
CONSOLE_LOG_LEVEL = logging.WARNING
FILE_LOG_LEVEL = logging.INFO
FILE_LOG_FORMAT_JSON = {'time': 'created', 'name': 'name', 'level': 'levelname', 'message': 'message'}

CPK also changes the Gunicorn settings so that Gunicorn logs to the file /var/lib/pgadmin/logs/gunicorn.log and, like pgAdmin above, logs in JSON format.

NOTE: Only pgAdmins that are deployed using the PGAdmin custom resource can use the OpenTelemetry features. Logs from pgAdmins that are deployed via the PostgresCluster's spec.userInterface configuration will not be collected.
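For reference, a standalone PGAdmin custom resource with logging enabled might look like the sketch below. The resource name rhino, the storage size, and the server group are hypothetical values chosen for illustration; only the instrumentation block is taken from this page.

```yaml
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PGAdmin
metadata:
  name: rhino                 # hypothetical name
  namespace: postgres-operator
spec:
  # An empty block enables OpenTelemetry logging with the defaults.
  instrumentation: {}
  dataVolumeClaimSpec:
    accessModes:
      - "ReadWriteOnce"
    resources:
      requests:
        storage: 1Gi          # illustrative size
  serverGroups:
    - name: supply            # hypothetical group name
      postgresClusterSelector: {}
```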

Log Rotation & Retention

When OpenTelemetry logging is enabled for a PostgresCluster or PGAdmin, CPK will ensure that the components log to file. To ensure that these log files don't become unmanageable, CPK also manages log rotation and retention where possible.

To configure log retention for your PostgresCluster or PGAdmin, fill in the spec.instrumentation.logs.retentionPeriod field on your spec:

spec:
  instrumentation:
    logs:
      retentionPeriod: 2d

This retentionPeriod field can be an RFC 3339 duration or a number and unit; the minimum unit is an hour, and the maximum unit is a week.

The different components of a PostgresCluster or PGAdmin manage their rotation differently, so this setting is approximate. CPK will always retain at least the specified amount, but sometimes more will be retained.

NOTE: Patroni logs are not rotated by age, but by size. This can be set independently in the spec.patroni.logging.storageLimit field. If that field is left blank, CPK will default to 25M, which is the minimum limit for Patroni log storage. See our guide to customizing Postgres instance logs for more detail on this field.
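Putting the two settings together, a sketch of a spec that keeps roughly three days of logs and gives Patroni a larger size budget could look like this. The specific values are illustrative only:

```yaml
spec:
  instrumentation:
    logs:
      # Age-based retention for the components that rotate by time.
      retentionPeriod: 3d
  patroni:
    logging:
      # Size-based limit for Patroni logs; 25M is the documented minimum.
      storageLimit: 50M
```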

Configuring Exporters

When you first turn on OpenTelemetry logging in CPK with no additional configuration, the collected logs are sent to the Debug Exporter, which writes them to the console. Because the collector runs in a sidecar container in a Kubernetes Pod, that console output becomes part of the container logs, which you can retrieve with the kubectl logs command. For example, if you were running a logging-enabled PostgresCluster named hippo in the postgres-operator namespace and wanted to see the postgres, patroni, and pgbackrest logs from the primary instance pod, you could retrieve them like this:

PG_CLUSTER_PRIMARY_POD=$(kubectl get pod -n postgres-operator -o name -l postgres-operator.crunchydata.com/cluster=hippo,postgres-operator.crunchydata.com/role=master)
kubectl -n postgres-operator logs "${PG_CLUSTER_PRIMARY_POD}" -c collector

However, this output is hard to read and cannot easily be filtered or searched. You will therefore almost certainly want to export your logs to a dedicated backend or service where you can search and read through them more easily.

The OpenTelemetry Collector that we use ships with many built-in exporters, which should satisfy most needs.

To use an exporter, you define it in the instrumentation.config.exporters section. Fields in this section should follow the type[/name] "component identifier" format where the type is the exporter you want to use and name is the optional name you want to give this configuration. The optional name allows you to define multiple configurations of the same exporter type. For example, you could have two configurations of the otlp exporter where one is called otlp and the second is called otlp/2. The value for each field is the configuration for that exporter. For example:

spec:
  instrumentation:
    config:
      exporters:
        otlp:
          endpoint: a-standalone-collector:4317
          tls:
            insecure: true
        otlp/2:
          endpoint: another-collector:4317

The configuration you define will differ depending on the exporter you are using. Please follow the documentation for your chosen exporter to determine what configuration to provide. OpenTelemetry keeps a list of exporters that are specific to the "contrib" collector, along with their documentation. There are also exporters that have their documentation in the base collector repo, but are also available in the "contrib" collector that we use.

Once the exporter is configured, the last step is to tell the collector to use it in the logging pipeline by adding the exporter's name to the instrumentation.logs.exporters array. For example:

spec:
  instrumentation:
    config:
      exporters:
        otlp/1:
          endpoint: a-standalone-collector:4317
          tls:
            insecure: true
    logs:
      exporters: ['otlp/1']
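Because logs.exporters is an array, it should also be possible to send the same logs to more than one destination. The following sketch defines two named otlp configurations and lists both in the pipeline; the endpoints are placeholders, and fan-out to several exporters is assumed to work as it does in an upstream collector pipeline.

```yaml
spec:
  instrumentation:
    config:
      exporters:
        otlp:
          endpoint: a-standalone-collector:4317   # placeholder endpoint
          tls:
            insecure: true
        otlp/2:
          endpoint: another-collector:4317        # placeholder endpoint
    logs:
      # Each name must match a key defined under config.exporters.
      exporters: ['otlp', 'otlp/2']
```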

You will find more examples of exporter configurations for commonly used logging backends in the Example Exporter Configurations section below.

Files

Some exporters might require that configuration be provided via files, such as separate config files, certificates, etc. This can be done via the instrumentation.config.files section, which allows you to project files that are in Kubernetes Secrets, ConfigMaps, etc., into the volume that is mounted into the collector container. For example, creating a Secret with the following command:

kubectl -n postgres-operator create secret tls some-otel-exporter-certs --cert=server.crt --key=server.key

And then adding the following to your instrumentation spec:

spec:
  instrumentation:
    config:
      files:
        - secret:
            name: some-otel-exporter-certs

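Will result in the Secret's tls.crt and tls.key files (the key names that kubectl create secret tls assigns to the certificate and key, regardless of the local file names) being mounted in the /etc/otel-collector directory of the collector container.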
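An exporter can then point at the mounted files. The sketch below is an assumption-laden example: kubectl create secret tls stores the certificate and key under the keys tls.crt and tls.key, so those file names are used here, and the endpoint is a placeholder. The cert_file and key_file settings are the collector's standard client TLS options.

```yaml
spec:
  instrumentation:
    config:
      files:
        - secret:
            name: some-otel-exporter-certs
      exporters:
        otlp:
          endpoint: a-standalone-collector:4317   # placeholder endpoint
          tls:
            # Client certificate and key projected from the Secret above.
            cert_file: /etc/otel-collector/tls.crt
            key_file: /etc/otel-collector/tls.key
    logs:
      exporters: ['otlp']
```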

Batch Size

Between the collection of logs by the receiver components and their export by the exporter components, OpenTelemetry allows the data to be transformed by "processor" components. One of the processors that we use in our implementation is the Batch Processor, which groups records into batches so that the data compresses better and fewer network connections are needed to export it.

The size of the batches and how often they are sent is determined by three different settings:

  • maxDelay - The maximum time to wait before exporting a batch, regardless of the batch's size. Higher numbers allow more records to be deduplicated and compressed before export.
  • maxRecords - The maximum number of records to include in an exported batch. When present, batches this size are sent without any further delay.
  • minRecords - The number of records to wait for before exporting a batch. Higher numbers allow more records to be deduplicated and compressed before export.

By default, maxRecords is not set and the other two settings are configured as follows:

  • maxDelay = 200ms
  • minRecords = 8192

You can configure these settings to your liking via the spec.instrumentation.logs.batches section. For example:

spec:
  instrumentation:
    logs:
      batches:
        maxDelay: 1s
        maxRecords: 16384
        minRecords: 8192

If you wish to turn batching off entirely, you must set both maxDelay and minRecords to zero:

spec:
  instrumentation:
    logs:
      batches:
        maxDelay: 0s
        minRecords: 0
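To make the interaction of the three settings concrete, here is an illustrative sketch for a low-volume cluster where a few seconds of delay is acceptable. The numbers are arbitrary; the comments restate the semantics described above.

```yaml
spec:
  instrumentation:
    logs:
      batches:
        # Export whatever has accumulated once 5 seconds have passed,
        # even if fewer than minRecords are buffered.
        maxDelay: 5s
        # Export as soon as 1000 records are buffered.
        minRecords: 1000
        # Never put more than 2000 records into a single exported batch.
        maxRecords: 2000
```

With these values, a burst of 5000 records would presumably go out as several batches of at most 2000 records each, while a quiet period would produce one small batch every five seconds.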

Resource Detection

Another processor that we incorporate into our logs pipelines is the Resource Detection Processor, which can detect resource information from the host and add it as metadata to log records. The full list of supported detectors can be found in the processor's documentation.

You can configure one or more detectors via the spec.instrumentation.config.detectors array, where each entry has a name field that indicates which detector to use, and an optional attributes field, where you can specify particular attributes that you wish to turn on or off. For example, if you were running CPK in an Azure Kubernetes Service cluster, you might configure this section like so:

spec:
  instrumentation:
    config:
      detectors:
        - name: aks
          attributes:
            k8s.cluster.name: true

See the Resource Detection Processor documentation for more details on the various detectors and their particular attributes.
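Since detectors is an array, more than one detector can be listed. The sketch below combines two detectors named in the upstream processor documentation, eks and system, turning one attribute on and another off. The attribute names are taken from the upstream documentation as best understood and should be verified there before use.

```yaml
spec:
  instrumentation:
    config:
      detectors:
        - name: eks
          attributes:
            # Assumed to be off by default upstream; enabled explicitly here.
            k8s.cluster.name: true
        - name: system
          attributes:
            # Assumed to be on by default upstream; disabled explicitly here.
            os.type: false
```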

Example Exporter Configurations

This section provides example configurations for a variety of different OpenTelemetry-compatible logging services and backends.

Google Cloud

apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: otel-hippo
  namespace: postgres-operator
spec:
  instrumentation:
    config:
      detectors:
        - name: gcp
      exporters:
        # https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/googlecloudexporter#configuration-reference
        googlecloud:
          log:
            default_log_name: "collector-exported-log"
            resource_filters:
              - prefix: "k8s"
              - prefix: "db"
    logs:
      exporters: ['googlecloud']

OTLP

apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: otel-hippo
  namespace: postgres-operator
spec:
  instrumentation:
    config:
      exporters:
        # https://github.com/open-telemetry/opentelemetry-collector/tree/main/exporter/otlpexporter#getting-started
        otlp: # for exporting to another collector
          endpoint: "otel-collector:4317"
          tls:
            insecure: true
    logs:
      exporters: ['otlp']
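Some backends accept OTLP over HTTP rather than gRPC. For those, the collector's otlphttp exporter can be used in the same way. The sketch below mirrors the OTLP example above; the endpoint URL is hypothetical, and any authentication your backend requires is left out.

```yaml
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: otel-hippo
  namespace: postgres-operator
spec:
  instrumentation:
    config:
      exporters:
        otlphttp: # for exporting to an OTLP/HTTP endpoint
          endpoint: "https://otlp.example.com:4318"   # hypothetical URL
    logs:
      exporters: ['otlphttp']
```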