OpenTelemetry Logging

For an overview of the full observability architecture within CPK, including the architecture for OpenTelemetry logging, see the Database Observability Architecture page. This section provides steps for enabling OpenTelemetry logging, along with examples for configuring your PostgresClusters to send logging data to a variety of OpenTelemetry-compatible services and backends.

Enabling OpenTelemetry Logging

To use OpenTelemetry logging, the OpenTelemetryLogs feature gate must first be enabled in your CPK installation. See the Feature Gate Installation Guide for guidance on enabling this feature gate within your installation.
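
As a rough sketch of what enabling the gate can look like, feature gates are commonly passed to the operator through the PGO_FEATURE_GATES environment variable on the operator Deployment. The container name and the surrounding layout below are assumptions about a typical installation, not taken from this page; the Feature Gate Installation Guide remains the authoritative reference for your install method.

```yaml
# Fragment of the operator Deployment (layout and container name are assumed)
spec:
  template:
    spec:
      containers:
        - name: operator               # assumed container name
          env:
            - name: PGO_FEATURE_GATES
              value: "OpenTelemetryLogs=true"
```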

Once the feature gate is enabled, you can create PostgresClusters and PGAdmins with OpenTelemetry logging. To do so, add an instrumentation block to your PostgresCluster or PGAdmin spec. An empty block enables logging with the default configuration:

spec:
  instrumentation: {}

Once applied, you will see OpenTelemetry collector sidecars deployed alongside the components that make up your PostgresCluster and/or PGAdmin. Additionally, CPK will automatically configure those components for file-based logging.
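
For context, the fragment below sketches where the instrumentation block sits in a PostgresCluster manifest. The cluster name is hypothetical, and the other fields a PostgresCluster requires are deliberately left out, so this is not a complete manifest.

```yaml
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: hippo              # hypothetical cluster name
spec:
  # Other required fields (Postgres version, instances, storage, ...) omitted.
  instrumentation: {}      # empty block: OpenTelemetry logging with defaults
```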

Configuration Defaults for OpenTelemetry Logging

When OpenTelemetry logging is enabled, CPK ensures certain logging configurations are set and changes some of the default behavior for the associated components. For instance, to process logs correctly, each component is configured to log to files.

Postgres

When OpenTelemetryLogs is enabled for a PostgresCluster, the following configurations are set by CPK:

logging_collector = 'on'
log_directory = '/pgdata/logs/postgres'
log_destination = 'jsonlog' # Set for Postgres 16 and higher, but set to 'csvlog' for 15 and lower
log_rotation_size = '0'
log_truncate_on_rotation = 'on'
log_timezone = 'UTC'

CPK sets these parameters and overrides any user-supplied values for them.
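
To illustrate the override behavior, the sketch below assumes Postgres parameters are supplied through spec.patroni.dynamicConfiguration, which is one common place to set them and is not described on this page. The parameter values are illustrative only.

```yaml
spec:
  instrumentation: {}
  patroni:
    dynamicConfiguration:
      postgresql:
        parameters:
          # Not in the list above, so assumed to be left as written.
          log_min_duration_statement: 1000
          # In the list above, so CPK replaces it with /pgdata/logs/postgres.
          log_directory: /pgdata/custom-logs
```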

NOTE: Turning on OpenTelemetry logging moves the Postgres logs from /pgdata/pg##/log to /pgdata/logs/postgres.

pgBackRest

To parse the logs correctly, the OpenTelemetry collector expects pgBackRest's default timestamp format. To keep the timestamp in that form, do not disable or alter it by setting the no-log-timestamp or log-timestamp=n options.
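
As a hedged illustration, the fragment below assumes pgBackRest options are passed through spec.backups.pgbackrest.global (a location not stated on this page). It changes the file log level while leaving the timestamp options alone; the commented-out line shows the kind of setting to avoid.

```yaml
spec:
  backups:
    pgbackrest:
      global:
        log-level-file: detail   # changes verbosity only; timestamp format is untouched
        # log-timestamp: "n"     # avoid: removes the timestamp the collector expects
```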

NOTE: When pgBackRest logging is turned up to debug or higher, Kubernetes may rotate the files storing OTel collector console output, hiding that output from kubectl logs. As a result, kubectl logs may return a blank result for the collector container until another pgBackRest process (such as a manual backup) is triggered.

pgBouncer

When OpenTelemetryLogs is enabled for a PostgresCluster with pgBouncer, CPK sets the logfile:

logfile = "/tmp/pgbouncer.log"

Note that pgBouncer logs to a file in the ephemeral /tmp directory, so any restart of the pgBouncer pod will wipe out previous logs.

pgAdmin4

When OpenTelemetryLogs is enabled for a PGAdmin custom resource with a spec.instrumentation block, CPK makes the following adjustments to the pgAdmin configuration:

DATA_DIR = '/var/lib/pgadmin'
LOG_FILE = '/var/lib/pgadmin/logs/pgadmin.log'
JSON_LOGGER = True
CONSOLE_LOG_LEVEL = logging.WARNING
FILE_LOG_LEVEL = logging.INFO
FILE_LOG_FORMAT_JSON = {'time': 'created', 'name': 'name', 'level': 'levelname', 'message': 'message'}

CPK also changes the Gunicorn settings so that it logs to the file /var/lib/pgadmin/logs/gunicorn.log and, like pgAdmin above, logs in JSON format.

NOTE: Only pgAdmins that are deployed using the PGAdmin custom resource can use the OpenTelemetry features. Logs from pgAdmins that are deployed via the PostgresCluster's spec.userInterface configuration will not be collected.
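
For comparison with the PostgresCluster case, here is a minimal sketch of a PGAdmin custom resource carrying the instrumentation block. The resource name is hypothetical, and the other fields a PGAdmin requires are omitted.

```yaml
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PGAdmin
metadata:
  name: rhino              # hypothetical PGAdmin name
spec:
  # Other required fields (such as the data volume) omitted.
  instrumentation: {}      # logs from this pgAdmin are collected
```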

Log Rotation & Retention

When OpenTelemetry logging is enabled for a PostgresCluster or PGAdmin, CPK will ensure that the components log to file. To ensure that these log files don't become unmanageable, CPK also manages log rotation and retention where possible.

To configure log retention for your PostgresCluster or PGAdmin, fill in the spec.instrumentation.logs.retentionPeriod field on your spec:

spec:
  instrumentation:
    logs:
      retentionPeriod: 2d

The retentionPeriod field accepts either an RFC 3339 duration or a number followed by a unit; the smallest unit is an hour, and the largest unit is a week.
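
The 2d value shown above uses the number-and-unit form. As a sketch of the other form, P2D is the RFC 3339 (Appendix A) spelling of a two-day duration; that CPK treats it identically to 2d is an assumption here, not something this page confirms.

```yaml
spec:
  instrumentation:
    logs:
      retentionPeriod: P2D   # assumed RFC 3339 duration equivalent of "2d"
```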

The components of a PostgresCluster or PGAdmin each manage rotation differently, so this setting is approximate. CPK will always retain logs for at least the specified period, but sometimes retains more.

NOTE: Patroni logs are not rotated by age, but by size. This can be set independently in the spec.patroni.logging.storageLimit field. If that field is left blank, CPK will default to 25M, which is the minimum limit for Patroni log storage. See our guide to customizing Postgres instance logs for more detail on this field.
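
A minimal sketch of setting the two limits side by side is shown below. The 50M value is purely illustrative; it was chosen only because it sits above the 25M minimum mentioned in the note.

```yaml
spec:
  patroni:
    logging:
      storageLimit: 50M      # size-based limit for Patroni logs (illustrative value)
  instrumentation:
    logs:
      retentionPeriod: 2d    # age-based retention for the other components
```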

Batch Size

Between collecting logs with receiver components and exporting them with exporter components, OpenTelemetry can transform the data using "processor" components. One of the processors used in our implementation is the Batch Processor, which compresses the data and reduces the number of network connections needed to export it.

The size of the batches and how often they are sent are determined by three settings:

  • maxDelay - The maximum time to wait before exporting a batch, regardless of the batch's size. Higher numbers allow more records to be deduplicated and compressed before export.
  • maxRecords - The maximum number of records to include in an exported batch. When present, batches this size are sent without any further delay.
  • minRecords - The number of records to wait for before exporting a batch. Higher numbers allow more records to be deduplicated and compressed before export.

By default, maxRecords is not set, and the other two settings are configured as follows:

  • maxDelay = 200ms
  • minRecords = 8192

You can configure these settings to your liking via the spec.instrumentation.logs.batches section. For example:

spec:
  instrumentation:
    logs:
      batches:
        maxDelay: 1s
        maxRecords: 16384
        minRecords: 8192

If you wish to turn batching off entirely, you must set both maxDelay and minRecords to zero:

spec:
  instrumentation:
    logs:
      batches:
        maxDelay: 0s
        minRecords: 0
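
As a further illustration of how the three settings interact, the sketch below favors lower latency over compression: a batch is exported once 1024 records are waiting or 100ms have passed, whichever comes first, and no batch exceeds 2048 records. The specific numbers are illustrative assumptions, not recommendations.

```yaml
spec:
  instrumentation:
    logs:
      batches:
        maxDelay: 100ms      # export at least this often, regardless of batch size
        minRecords: 1024     # export as soon as this many records are waiting
        maxRecords: 2048     # upper bound on records per exported batch
```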

Resource Detection

Another processor that we incorporate into our logs pipelines is the Resource Detection Processor, which can detect resource information from the host and add it as metadata to log records. The full list of supported detectors can be found in the processor's documentation.

You can configure one or more detectors via the spec.instrumentation.config.detectors array, where each entry has a name field that indicates which detector to use, and an optional attributes field, where you can specify particular attributes that you wish to turn on or off. For example, if you were running CPK in an Azure Kubernetes Service cluster, you might configure this section like so:

spec:
  instrumentation:
    config:
      detectors:
        - name: aks
          attributes:
            k8s.cluster.name: true

See the Resource Detection Processor documentation for more details on the various detectors and their particular attributes.
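
Pulling together the settings covered in the sections above, the following sketch shows how retention, batching, and resource detection can sit in a single instrumentation block. It reuses the example values from this page and is only one possible combination.

```yaml
spec:
  instrumentation:
    logs:
      retentionPeriod: 2d          # age-based log retention
      batches:
        maxDelay: 1s
        maxRecords: 16384
        minRecords: 8192
    config:
      detectors:
        - name: aks                # Azure Kubernetes Service detector
          attributes:
            k8s.cluster.name: true # turn this attribute on
```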

Configuring Exporters

For documentation on configuring exporters, please see the OpenTelemetry Exporters section.