OpenTelemetry Logging
For an overview of the full observability architecture within CPK, including details about the architecture for OpenTelemetry logging, please see the Database Observability Architecture page. This section will provide steps for enabling OpenTelemetry logging, along with examples for configuring your PostgresClusters to send logging data to a variety of different OpenTelemetry-compatible services and backends.
Enabling OpenTelemetry Logging
In order to use OpenTelemetry logging, the OpenTelemetryLogs feature gate must first be enabled in your CPK installation. Please see the Feature Gate Installation Guide for guidance on how to properly enable this feature gate within your installation.
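The exact mechanism depends on how CPK was installed. As a minimal sketch, assuming a default installation where the operator Deployment is named pgo and feature gates are passed through the PGO_FEATURE_GATES environment variable, enabling the gate could look like this:

# Sketch only: set the feature gate on the operator Deployment.
# The Deployment name "pgo" and the PGO_FEATURE_GATES variable are
# assumptions based on a default installation; follow the Feature Gate
# Installation Guide for the method that matches your environment.
kubectl -n postgres-operator set env deployment/pgo PGO_FEATURE_GATES="OpenTelemetryLogs=true"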
Once the feature gate is enabled, you will be able to create PostgresClusters and PGAdmins with OpenTelemetry logging. To do that, add an instrumentation block to your PostgresCluster or PGAdmin spec; for a default-only configuration, an empty block is all you need:
spec:
  instrumentation: {}
Once applied, you will see OpenTelemetry collector sidecars deployed alongside the various components comprising your PostgresCluster and/or PGAdmin. Additionally, CPK will automatically configure the various components within your PostgresCluster and/or PGAdmin for file-based logging.
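To confirm that the sidecars are present, you can list each pod's containers; using the hippo cluster in the postgres-operator namespace from the examples below, a quick check looks like this:

# List the containers in each pod of the hippo cluster; once logging is
# enabled, a "collector" container appears alongside the existing ones.
kubectl -n postgres-operator get pods \
  -l postgres-operator.crunchydata.com/cluster=hippo \
  -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.spec.containers[*].name}{"\n"}{end}'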
Configuration Defaults for OpenTelemetry Logging
When OpenTelemetry logging is enabled, CPK ensures certain logging configurations are set and changes some of the default behavior for the associated components. For instance, to process logs correctly, each component is configured to log to files.
Postgres
When OpenTelemetryLogs is enabled for a PostgresCluster, the following configurations are set by CPK:
logging_collector = 'on'
log_directory = '/pgdata/logs/postgres'
log_destination = 'jsonlog' # Set for Postgres 16 and higher, but set to 'csvlog' for 15 and lower
log_rotation_size = '0'
log_truncate_on_rotation = 'on'
log_timezone = 'UTC'
CPK sets these parameters and will override any user attempt to change them.
NOTE: By turning on OpenTelemetry logging, the location of the Postgres logs will move from /pgdata/pg##/log to /pgdata/logs/postgres.
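If you want to confirm these settings on a running cluster, a minimal check, assuming the Postgres container is named database and using the hippo example cluster from later in this page, could be:

# Sketch: show the effective logging settings on the primary instance.
# The container name "database" is an assumption based on a standard
# CPK instance pod.
PG_PRIMARY=$(kubectl get pod -n postgres-operator -o name -l postgres-operator.crunchydata.com/cluster=hippo,postgres-operator.crunchydata.com/role=master)
kubectl -n postgres-operator exec "${PG_PRIMARY}" -c database -- \
  psql -c "SELECT name, setting FROM pg_settings WHERE name IN ('logging_collector', 'log_directory', 'log_destination')"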
pgBackRest
In order to parse the logs correctly, the OpenTelemetry collector expects pgBackRest's default timestamp format. To make sure that the timestamps are in the correct form, do not disable or adjust them by setting the no-log-timestamp or log-timestamp=n options.
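As a hedged illustration, a pgBackRest option like the following, shown here under the spec.backups.pgbackrest.global section where global pgBackRest options are usually set, is the kind of setting to avoid:

# Do NOT set this when OpenTelemetry logging is enabled: removing the
# timestamp prevents the collector from parsing pgBackRest log lines.
spec:
  backups:
    pgbackrest:
      global:
        log-timestamp: "n"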
pgBouncer
When OpenTelemetryLogs is enabled for a PostgresCluster with pgBouncer, CPK sets the logfile:
logfile = "/tmp/pgbouncer.log"
Note that pgBouncer logs to a file in the ephemeral /tmp directory, so any restart of the pgBouncer pod will wipe out previous logs.
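If you need to inspect the raw file while debugging, a sketch assuming the pgBouncer container is named pgbouncer and the pod carries the role=pgbouncer label might look like:

# Assumption: the pgBouncer pod is labeled role=pgbouncer and its
# container is named "pgbouncer". Remember /tmp is ephemeral, so this
# file only covers the current pod lifetime.
PGBOUNCER_POD=$(kubectl get pod -n postgres-operator -o name -l postgres-operator.crunchydata.com/cluster=hippo,postgres-operator.crunchydata.com/role=pgbouncer)
kubectl -n postgres-operator exec "${PGBOUNCER_POD}" -c pgbouncer -- tail /tmp/pgbouncer.log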
pgAdmin4
When OpenTelemetryLogs is enabled for a PGAdmin custom resource with a spec.instrumentation block, CPK makes the following adjustments to the pgAdmin configuration:
DATA_DIR = '/var/lib/pgadmin'
LOG_FILE = '/var/lib/pgadmin/logs/pgadmin.log'
JSON_LOGGER = True
CONSOLE_LOG_LEVEL = logging.WARNING
FILE_LOG_LEVEL = logging.INFO
FILE_LOG_FORMAT_JSON = {'time': 'created', 'name': 'name', 'level': 'levelname', 'message': 'message'}
CPK also makes changes to the Gunicorn settings to ensure that it logs to a file at /var/lib/pgadmin/logs/gunicorn.log and, like pgAdmin above, logs in JSON format.
NOTE: Only pgAdmins that are deployed using the PGAdmin custom resource can use the OpenTelemetry features. Logs from pgAdmins that are deployed via the PostgresCluster's spec.userInterface configuration will not be collected.
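For reference, a minimal sketch of a PGAdmin custom resource with logging enabled might look like the following; the name and storage request are placeholders, and the required fields may differ by CPK version:

apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PGAdmin
metadata:
  name: otel-pgadmin          # placeholder name
  namespace: postgres-operator
spec:
  instrumentation: {}         # enables the collector sidecar and file-based logging
  dataVolumeClaimSpec:        # assumed storage for pgAdmin data; adjust to your environment
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 1Gi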
Log Rotation & Retention
When OpenTelemetry logging is enabled for a PostgresCluster or PGAdmin, CPK will ensure that the components log to file. To ensure that these log files don't become unmanageable, CPK also manages log rotation and retention where possible.
To configure log retention for your PostgresCluster or PGAdmin, fill in the spec.instrumentation.logs.retentionPeriod field on your spec:
spec:
  instrumentation:
    logs:
      retentionPeriod: 2d
This retentionPeriod field can be an RFC 3339 duration or a number and unit; the minimum unit is an hour, and the maximum unit is a week.
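For example, a twelve-hour retention period expressed as an RFC 3339-style duration (shown as an illustration; check the CRD validation in your CPK version if a particular value is rejected) would look like:

spec:
  instrumentation:
    logs:
      retentionPeriod: PT12H   # illustrative RFC 3339-style duration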
The different components of a PostgresCluster or PGAdmin manage their rotation differently, so this setting is approximate. CPK will always retain at least the specified amount, but sometimes more will be retained.
NOTE: Patroni logs are not rotated by age, but by size. This can be set independently in the spec.patroni.logging.storageLimit field. If that field is left blank, CPK will default to 25M, which is the minimum limit for Patroni log storage. See our guide to customizing Postgres instance logs for more detail on this field.
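For instance, raising the Patroni log storage limit above the default could be sketched as follows (the 100M value is purely illustrative):

spec:
  patroni:
    logging:
      storageLimit: 100M   # illustrative; any value at or above the 25M minimum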
Configuring Exporters
When you first turn on OpenTelemetry logging in CPK with no additional configuration, the logs that are collected are sent to the Debug Exporter, which outputs the logs to the console. Since the collector is running in a sidecar container in a Kubernetes Pod, that console output is added to the container logs, which you can retrieve with the kubectl logs command. If you were running a logging-enabled PostgresCluster named hippo in the postgres-operator namespace and wanted to see your postgres, patroni, and pgbackrest logs from the primary instance pod, the commands to retrieve those logs would look like this:
PG_CLUSTER_PRIMARY_POD=$(kubectl get pod -n postgres-operator -o name -l postgres-operator.crunchydata.com/cluster=hippo,postgres-operator.crunchydata.com/role=master)
kubectl -n postgres-operator logs "${PG_CLUSTER_PRIMARY_POD}" -c collector
However, this output is not easy to read, organize, filter, or search. You will therefore almost certainly want to export your logs to a dedicated backend or service where you can search and read through them more easily.
Luckily, the OpenTelemetry Collector that we use has a plethora of exporters built into it that should satisfy most needs.
To use an exporter, you define it in the instrumentation.config.exporters section. Fields in this section should follow the type[/name] "component identifier" format, where the type is the exporter you want to use and name is the optional name you want to give this configuration. The optional name allows you to define multiple configurations of the same exporter type. For example, you could have two configurations of the otlp exporter where one is called otlp and the second is called otlp/2. The value for each field is the configuration for that exporter. For example:
spec:
  instrumentation:
    config:
      exporters:
        otlp:
          endpoint: a-standalone-collector:4317
          tls:
            insecure: true
        otlp/2:
          endpoint: another-collector:4317
The configuration you define will differ depending on the exporter you are using. Please follow the documentation for your chosen exporter to determine what configuration to provide. OpenTelemetry keeps a list of exporters that are specific to the "contrib" collector, along with their documentation. There are also exporters whose documentation lives in the base collector repo but that are likewise available in the "contrib" collector that we use.
Once the exporter is configured, the last step is to tell the collector to use it in the logging pipeline by adding the exporter's name to the instrumentation.logs.exporters array. For example:
spec:
  instrumentation:
    config:
      exporters:
        otlp/1:
          endpoint: a-standalone-collector:4317
          tls:
            insecure: true
    logs:
      exporters: ['otlp/1']
You will find more examples of exporter configurations for commonly used logging backends in the Example Exporter Configurations section below.
Files
Some exporters might require that configuration be provided via files, such as separate config files, certificates, etc. This can be done via the instrumentation.config.files section, which allows you to project files that are in Kubernetes Secrets, ConfigMaps, etc., into the volume that is mounted into the collector container. For example, creating a Secret with the following command:
kubectl -n postgres-operator create secret tls some-otel-exporter-certs --cert=server.crt --key=server.key
And then adding the following to your instrumentation spec:
spec:
  instrumentation:
    config:
      files:
      - secret:
          name: some-otel-exporter-certs
This will result in the server.crt and server.key files being mounted in the /etc/otel-collector directory of the collector container.
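Those mounted files can then be referenced by path from an exporter's configuration. As a sketch, assuming the upstream otlp exporter's TLS settings (cert_file and key_file) and a placeholder endpoint, the projected certificate and key might be wired in like this:

spec:
  instrumentation:
    config:
      exporters:
        otlp:
          endpoint: a-tls-collector:4317   # placeholder endpoint
          tls:
            # Paths point at the files projected from the Secret above.
            cert_file: /etc/otel-collector/server.crt
            key_file: /etc/otel-collector/server.key
    logs:
      exporters: ['otlp']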
Batch Size
In between the collection of logs via the receiver components and the exporting of the logs via the exporter components, OpenTelemetry allows for transformation of the data via "processor" components. One of the processors that we use in our implementation is the Batch Processor, which compresses the data and reduces the number of network connections needed to export the data.
The size of the batches and how often they are sent is determined by three different settings:
- maxDelay - The maximum time to wait before exporting a batch, regardless of the batch's size. Higher numbers allow more records to be deduplicated and compressed before export.
- maxRecords - The maximum number of records to include in an exported batch. When present, batches this size are sent without any further delay.
- minRecords - The number of records to wait for before exporting a batch. Higher numbers allow more records to be deduplicated and compressed before export.
By default, maxRecords is not set and the other two settings are configured as follows:
- maxDelay = 200ms
- minRecords = 8192
You can configure these settings to your liking via the spec.instrumentation.logs.batches section. For example:
spec:
  instrumentation:
    logs:
      batches:
        maxDelay: 1s
        maxRecords: 16384
        minRecords: 8192
If you wish to turn batching off entirely, you must set both maxDelay and minRecords to zero:
spec:
  instrumentation:
    logs:
      batches:
        maxDelay: 0s
        minRecords: 0
Resource Detection
Another processor that we incorporate into our logs pipelines is the Resource Detection Processor, which can detect resource information from the host and add it as metadata to log records. The full list of supported detectors can be found in the processor's documentation.
You can configure one or more detectors via the spec.instrumentation.config.detectors array, where each entry has a name field that indicates which detector to use, and an optional attributes field, where you can specify particular attributes that you wish to turn on or off. For example, if you were running CPK in an Azure Kubernetes Service cluster, you might configure this section like so:
spec:
  instrumentation:
    config:
      detectors:
      - name: aks
        attributes:
          k8s.cluster.name: true
See the Resource Detection Processor documentation for more details on the various detectors and their particular attributes.
Example Exporter Configurations
This section provides example configurations for a variety of different OpenTelemetry-compatible logging services and backends.
Google Cloud
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: otel-hippo
  namespace: postgres-operator
spec:
  instrumentation:
    config:
      detectors:
      - name: gcp
      exporters:
        # https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/googlecloudexporter#configuration-reference
        googlecloud:
          log:
            default_log_name: "collector-exported-log"
            resource_filters:
            - prefix: "k8s"
            - prefix: "db"
    logs:
      exporters: ['googlecloud']
OTLP
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: otel-hippo
  namespace: postgres-operator
spec:
  instrumentation:
    config:
      exporters:
        # https://github.com/open-telemetry/opentelemetry-collector/tree/main/exporter/otlpexporter#getting-started
        otlp: # for exporting to another collector
          endpoint: "otel-collector:4317"
          tls:
            insecure: true
    logs:
      exporters: ['otlp']