Failover in the PostgreSQL Operator

There are a number of potential events that could cause a primary database instance or cluster to become unavailable during the course of normal operations, including:

A database storage (disk) failure or any other hardware failure
The network on which the database resides becomes unreachable
The host operating system becomes unstable and crashes
A key database file becomes corrupted
Total loss of a data center

There may also be downtime events that are part of normal operations, such as performing a minor upgrade, security patching of the operating system, a hardware upgrade, or other maintenance.

To enable rapid recovery when the primary PostgreSQL instance within a PostgreSQL cluster becomes unavailable, the PostgreSQL Operator supports both manual and automatic failover within a single Kubernetes cluster.

PostgreSQL Cluster Architecture

In a failover, a replica PostgreSQL instance within a PostgreSQL cluster takes over from the primary PostgreSQL instance.

Manual Failover

Manual failover is performed through PostgreSQL Operator API actions in two steps: the user first queries the cluster for its available failover targets, then specifies which of those replicas should become the new primary.
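The query-then-target flow can be illustrated with a minimal in-memory sketch. This is not the PostgreSQL Operator API: the `Instance` and `Cluster` types, the function names, the rule that only ready replicas are valid targets, and the "former-primary" role given to the old primary are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    name: str
    role: str            # "primary" or "replica" (hypothetical model)
    ready: bool = True


@dataclass
class Cluster:
    name: str
    instances: list = field(default_factory=list)


def query_failover_targets(cluster):
    """Step 1: list the replicas that could be promoted (assumed: ready ones only)."""
    return [i.name for i in cluster.instances if i.role == "replica" and i.ready]


def manual_failover(cluster, target):
    """Step 2: promote the replica the user picked from the query result."""
    if target not in query_failover_targets(cluster):
        raise ValueError(f"{target!r} is not a valid failover target")
    for inst in cluster.instances:
        if inst.role == "primary":
            inst.role = "former-primary"   # assumption: old primary stops serving
        elif inst.name == target:
            inst.role = "primary"
    return target


cluster = Cluster("mycluster", [
    Instance("mycluster", "primary"),
    Instance("mycluster-replica-a", "replica"),
    Instance("mycluster-replica-b", "replica", ready=False),
])
targets = query_failover_targets(cluster)        # the query step
promoted = manual_failover(cluster, targets[0])  # the target step
roles = {i.name: i.role for i in cluster.instances}
```

Validating the target against a fresh query result keeps a stale or mistyped replica name from being promoted.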

Automatic Failover

Automatic failover is performed by the PostgreSQL Operator, which evaluates the readiness of the primary and triggers a failover when the primary is found to be not ready. Automatic failover can be enabled globally for all clusters or only for specific clusters. If desired, users can configure the PostgreSQL Operator to replace a failed primary PostgreSQL instance with a new PostgreSQL replica.
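As a rough sketch of how a global setting, a per-cluster setting, and a readiness evaluation could combine into a failover decision: the names `GLOBAL_AUTOFAIL` and `CLUSTER_OVERRIDES`, the cluster name, and the rule of requiring several consecutive failed readiness checks are assumptions for illustration, not the PostgreSQL Operator's actual configuration keys or trigger logic.

```python
GLOBAL_AUTOFAIL = False                   # hypothetical operator-wide default
CLUSTER_OVERRIDES = {"orders-db": True}   # hypothetical per-cluster settings


def autofail_enabled(cluster_name):
    """A per-cluster setting, when present, wins over the global default."""
    return CLUSTER_OVERRIDES.get(cluster_name, GLOBAL_AUTOFAIL)


def should_fail_over(cluster_name, readiness_history, max_not_ready=3):
    """Decide whether to fail over, given the primary's readiness check results.

    readiness_history is a list of booleans, oldest first. The threshold of
    `max_not_ready` consecutive failures is an assumed debounce rule.
    """
    if not autofail_enabled(cluster_name):
        return False
    recent = readiness_history[-max_not_ready:]
    return len(recent) == max_not_ready and not any(recent)


decision = should_fail_over("orders-db", [True, False, False, False])
```

Requiring consecutive failures rather than a single failed check avoids promoting a replica because of a brief, self-healing interruption.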

The PostgreSQL Operator automatic failover logic includes:

deleting the failed primary Deployment
picking the best replica to become the new primary
changing the label of the targeted replica to match the primary Service
executing the PostgreSQL promote command on the targeted replica
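The four steps above can be sketched as a small simulation. It does not call Kubernetes or PostgreSQL; the `Pod` type, the `service-name` label key, the `-replica` Service suffix, and the choice of "best" as the ready replica with the highest WAL position are all assumptions for illustration, since the criteria the Operator applies are not spelled out here.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    labels: dict
    ready: bool
    wal_position: int          # assumed measure of how much WAL was replayed
    in_recovery: bool = True   # True while the instance runs as a replica


def auto_failover(cluster_name, pods):
    """Run the failover steps on `pods` (name -> Pod); return the new primary."""
    primary_service = cluster_name
    replica_service = cluster_name + "-replica"   # hypothetical naming

    # 1. Delete the failed primary Deployment.
    failed = [n for n, p in pods.items()
              if p.labels["service-name"] == primary_service]
    for name in failed:
        del pods[name]

    # 2. Pick the best replica (assumption: ready and most WAL replayed).
    candidates = [p for p in pods.values()
                  if p.ready and p.labels["service-name"] == replica_service]
    if not candidates:
        raise RuntimeError("no ready replica available for promotion")
    target = max(candidates, key=lambda p: p.wal_position)

    # 3. Change the target's label so the primary Service selects it.
    target.labels["service-name"] = primary_service

    # 4. Promote: the instance leaves recovery and accepts writes.
    target.in_recovery = False
    return target.name


pods = {
    "db": Pod("db", {"service-name": "db"}, ready=False, wal_position=300,
              in_recovery=False),
    "db-a": Pod("db-a", {"service-name": "db-replica"}, True, 100),
    "db-b": Pod("db-b", {"service-name": "db-replica"}, True, 250),
    "db-c": Pod("db-c", {"service-name": "db-replica"}, False, 900),
}
new_primary = auto_failover("db", pods)
```

In this sketch the label change comes before promotion, following the order of the list above, so the primary Service already points at the target when it starts accepting writes.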