Failover and Failback

5.9.1. Failover and Failback Settings

failover_command ( string )

Specifies a user command to run when a PostgreSQL backend node gets detached. Pgpool-II replaces the following special characters with the backend-specific information before executing the command.

Table 5-6. failover command options

Special character    Description
%d                   DB node ID of the detached node
%h                   Hostname of the detached node
%p                   Port number of the detached node
%D                   Database cluster directory of the detached node
%M                   Old master node ID
%m                   New master node ID
%H                   Hostname of the new master node
%P                   Old primary node ID
%r                   Port number of the new master node
%R                   Database cluster directory of the new master node
%%                   '%' character

Note: The "master node" refers to the node that has the youngest (smallest) node ID among the live database nodes. In streaming replication mode, this may be different from the primary node. In Table 5-6, %m is the new master node chosen by Pgpool-II, that is, the live node with the youngest (smallest) node ID. For example, suppose you have three nodes (0, 1 and 2), node 1 is the primary, and all of them are healthy (no node is down). If node 1 fails, failover_command is called with %m = 0.
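The selection rule in the note above can be pictured with a short shell sketch. The function name new_master_id is hypothetical, and printing -1 when no node is alive is an assumption made for this example only, not documented Pgpool-II behavior.

```shell
#!/bin/sh
# Hypothetical helper, not part of Pgpool-II.
# Arguments: the IDs of all nodes that are still alive.
# Prints the smallest ID, i.e. the node that would become the new master.
new_master_id() {
    min=""
    for id in "$@"; do
        if [ -z "$min" ] || [ "$id" -lt "$min" ]; then
            min=$id
        fi
    done
    # Assumption for this sketch: print -1 when no live node was given.
    echo "${min:--1}"
}

# Nodes 0, 1 and 2 were healthy and node 1 (the primary) failed,
# so nodes 0 and 2 remain alive.
new_master_id 0 2   # prints 0
```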

Note: When a failover is performed, Pgpool-II basically kills all its child processes, which in turn terminates all active sessions to Pgpool-II. Pgpool-II then invokes failover_command and, once the command completes, starts new child processes so that it is ready to accept client connections again.

However, from Pgpool-II 3.6 onward, in streaming replication mode, client sessions are no longer disconnected when a failover occurs, provided the session does not use the failed standby server. If the primary server goes down, all sessions are still disconnected. A health check timeout also disconnects all sessions. Other health check errors, including the case where the retry count is exceeded, do not trigger a full session disconnection.

Note: You can run psql (or any other command) against a backend to retrieve information in the script, but you cannot run psql against Pgpool-II itself, because the script is called from Pgpool-II and must run while Pgpool-II is working on the failover.

This parameter can be changed by reloading the Pgpool-II configurations.
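As an illustration of how a failover script might use these placeholders, here is a minimal sketch. The script path, the function name failover_action, the example hostname and the choice of pg_ctl promote over ssh are all assumptions. The sketch only prints the command it would run, so it has no side effects.

```shell
#!/bin/sh
# Hypothetical registration in pgpool.conf (the script path is an assumption):
#   failover_command = '/etc/pgpool-II/failover.sh %d %P %H %R'

# Print the action a failover script might take.
#   $1 = %d  ID of the detached node
#   $2 = %P  old primary node ID
#   $3 = %H  hostname of the new master node
#   $4 = %R  database cluster directory of the new master node
failover_action() {
    failed_id=$1; old_primary_id=$2; new_host=$3; new_pgdata=$4
    if [ "$failed_id" != "$old_primary_id" ]; then
        # A standby was detached, so there is nothing to promote.
        echo "noop"
    else
        # The primary was detached: promote the new master.
        # Assumes passwordless ssh to the new master host.
        echo "ssh $new_host pg_ctl -D $new_pgdata promote"
    fi
}

failover_action 1 1 node0.example.com /var/lib/pgsql/data
```

A real script would execute the command instead of echoing it, and would typically log the outcome.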

failback_command ( string )

Specifies a user command to run when a PostgreSQL backend node gets attached to Pgpool-II. Pgpool-II replaces the following special characters with the backend-specific information before executing the command.

Table 5-7. failback command options

Special character    Description
%d                   DB node ID of the attached node
%h                   Hostname of the attached node
%p                   Port number of the attached node
%D                   Database cluster directory of the attached node
%M                   Old master node ID
%m                   New master node ID
%H                   Hostname of the new master node
%P                   Old primary node ID
%r                   Port number of the new master node
%R                   Database cluster directory of the new master node
%%                   '%' character

Note: You can run psql (or any other command) against a backend to retrieve information in the script, but you cannot run psql against Pgpool-II itself, because the script is called from Pgpool-II and must run while Pgpool-II is working on the failover.

This parameter can be changed by reloading the Pgpool-II configurations.
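The placeholder substitution described for failback_command can be pictured with the following sketch. It is a hypothetical re-implementation for illustration, not Pgpool-II's own code. It handles only %d, %h, %p and %%, and the script path and hostname are made up.

```shell
#!/bin/sh
# Hypothetical re-implementation of the placeholder expansion.
#   $1 = command template
#   $2 = node ID (%d), $3 = hostname (%h), $4 = port number (%p)
expand_placeholders() {
    rest=$1; out=""
    while [ -n "$rest" ]; do
        case $rest in
            %d*) out="$out$2"; rest=${rest#"%d"} ;;
            %h*) out="$out$3"; rest=${rest#"%h"} ;;
            %p*) out="$out$4"; rest=${rest#"%p"} ;;
            %%*) out="$out%";  rest=${rest#"%%"} ;;
            *)   after=${rest#?}              # everything after the first character
                 out="$out${rest%"$after"}"   # copy just that first character
                 rest=$after ;;
        esac
    done
    printf '%s\n' "$out"
}

expand_placeholders '/path/to/failback.sh %d %h %p 100%%' 1 db1.example.com 5432
# prints: /path/to/failback.sh 1 db1.example.com 5432 100%
```

Scanning left to right, rather than running one global replacement per placeholder, keeps an escaped %% from being mistaken for the start of another placeholder.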

follow_master_command ( string )

Specifies a user command to run after a failover of the primary node. This works only in Master Slave mode with streaming replication. Pgpool-II replaces the following special characters with the backend-specific information before executing the command.

Table 5-8. follow master command options

Special character    Description
%d                   DB node ID of the detached node
%h                   Hostname of the detached node
%p                   Port number of the detached node
%D                   Database cluster directory of the detached node
%M                   Old master node ID
%m                   New primary node ID
%H                   Hostname of the new primary node
%P                   Old primary node ID
%r                   Port number of the new primary node
%R                   Database cluster directory of the new primary node
%%                   '%' character

Note: If follow_master_command is not empty, then after a failover of the primary node completes in Master Slave mode with streaming replication, Pgpool-II degenerates all nodes except the new primary and starts new child processes so that it is ready to accept client connections again. After this, Pgpool-II executes the command configured in follow_master_command for each degenerated backend node.

Typically, follow_master_command is used to recover the slave from the new primary by calling the pcp_recovery_node command.

This parameter can be changed by reloading the Pgpool-II configurations.
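A minimal sketch of such a script follows. The script path, the function name follow_master_action and the PCP host, port and user values are assumptions for illustration. The sketch prints the pcp_recovery_node invocation instead of running it.

```shell
#!/bin/sh
# Hypothetical registration in pgpool.conf (the script path is an assumption):
#   follow_master_command = '/etc/pgpool-II/follow_master.sh %d'

# Assumed PCP connection settings for this sketch.
PCP_HOST=localhost
PCP_PORT=9898
PCP_USER=pgpool

# Print the pcp_recovery_node call for one degenerated node.
#   $1 = %d  ID of the node to recover
follow_master_action() {
    # -w: never prompt for a password (relies on a password file entry).
    echo "pcp_recovery_node -h $PCP_HOST -p $PCP_PORT -U $PCP_USER -w -n $1"
}

follow_master_action 2
```

Because Pgpool-II runs the command once per degenerated node, a script like this only needs to handle a single node ID per invocation.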

failover_on_backend_error ( boolean )

When set to on, Pgpool-II treats read/write errors on a PostgreSQL backend connection as a backend node failure and triggers a failover of that node after disconnecting the current session. When set to off, Pgpool-II only reports an error and disconnects the session when such errors occur.

Note: It is recommended to turn on backend health checking (see Section 5.8) when failover_on_backend_error is set to off. Note, however, that Pgpool-II still triggers a failover when it detects an administrative shutdown of a PostgreSQL backend server. If you want to avoid a failover even in this case, specify DISALLOW_TO_FAILOVER in backend_flag.

This parameter can be changed by reloading the Pgpool-II configurations.

Note: Prior to Pgpool-II V4.0, this configuration parameter was named fail_over_on_backend_error.
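The interplay between failover_on_backend_error, an administrative shutdown and backend_flag can be summarized as a small decision helper. This is a hypothetical sketch of the rules as described above, not Pgpool-II code. The event names io_error and admin_shutdown are invented for the example, and the sketch assumes DISALLOW_TO_FAILOVER suppresses failover in every case.

```shell
#!/bin/sh
# Hypothetical decision helper, not Pgpool-II code.
#   $1 = failover_on_backend_error setting: on | off
#   $2 = event: io_error (read/write error) | admin_shutdown
#   $3 = backend_flag of the node: ALLOW_TO_FAILOVER | DISALLOW_TO_FAILOVER
backend_error_outcome() {
    if [ "$3" = "DISALLOW_TO_FAILOVER" ]; then
        # Assumption: the flag suppresses failover in every case.
        echo "no failover"
    elif [ "$2" = "admin_shutdown" ]; then
        # An administrative shutdown triggers failover regardless of the setting.
        echo "failover"
    elif [ "$1" = "on" ]; then
        echo "failover"
    else
        # Only an error is reported and the session is disconnected.
        echo "no failover"
    fi
}

backend_error_outcome off admin_shutdown ALLOW_TO_FAILOVER   # prints failover
```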

search_primary_node_timeout ( integer )

Specifies the maximum amount of time, in seconds, to search for the primary node when a failover scenario occurs. Pgpool-II gives up looking for the primary node if it is not found within this time. The default is 300. Setting this parameter to 0 means keep trying forever.

This parameter is only applicable in the streaming replication mode.

This parameter can be changed by reloading the Pgpool-II configurations.
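The timeout semantics, including the special meaning of 0, can be sketched as a retry loop. This is an illustration only: the function name search_primary, the probe command passed as an argument and the one-second retry interval are assumptions, not Pgpool-II internals.

```shell
#!/bin/sh
# Hypothetical sketch of the search loop, not Pgpool-II code.
#   $1    = timeout in seconds (0 means retry forever)
#   $2... = probe command; exit status 0 means "primary found"
search_primary() {
    timeout=$1; shift
    start=$(date +%s)
    while :; do
        if "$@"; then
            echo "found"
            return 0
        fi
        now=$(date +%s)
        # With timeout 0 this check never fires, so the loop keeps trying.
        if [ "$timeout" -gt 0 ] && [ $((now - start)) -ge "$timeout" ]; then
            echo "gave up"
            return 1
        fi
        sleep 1   # assumed retry interval
    done
}

# Stand-in probe for the example: "true" always succeeds.
search_primary 300 true
```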

detach_false_primary ( boolean )

If set to on, Pgpool-II detaches false primary nodes. The default is off. This parameter is only valid in streaming replication mode and with PostgreSQL 9.6 or later, because the feature uses pg_stat_wal_receiver. If PostgreSQL 9.5.x or older is used, no error is raised; the feature is simply ignored.

If there's no primary node, no checking will be performed.

If there's no standby node, and there's only one primary node, no checking will be performed.

If there is no standby node and there are multiple primary nodes, the primary node with the youngest node ID is kept and the rest of the primary nodes are detached.

If there are one or more primaries and one or more standbys, the connectivity between primary and standby nodes is checked by using pg_stat_wal_receiver (PostgreSQL 9.6 or later). In this case, if a primary node connects to all standby nodes, that primary is regarded as the "true" primary. Other primaries are regarded as "false" primaries, and they are detached if detach_false_primary is on. If no "true" primary is found, nothing happens.
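The four rules above can be condensed into a short sketch. It is a hypothetical illustration rather than Pgpool-II's implementation. The function name false_primaries and the "ID:N" input format are made up, with N standing for the number of standbys found connected to that primary.

```shell
#!/bin/sh
# Hypothetical sketch of the decision rules, not Pgpool-II code.
#   $1    = number of standby nodes
#   $2... = one "ID:N" pair per primary, where N is the number of
#           standbys connected to that primary
# Prints the IDs of the primaries to detach, separated by spaces.
false_primaries() {
    standbys=$1; shift
    keep=""; true_found=no; detach=""
    if [ "$standbys" -eq 0 ]; then
        # No standbys: keep the primary with the smallest ID, detach the rest.
        for p in "$@"; do
            id=${p%%:*}
            if [ -z "$keep" ] || [ "$id" -lt "$keep" ]; then keep=$id; fi
        done
        for p in "$@"; do
            id=${p%%:*}
            [ "$id" = "$keep" ] || detach="$detach $id"
        done
    else
        # A primary connected to every standby is "true"; the others are "false".
        for p in "$@"; do
            if [ "${p##*:}" -eq "$standbys" ]; then
                true_found=yes
            else
                detach="$detach ${p%%:*}"
            fi
        done
        # Without a true primary, nothing is detached.
        [ "$true_found" = yes ] || detach=""
    fi
    echo $detach
}

# Two standbys; primary 0 is connected to both, primary 1 to none.
false_primaries 2 0:2 1:0   # prints 1
```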

When Pgpool-II starts, the check for false primaries is performed only once in the Pgpool-II main process. If sr_check_period is greater than 0, the false primary check is performed at the same time as the streaming replication delay check.

This parameter is only applicable in the streaming replication mode.

This parameter can be changed by reloading the Pgpool-II configurations.

Figure 5-1. Detecting false primaries

5.9.2. Failover in the raw Mode

Failover can be performed in raw mode if multiple backend servers are defined. During normal operation, Pgpool-II usually accesses the backend specified by backend_hostname0. If that backend fails for some reason, Pgpool-II tries to access the backend specified by backend_hostname1. If that fails too, Pgpool-II tries backend_hostname2, backend_hostname3, and so on.
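The ordered fallback described above amounts to picking the first reachable backend in configuration order. The following sketch shows the idea. The function names and hostnames are hypothetical, and the probe is a stand-in for whatever connectivity check is actually used.

```shell
#!/bin/sh
# Hypothetical sketch, not Pgpool-II code.
#   $1    = probe command name; called with a hostname, exit status 0 = reachable
#   $2... = backend hostnames in order: backend_hostname0, backend_hostname1, ...
first_available_backend() {
    probe=$1; shift
    for host in "$@"; do
        if "$probe" "$host"; then
            echo "$host"
            return 0
        fi
    done
    # No backend was reachable.
    return 1
}

# Stand-in probe for the example: pretend only db1 is up.
fake_probe() { [ "$1" = "db1" ]; }

first_available_backend fake_probe db0 db1 db2   # prints db1
```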