Integrating external lifecheck with watchdog
pgpool-II 3.7.1 Documentation
The Pgpool-II watchdog process uses BSD sockets to communicate with all Pgpool-II processes. Any third-party system can use the same BSD socket to provide the lifecheck function for local and remote Pgpool-II watchdog nodes. The socket file name for IPC is constructed by appending the Pgpool-II wd_port to the string "s.PGPOOLWD_CMD.", and the socket file is placed in the wd_ipc_socket_dir directory.
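The naming rule above can be sketched as a small helper. The function name `watchdog_ipc_socket_path` and the example values `/tmp` and `9000` are illustrative assumptions, not part of Pgpool-II; substitute the wd_ipc_socket_dir and wd_port values from your own configuration.

```python
import posixpath


def watchdog_ipc_socket_path(wd_ipc_socket_dir: str, wd_port: int) -> str:
    """Build <wd_ipc_socket_dir>/s.PGPOOLWD_CMD.<wd_port> (hypothetical helper)."""
    return posixpath.join(wd_ipc_socket_dir, f"s.PGPOOLWD_CMD.{wd_port}")


# Example values only; read the real ones from pgpool.conf.
print(watchdog_ipc_socket_path("/tmp", 9000))
```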
2.2.1. Watchdog IPC command packet format
The watchdog IPC command packet consists of three fields. The table below describes the message fields.
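A minimal sketch of framing a command packet, assuming the three fields are a one-byte type, a four-byte data length in network byte order, and the JSON data. That layout and the helper name `encode_ipc_packet` are assumptions made for illustration; check them against the field table before relying on them.

```python
import json
import struct
from typing import Optional


def encode_ipc_packet(packet_type: str, data: Optional[dict] = None) -> bytes:
    """Frame one command packet (hypothetical helper).

    ASSUMED layout: type (1 byte) | length (4 bytes, big-endian) | JSON data.
    """
    payload = json.dumps(data).encode("utf-8") if data else b""
    return packet_type.encode("ascii") + struct.pack("!I", len(payload)) + payload


# A command with no data carries a zero length and no payload bytes.
print(encode_ipc_packet("3"))
```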
2.2.2. Watchdog IPC result packet format
The watchdog IPC command result packet consists of three fields. The table below describes the message fields.
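Reading a result packet could look like the sketch below. It assumes the result packet is a one-byte type, followed by a four-byte data length in network byte order, followed by JSON data; this layout and the name `decode_ipc_result` are assumptions, not a confirmed description of the wire format.

```python
import json
import struct


def decode_ipc_result(raw: bytes):
    """Split raw result bytes into (type, parsed JSON or None). Hypothetical helper.

    ASSUMED layout: type (1 byte) | length (4 bytes, big-endian) | JSON data.
    """
    if len(raw) < 5:
        raise ValueError("result packet shorter than the assumed 5-byte header")
    result_type = chr(raw[0])
    (length,) = struct.unpack("!I", raw[1:5])
    body = raw[5:5 + length]
    if len(body) != length:
        raise ValueError("result packet data is truncated")
    return result_type, (json.loads(body) if body else None)


print(decode_ipc_result(b"9\x00\x00\x00\x00"))
```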
2.2.3. Watchdog IPC command packet types
The first byte of an IPC command packet sent to the watchdog process, and of the result packet returned by the watchdog process, identifies the command or command result type. The table below lists all valid types and their meanings.
Table 2-3. Watchdog IPC command packet types
Name | Byte Value | Type | Description |
---|---|---|---|
REGISTER FOR NOTIFICATIONS | '0' | Command packet | Command to register the current connection to receive watchdog notifications |
NODE STATUS CHANGE | '2' | Command packet | Command to inform watchdog about a status change of a watchdog node |
GET NODES LIST | '3' | Command packet | Command to get the list of all configured watchdog nodes |
NODES LIST DATA | '4' | Result packet | The JSON data in packet contains the list of all configured watchdog nodes |
CLUSTER IN TRANSITION | '7' | Result packet | Watchdog returns this packet type when it is not possible to process the command because the cluster is transitioning. |
RESULT BAD | '8' | Result packet | Watchdog returns this packet type when the IPC command fails |
RESULT OK | '9' | Result packet | Watchdog returns this packet type when IPC command succeeds |
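For a client written in Python, the byte values in Table 2-3 can be kept in two lookup tables, as in this sketch. The dictionary and function names are hypothetical; only the names and byte values come from the table.

```python
# Byte values copied from Table 2-3; the Python names are illustrative.
COMMAND_TYPES = {
    "REGISTER FOR NOTIFICATIONS": b"0",
    "NODE STATUS CHANGE": b"2",
    "GET NODES LIST": b"3",
}
RESULT_TYPES = {
    b"4": "NODES LIST DATA",
    b"7": "CLUSTER IN TRANSITION",
    b"8": "RESULT BAD",
    b"9": "RESULT OK",
}


def describe_result(type_byte: bytes) -> str:
    """Map the first byte of a result packet to its name in Table 2-3."""
    return RESULT_TYPES.get(type_byte, "UNKNOWN")


print(describe_result(b"7"))
```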
2.2.4. External lifecheck IPC packets and data
The "GET NODES LIST", "NODES LIST DATA", and "NODE STATUS CHANGE" IPC messages of watchdog can be used to integrate an external lifecheck system. Note that the built-in lifecheck of Pgpool-II uses the same channel and technique.
2.2.4.1. Getting list of configured watchdog nodes
Any third-party lifecheck system can send the "GET NODES LIST" packet on the watchdog IPC socket to get the "NODES LIST DATA" result packet. If wd_authkey is set, the packet data must be JSON containing the authorization key and value; if wd_authkey is not configured, the packet data is empty.
The result packet returned by watchdog for "GET NODES LIST" contains, in JSON format, the list of all configured watchdog nodes on which to perform health checks. The JSON contains the "WatchdogNodes" array of all watchdog nodes. Each node entry contains the "ID", "NodeName", "HostName", "DelegateIP", "WdPort", and "PgpoolPort" of that node.
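The data for a "GET NODES LIST" packet might be built as follows. The JSON key "IPCAuthKey" is taken from the "NODE STATUS CHANGE" example in this section; that the same key applies to "GET NODES LIST" is an assumption, as is the helper name `get_nodes_list_data`.

```python
import json
from typing import Optional


def get_nodes_list_data(wd_authkey: Optional[str]) -> bytes:
    """Return the data for a GET NODES LIST packet (hypothetical helper)."""
    if not wd_authkey:
        # wd_authkey is not configured: the packet data is empty.
        return b""
    # ASSUMPTION: the key name matches the NODE STATUS CHANGE example.
    return json.dumps({"IPCAuthKey": wd_authkey}).encode("utf-8")


print(get_nodes_list_data(None))
print(get_nodes_list_data("example-key"))
```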
-- The example JSON data contained in "NODES LIST DATA"

    {
      "NodeCount":3,
      "WatchdogNodes":
      [
        {
          "ID":0,
          "State":1,
          "NodeName":"Linux_ubuntu_9999",
          "HostName":"watchdog-host1",
          "DelegateIP":"172.16.5.133",
          "WdPort":9000,
          "PgpoolPort":9999
        },
        {
          "ID":1,
          "State":1,
          "NodeName":"Linux_ubuntu_9991",
          "HostName":"watchdog-host2",
          "DelegateIP":"172.16.5.133",
          "WdPort":9000,
          "PgpoolPort":9991
        },
        {
          "ID":2,
          "State":1,
          "NodeName":"Linux_ubuntu_9992",
          "HostName":"watchdog-host3",
          "DelegateIP":"172.16.5.133",
          "WdPort":9000,
          "PgpoolPort":9992
        }
      ]
    }

-- Note that ID 0 is always reserved for the local watchdog node
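An external lifecheck could turn this JSON into a list of health check targets along these lines. The function name `health_check_targets` and the returned tuple shape are illustrative choices; the sample string is a shortened form of the example data.

```python
import json


def health_check_targets(nodes_list_json: str) -> list:
    """Return (ID, HostName, PgpoolPort, is_local) for each watchdog node."""
    doc = json.loads(nodes_list_json)
    return [
        # ID 0 is always reserved for the local watchdog node.
        (node["ID"], node["HostName"], node["PgpoolPort"], node["ID"] == 0)
        for node in doc["WatchdogNodes"]
    ]


sample = """
{"NodeCount": 3, "WatchdogNodes": [
  {"ID": 0, "HostName": "watchdog-host1", "PgpoolPort": 9999},
  {"ID": 1, "HostName": "watchdog-host2", "PgpoolPort": 9991},
  {"ID": 2, "HostName": "watchdog-host3", "PgpoolPort": 9992}
]}
"""
for target in health_check_targets(sample):
    print(target)
```

Keeping the node ID alongside each target matters because the same ID must be sent back to watchdog when a status change is reported.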
After getting the configured watchdog node information from watchdog, the external lifecheck system can proceed with health checking of the watchdog nodes. When it detects a status change on any node, it can inform watchdog using the "NODE STATUS CHANGE" IPC message. The message data should contain JSON with the node ID of the node whose status has changed and the new status of that node. The node ID must be the same as the one returned by watchdog for that node in the WatchdogNodes list.
-- The example JSON to inform pgpool-II watchdog about a failed health check on the node with ID 1 will look like

    {
      "NodeID":1,
      "NodeStatus":1,
      "Message":"optional message string to log by watchdog for this event",
      "IPCAuthKey":"wd_authkey configuration parameter value"
    }

-- NodeStatus values have the following meanings

    NODE STATUS DEAD  = 1
    NODE STATUS ALIVE = 2
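The message data can be generated rather than written by hand, which avoids JSON syntax mistakes. This is a sketch only: the helper name `node_status_change_json` is hypothetical, and leaving out "IPCAuthKey" when wd_authkey is not configured is an assumption.

```python
import json
from typing import Optional

NODE_STATUS_DEAD = 1
NODE_STATUS_ALIVE = 2


def node_status_change_json(node_id: int, alive: bool,
                            message: Optional[str] = None,
                            wd_authkey: Optional[str] = None) -> str:
    """Build the JSON data for a NODE STATUS CHANGE message (hypothetical helper)."""
    doc = {
        "NodeID": node_id,
        "NodeStatus": NODE_STATUS_ALIVE if alive else NODE_STATUS_DEAD,
    }
    if message:
        doc["Message"] = message
    if wd_authkey:
        # ASSUMPTION: the key is omitted when wd_authkey is not configured.
        doc["IPCAuthKey"] = wd_authkey
    return json.dumps(doc)


# Report that the health check failed on the node with ID 1.
print(node_status_change_json(1, alive=False, message="health check failed"))
```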