Watchdog Configuration Example
pgpool-II 4.1.3 Documentation, Chapter 8. Configuration Examples
This tutorial shows a simple way to try out watchdog. You need two Linux machines with Pgpool-II installed, and a PostgreSQL server running either on one of those machines or on a separate one. A single backend node is enough. Watchdog can be used with Pgpool-II in any mode: replication mode, master/slave mode, or raw mode.
This example uses "osspc16" as the Active node and "osspc20" as the Standby node. "Someserver" means either of them.
8.2.1. Common configurations
Set the following parameters on both the active and standby nodes.
8.2.1.1. Enabling watchdog
First of all, set use_watchdog to on.
use_watchdog = on # Activates watchdog
8.2.1.2. Configure upstream servers
Specify the upstream servers (e.g. application servers) that watchdog uses to confirm its network connection. Leaving the list blank is also fine.
trusted_servers = ''
                        # trusted server list which are used
                        # to confirm network connection
                        # (hostA,hostB,hostC,...)
8.2.1.3. Watchdog Communication
Specify the TCP port number for watchdog communication.
wd_port = 9000 # port number for watchdog service
8.2.1.4. Virtual IP
Specify the IP address to use as the virtual IP address in delegate_IP.
delegate_IP = '133.137.177.143' # delegate IP address
Note: Make sure the IP address configured as the virtual IP is free and not used by any other machine.
8.2.2. Individual Server Configurations
Next, set the following parameters on each Pgpool-II server. Set other_pgpool_hostname0, other_pgpool_port0 and other_wd_port0 to the values of the other Pgpool-II server.
8.2.2.1. Active (osspc16) Server configurations
other_pgpool_hostname0 = 'osspc20'
                        # Host name or IP address to connect to for other pgpool 0
other_pgpool_port0 = 9999
                        # Port number for other pgpool 0
other_wd_port0 = 9000
                        # Port number for other watchdog 0
8.2.2.2. Standby (osspc20) Server configurations
other_pgpool_hostname0 = 'osspc16'
                        # Host name or IP address to connect to for other pgpool 0
other_pgpool_port0 = 9999
                        # Port number for other pgpool 0
other_wd_port0 = 9000
                        # Port number for other watchdog 0
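Because the two files mirror each other, it is easy to point a node at itself by mistake. The following is a minimal sketch, not part of Pgpool-II, of a shell check that can be run against each node's pgpool.conf. The helper names conf_value and check_peer are made up for this illustration, and the parsing assumes the simple "key = value # comment" layout used in this example.

```shell
# Sketch only: conf_value and check_peer are hypothetical helper names,
# not tools shipped with Pgpool-II.

# Print the value assigned to key $2 in the pgpool.conf-style file $1.
# Surrounding single quotes and trailing comments are dropped. If the key
# appears several times, only the last occurrence is printed.
conf_value() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*'\{0,1\}\([^'#[:space:]]*\).*/\1/p" "$1" | tail -n 1
}

# Succeed when file $1 enables watchdog and names host $2 as its peer.
check_peer() {
    [ "$(conf_value "$1" use_watchdog)" = "on" ] &&
        [ "$(conf_value "$1" other_pgpool_hostname0)" = "$2" ]
}
```

On osspc16 the check would be called as check_peer {installed_dir}/etc/pgpool.conf osspc20, and on osspc20 with osspc16 as the second argument. A non-zero exit status means watchdog is not enabled or the peer host name is not the expected one.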
8.2.3. Starting Pgpool-II
Start Pgpool-II on each server as the root user with the "-n" switch, and redirect log messages to the pgpool.log file.
8.2.3.1. Starting pgpool on the Active server (osspc16)
First, start Pgpool-II on the Active server.
[user@osspc16]$ su -
[root@osspc16]# {installed_dir}/bin/pgpool -n -f {installed_dir}/etc/pgpool.conf > pgpool.log 2>&1
The log messages show that Pgpool-II has started the watchdog process and acquired the virtual IP address.
LOG: I am announcing my self as master/coordinator watchdog node
LOG: I am the cluster leader node
DETAIL: our declare coordinator message is accepted by all nodes
LOG: I am the cluster leader node. Starting escalation process
LOG: escalation process started with PID:59449
LOG: watchdog process is initialized
LOG: watchdog: escalation started
LOG: I am the master watchdog node
DETAIL: using the local backend node status
8.2.3.2. Starting pgpool on the Standby server (osspc20)
Now start Pgpool-II on the Standby server.
[user@osspc20]$ su -
[root@osspc20]# {installed_dir}/bin/pgpool -n -f {installed_dir}/etc/pgpool.conf > pgpool.log 2>&1
The log messages show that Pgpool-II has joined the watchdog cluster as a standby watchdog node.
LOG: watchdog cluster configured with 1 remote nodes
LOG: watchdog remote node:0 on Linux_osspc16_9000:9000
LOG: interface monitoring is disabled in watchdog
LOG: IPC socket path: "/tmp/.s.PGPOOLWD_CMD.9000"
LOG: watchdog node state changed from [DEAD] to [LOADING]
LOG: new outbound connection to Linux_osspc16_9000:9000
LOG: watchdog node state changed from [LOADING] to [INITIALIZING]
LOG: watchdog node state changed from [INITIALIZING] to [STANDBY]
LOG: successfully joined the watchdog cluster as standby node
DETAIL: our join coordinator request is accepted by cluster leader node "Linux_osspc16_9000"
LOG: watchdog process is initialized
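Rather than reading pgpool.log by eye on each node, the role announcements can be picked out with a small script. The following is a rough sketch under the assumption that the log wording matches the messages shown in this example, which may differ in other Pgpool-II versions. The function name wd_role is hypothetical.

```shell
# Sketch only: wd_role is a hypothetical helper, not part of Pgpool-II.
# It relies on the log wording shown in this example.

# Print "leader", "standby" or "unknown" for the Pgpool-II log file $1,
# based on the most recent role announcement found in it.
wd_role() {
    last=$(sed -n \
        -e '/I am the cluster leader node/p' \
        -e '/successfully joined the watchdog cluster as standby node/p' \
        "$1" | tail -n 1)
    case $last in
        *'as standby node'*)     echo standby ;;
        *'cluster leader node'*) echo leader ;;
        *)                       echo unknown ;;
    esac
}
```

Running wd_role pgpool.log on osspc16 and on osspc20 after both have started should report one leader and one standby. Because only the last announcement counts, the same call can be repeated after a failover.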
8.2.4. Try it out
Confirm that the virtual IP address responds to ping.
[user@someserver]$ ping 133.137.177.143
PING 133.137.177.143 (133.137.177.143) 56(84) bytes of data.
64 bytes from 133.137.177.143: icmp_seq=1 ttl=64 time=0.328 ms
64 bytes from 133.137.177.143: icmp_seq=2 ttl=64 time=0.264 ms
64 bytes from 133.137.177.143: icmp_seq=3 ttl=64 time=0.412 ms
Confirm that the Active server, which was started first, has the virtual IP address.
[root@osspc16]# ifconfig
eth0 ...
eth0:0 inet addr:133.137.177.143 ...
lo ...
Confirm that the Standby server, which was started second, does not have the virtual IP address.
[root@osspc20]# ifconfig
eth0 ...
lo ...
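On systems where ifconfig is not installed, the same check can be made with the ip command, which this example already uses to add and remove the address. The following is a minimal sketch. The function name has_vip is hypothetical, and it assumes the one-line-per-address output of "ip -o -4 addr show".

```shell
# Sketch only: has_vip is a hypothetical helper name.

# Read `ip -o -4 addr show` output on stdin and succeed when the IPv4
# address $1 is assigned to some interface. The trailing "/" keeps
# 133.137.177.14 from matching 133.137.177.143.
has_vip() {
    grep -qF "inet $1/"
}

# Typical use on either node (not run here):
#   ip -o -4 addr show | has_vip 133.137.177.143 && echo "virtual IP is up"
```

The check should succeed on the node that currently holds the virtual IP address and fail on the other one.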
Try connecting to PostgreSQL through the virtual IP address with "psql -h delegate_IP -p port".
[user@someserver]$ psql -h 133.137.177.143 -p 9999 -l
8.2.5. Switching virtual IP
Next, confirm that the Standby server takes over when the Active server can no longer provide service. Stop Pgpool-II on the Active server.
[root@osspc16]# {installed_dir}/bin/pgpool stop
The Standby server then starts using the virtual IP address. Its log shows:
LOG: remote node "Linux_osspc16_9000" is shutting down
LOG: watchdog cluster has lost the coordinator node
LOG: watchdog node state changed from [STANDBY] to [JOINING]
LOG: watchdog node state changed from [JOINING] to [INITIALIZING]
LOG: I am the only alive node in the watchdog cluster
HINT: skipping stand for coordinator state
LOG: watchdog node state changed from [INITIALIZING] to [MASTER]
LOG: I am announcing my self as master/coordinator watchdog node
LOG: I am the cluster leader node
DETAIL: our declare coordinator message is accepted by all nodes
LOG: I am the cluster leader node. Starting escalation process
LOG: watchdog: escalation started
LOG: watchdog escalation process with pid: 59551 exit with SUCCESS.
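The sequence of state changes is the quickest way to see how far a failover got. As an illustration only, the sketch below reduces a log file to its state transitions. The function name wd_transitions is hypothetical, and the pattern assumes the "state changed from [X] to [Y]" wording shown above.

```shell
# Sketch only: wd_transitions is a hypothetical helper name, and it depends
# on the "state changed" wording shown in the log above.

# Print one "OLD -> NEW" line per watchdog state change in log file $1.
wd_transitions() {
    sed -n 's/.*watchdog node state changed from \[\([^]]*\)] to \[\([^]]*\)].*/\1 -> \2/p' "$1"
}
```

For the failover log above, wd_transitions pgpool.log on osspc20 would end with the lines STANDBY -> JOINING, JOINING -> INITIALIZING and INITIALIZING -> MASTER.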
Confirm that the virtual IP address still responds to ping.
[user@someserver]$ ping 133.137.177.143
PING 133.137.177.143 (133.137.177.143) 56(84) bytes of data.
64 bytes from 133.137.177.143: icmp_seq=1 ttl=64 time=0.328 ms
64 bytes from 133.137.177.143: icmp_seq=2 ttl=64 time=0.264 ms
64 bytes from 133.137.177.143: icmp_seq=3 ttl=64 time=0.412 ms
Confirm that the Active server doesn't use the virtual IP address any more.
[root@osspc16]# ifconfig
eth0 ...
lo ...
Confirm that the Standby server uses the virtual IP address.
[root@osspc20]# ifconfig
eth0 ...
eth0:0 inet addr:133.137.177.143 ...
lo ...
Try connecting to PostgreSQL through the virtual IP address with "psql -h delegate_IP -p port".
[user@someserver]$ psql -h 133.137.177.143 -p 9999 -l
8.2.6. More
8.2.6.1. Lifecheck
Several parameters control watchdog's monitoring. Specify the check interval in wd_interval and the type of lifecheck in wd_lifecheck_method. For the heartbeat method, also specify the time after which a fault is detected in wd_heartbeat_deadtime, the port on which heartbeats are received in wd_heartbeat_port, the sending interval in wd_heartbeat_keepalive, the IP address or hostname of the destination in heartbeat_destination0, and the destination port number in heartbeat_destination_port0.
wd_lifecheck_method = 'heartbeat'
                        # Method of watchdog lifecheck ('heartbeat' or 'query' or 'external')
                        # (change requires restart)
wd_interval = 10
                        # lifecheck interval (sec) > 0
wd_heartbeat_port = 9694
                        # Port number for receiving heartbeat signal
                        # (change requires restart)
wd_heartbeat_keepalive = 2
                        # Interval time of sending heartbeat signal (sec)
                        # (change requires restart)
wd_heartbeat_deadtime = 30
                        # Deadtime interval for heartbeat signal (sec)
                        # (change requires restart)
heartbeat_destination0 = 'host0_ip1'
                        # Host name or IP address of destination 0
                        # for sending heartbeat signal.
                        # (change requires restart)
heartbeat_destination_port0 = 9694
                        # Port number of destination 0 for sending
                        # heartbeat signal. Usually this is the
                        # same as wd_heartbeat_port.
                        # (change requires restart)
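With the sample values, a heartbeat is sent every 2 seconds and a fault is detected once no heartbeat has arrived for 30 seconds. As a rough way to reason about the ratio between the two settings, the sketch below computes how many consecutive heartbeats can be lost before the dead time expires. The helper name missed_heartbeats is hypothetical, and the calculation ignores network delay and the wd_interval check cycle.

```shell
# Sketch only: missed_heartbeats is a hypothetical helper name.

# Print how many whole heartbeat intervals fit into the dead time.
# $1 = wd_heartbeat_deadtime (sec), $2 = wd_heartbeat_keepalive (sec)
missed_heartbeats() {
    echo $(( $1 / $2 ))
}

missed_heartbeats 30 2   # prints 15
```

A smaller wd_heartbeat_deadtime shortens fault detection but leaves fewer lost heartbeats as a margin against brief network trouble.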
8.2.6.2. Switching virtual IP address
Several parameters control how the virtual IP address is switched. Specify the switching commands in if_up_cmd and if_down_cmd, the path to them in if_cmd_path, the command that sends an ARP request after switching in arping_cmd, and the path to it in arping_path.
if_cmd_path = '/sbin'
                        # path to the directory where if_up/down_cmd exists
if_up_cmd = 'ip addr add $_IP_$/24 dev eth0 label eth0:0'
                        # startup delegate IP command
if_down_cmd = 'ip addr del $_IP_$/24 dev eth0'
                        # shutdown delegate IP command
arping_path = '/usr/sbin'
                        # arping command path
arping_cmd = 'arping -U $_IP_$ -w 1'
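In these commands, $_IP_$ is a placeholder that Pgpool-II replaces with the value of delegate_IP before running the command. To preview what will be executed for a given address, the substitution can be imitated in the shell. This is only a sketch: expand_ip is a hypothetical helper and does not reflect how Pgpool-II performs the replacement internally.

```shell
# Sketch only: expand_ip is a hypothetical helper that imitates the
# placeholder substitution; it is not how Pgpool-II is implemented.

# Replace every $_IP_$ in the command template $1 with the address $2.
# The address must not contain "/" because it is used inside a sed expression.
expand_ip() {
    printf '%s\n' "$1" | sed 's/\$_IP_\$/'"$2"'/g'
}

expand_ip 'ip addr add $_IP_$/24 dev eth0 label eth0:0' 133.137.177.143
```

Running the expanded if_up_cmd and if_down_cmd by hand as root on a test machine is a simple way to confirm that the interface name and netmask suit the local network before relying on watchdog to run them.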
You can also bring the virtual IP address up and down using arbitrary scripts specified in the wd_escalation_command and wd_de_escalation_command parameters.