Testing Fail Over
pgpool-II 4.0.1 Documentation | |||
---|---|---|---|
Prev | Up | Chapter 1. Getting Started | Next |
Pgpool-II allows an automatic fail over when PostgreSQL server goes down. In this case Pgpool-II sets the status of the server to "down" and continue the database operation using remaining servers.
$ pg_ctl -D data1 stop waiting for server to shut down.... done server stopped $ psql -p 11000 -c "show pool_nodes" test node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay ---------+----------+-------+--------+-----------+---------+------------+-------------------+------------------- 0 | /tmp | 11002 | up | 0.500000 | primary | 2172 | true | 0 1 | /tmp | 11003 | down | 0.500000 | standby | 0 | false | 0 (2 rows) $ psql -p 11000 -c "SELECT sum(abalance) FROM pgbench_accounts" test sum -------- 192112 (1 row)
The standby node was shut down by pg_ctl command. Pgpool-II detects it and detaches the standby node. "show pool_nodes" command shows that the standby node is in down status. You can continue to use the cluster without the standby node:
$ psql -p 11000 -c "SELECT sum(abalance) FROM pgbench_accounts" test sum -------- 192112 (1 row)
What happens if the primary server goes down? In this case, one of remaining standby server is promoted to new primary server. For this testing, we start from the state in which both nodes are up.
$ psql -p 11000 -c "show pool_nodes" test node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay ---------+----------+-------+--------+-----------+---------+------------+-------------------+------------------- 0 | /tmp | 11002 | up | 0.500000 | primary | 2173 | true | 0 1 | /tmp | 11003 | up | 0.500000 | standby | 0 | false | 0 (2 rows) $ pg_ctl -D data0 stop waiting for server to shut down.... done server stopped $ psql -p 11000 -c "show pool_nodes" test node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay ---------+----------+-------+--------+-----------+---------+------------+-------------------+------------------- 0 | /tmp | 11002 | down | 0.500000 | standby | 2173 | false | 0 1 | /tmp | 11003 | up | 0.500000 | primary | 0 | true | 0 (2 rows)
Now the primary node is changed from 0 to 1. What happens inside? When the node 0 goes down, Pgpool-II detects it and executes "fail_over_script" defined in pgpool.conf . Here is the content of the file.
#! /bin/sh # Execute command by failover. # special values: %d = node id # %h = host name # %p = port number # %D = database cluster path # %m = new master node id # %M = old master node id # %H = new master node host name # %P = old primary node id # %R = new master database cluster path # %r = new master port number # %% = '%' character failed_node_id=$1 failed_host_name=$2 failed_port=$3 failed_db_cluster=$4 new_master_id=$5 old_master_id=$6 new_master_host_name=$7 old_primary_node_id=$8 new_master_port_number=$9 new_master_db_cluster=${10} mydir=/home/t-ishii/tmp/Tutorial log=$mydir/log/failover.log pg_ctl=/usr/local/pgsql/bin/pg_ctl cluster0=$mydir/data0 cluster1=$mydir/data1 date >> $log echo "failed_node_id $failed_node_id failed_host_name $failed_host_name failed_port $failed_port failed_db_cluster $failed_db_cluster new_master_id $new_master_id old_master_id $old_master_id new_master_host_name $new_master_host_name old_primary_node_id $old_primary_node_id new_master_port_number $new_master_port_number new_master_db_cluster $new_master_db_cluster" >> $log if [ a"$failed_node_id" = a"$old_primary_node_id" ];then # master failed ! new_primary_db_cluster=${mydir}/data"$new_master_id" echo $pg_ctl -D $new_primary_db_cluster promote >>$log # let standby take over $pg_ctl -D $new_primary_db_cluster promote >>$log # let standby take over fi
The script receives necessary information as parameters from Pgpool-II . If the primary server goes down, it executes "pg_ctl -D data1 promote", which should promote the standby server to a new primary server.