Testing Online Recovery
pgpool-II 4.1.4 Documentation | |||
---|---|---|---|
Prev | Up | Chapter 1. Getting Started | Next |
Pgpool-II allows to recover a downed node by technique called "Online Recovery". This copies data from the primary node to a standby node so that it sync with the primary. This may take long time and database may be updated during the process. That's no problem because in the streaming configuration, the standby will receive WAL log and applies it to catch up the primary. To test online recovery, let's start with previous cluster, where node 0 is in down state.
$ pcp_recovery_node -p 11001 -n 0 Password: pcp_recovery_node -- Command Successful $ psql -p 11000 -c "show pool_nodes" test node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | last_status_change ---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------+--------------------- 0 | /tmp | 11002 | up | 0.500000 | standby | 0 | false | 0 | 2019-01-31 12:06:48 1 | /tmp | 11003 | up | 0.500000 | primary | 0 | true | 0 | 2019-01-31 12:05:20 (2 rows)
pcp_recovery_node is one of control commands coming with Pgpool-II installation. The argument -p is to specify the port number assigned to the command, which is 11001 set by pgpool_setup . The argument -n is to specify the node id to be recovered. After executing the command, node 0 returned to "up" status.
The script executed by pcp_recovery_node is specified as "recovery_1st_stage_command" in pgpool.conf . Here is the file installed by pgpool_setup .
#! /bin/sh psql=/usr/local/pgsql/bin/psql DATADIR_BASE=/home/t-ishii/tmp/Tutorial PGSUPERUSER=t-ishii master_db_cluster=$1 recovery_node_host_name=$2 DEST_CLUSTER=$3 PORT=$4 recovery_node=$5 pg_rewind_failed="true" log=$DATADIR_BASE/log/recovery.log echo >> $log date >> $log if [ $pg_rewind_failed = "true" ];then $psql -p $PORT -c "SELECT pg_start_backup('Streaming Replication', true)" postgres echo "source: $master_db_cluster dest: $DEST_CLUSTER" >> $log rsync -C -a -c --delete --exclude postgresql.conf --exclude postmaster.pid \ --exclude postmaster.opts --exclude pg_log \ --exclude recovery.conf --exclude recovery.done \ --exclude pg_xlog \ $master_db_cluster/ $DEST_CLUSTER/ rm -fr $DEST_CLUSTER/pg_xlog mkdir $DEST_CLUSTER/pg_xlog chmod 700 $DEST_CLUSTER/pg_xlog rm $DEST_CLUSTER/recovery.done fi cat > $DEST_CLUSTER/recovery.conf $lt;$lt;REOF standby_mode = 'on' primary_conninfo = 'port=$PORT user=$PGSUPERUSER' recovery_target_timeline='latest' restore_command = 'cp $DATADIR_BASE/archivedir/%f "%p" 2> /dev/null' REOF if [ $pg_rewind_failed = "true" ];then $psql -p $PORT -c "SELECT pg_stop_backup()" postgres fi if [ $pg_rewind_failed = "false" ];then cp /tmp/postgresql.conf $DEST_CLUSTER/ fi