Pgpoo-II + Watchdog Setup Example
pgpool-II 3.7.1 Documentation | |||
---|---|---|---|
Prev | Up | Chapter 7. Configuration Examples | Next |
This section shows an example of streaming replication configuration using Pgpool-II . In this example, we use 3 Pgpool-II servers to manage PostgreSQL servers to create a robust cluster system and avoid the single point of failure or split brain.
7.3.1. Requirements
We assume that all the Pgpool-II servers and the PostgreSQL servers are in the same subnet.
7.3.2. Cluster System Configuration
We use 2 PostgreSQL servers and 3 Pgpool-II servers with CentOS7. Let these servers be osspc16 , osspc17 , osspc18 , osspc19 and osspc20 .
Note: The roles of Active , Standy , Primary , Standby are not fixed and may be changed by further operations.
7.3.3. Installation
In this example, we install Pgpool-II and PostgreSQL by using RPM packages.
Table 7-2. Pgpool-II , PostgreSQL version informations and Configuration
Server | Version | Host Name | Port | $PGDATA Directory |
---|---|---|---|---|
PostgreSQL server (primary) | PostgreSQL 9.6.1 | osspc19 | 5432 | /var/lib/pgsql/9.6/data |
PostgreSQL server (standby) | PostgreSQL 9.6.1 | osspc20 | 5432 | /var/lib/pgsql/9.6/data |
Pgpool-II server | Pgpool-II 3.6.1 | osspc16 | 9999 | - |
Pgpool-II server | Pgpool-II 3.6.1 | osspc17 | 9999 | - |
Pgpool-II server | Pgpool-II 3.6.1 | osspc18 | 9999 | - |
Install Pgpool-II by using Pgpool-II YUM repository.
# yum install http://www.pgpool.net/yum/rpms/3.6/redhat/rhel-7-x86_64/pgpool-II-release-3.6-1.noarch.rpm # yum install pgpool-II-pg96 # yum install pgpool-II-pg96-debuginfo # yum install pgpool-II-pg96-devel # yum install pgpool-II-pg96-extensions
Install PostgreSQL by using PostgreSQL YUM repository.
# yum install https://yum.postgresql.org/9.6/redhat/rhel-7-x86_64/pgdg-centos96-9.6-3.noarch.rpm # yum install postgresql96 postgresql96-devel postgresql96-server
7.3.4. Before Starting
Before you start the configuration process, please check the following prerequisites.
-
Set up PostgreSQL streaming replication on the primary server. In this example, we use WAL archiving.
First, we create the directory /var/lib/pgsql/archivedir to store WAL segments on both PostgreSQL servers ( osspc19 and osspc20 ).
[PostgreSQL server]$ mkdir /var/lib/pgsql/archivedir
Then we edit the configuration file $PGDATA/postgresql.conf on osspc19 (primary) as follows.
listen_addresses = '*' wal_level = hot_standby max_wal_senders = 2 archive_mode = on archive_command = 'cp "%p" "/var/lib/pgsql/archivedir/%f"'
We use the online recovery functionality of Pgpool-II to setup standby server after the primary server is started.
-
Because of the security reasons, we create a user repl solely used for replication purpose, and a user pgpool for streaming replication delay check and health check of Pgpool-II . Assuming that all the Pgpool-II servers and the PostgreSQL servers are in the network of 133.137.174.0/24 , and edit pg_hba.conf to enable md5 authentication method.
host all pgpool 133.137.174.0/24 md5 host all all 0.0.0.0/0 md5 host replication repl 133.137.174.0/24 md5
-
To use the failover and online recovery of Pgpool-II , the settings that allow SSH without passowrd to other servers ( osspc16 - osspc20 ) are necessary.
-
To allow repl user without specifying password for streaming replication and online recovery, we create the .pgpass file in postgres user's home directory and change the permisson to 600 on both postgreSQL servers osspc19 and osspc20 .
[osspc19]$ cat /var/lib/pgsql/.pgpass osspc20:5432:replication:repl:<password of repl user>
[osspc20]$ cat /var/lib/pgsql/.pgpass osspc19:5432:replication:repl:<passowrd of repl user>
$ chmod 600 /var/lib/pgsql/.pgpass
-
When Pgpool-II connects to other Pgpool-II servers or PostgreSQL servers, the target port must be accessible by enabling firewall management softwares. Following is an example for CentOS/RHEL7 .
[PostgreSQL server]# firewall-cmd --permanent --zone=public --add-service=postgresql [PostgreSQL server]# firewall-cmd --reload
-
The following commands are to enable Pgpool-II and PostgreSQL start on system boot.
[Pgpool-II server]# systemctl enable pgpool.service
[PostgreSQL server]# systemctl enable postgresql.service
7.3.5. Pgpool-II Configuration
7.3.5.1. Common Settings
Here are the common settings on osspc16 , osspc17 and osspc18 .
When installing Pgpool-II from RPM, all the Pgpool-II configuration files are in /etc/pgpool-II . In this example, we copy the sample configuration file for streaming replicaton mode.
# cp /etc/pgpool-II/pgpool.conf.sample-stream /etc/pgpool-II/pgpool.conf
To allow Pgpool-II to accept all incoming connections, we set listen_addresses = '*' .
listen_addresses = '*'
Specifiy replication delay check user and password.
sr_check_user = 'pgpool' sr_check_password = 'pgpool'
Enable health check so that pgpool-II performs failover. Also, if the network is unstable, the health check fails even though the backend is running properly, failover or degenerate operation may occur. In order to prevent such incorrect detection of health check, we set health_check_max_retries = 10 .
health_check_period = 5 # Health check period # Disabled (0) by default health_check_timeout = 20 # Health check timeout # 0 means no timeout health_check_user = 'pgpool' health_check_password = 'pgpool' health_check_max_retries = 10
Specify the backend informations with osspc19 and osspc20 .
# - Backend Connection Settings - backend_hostname0 = 'osspc19' # Host name or IP address to connect to for backend 0 backend_port0 = 5432 # Port number for backend 0 backend_weight0 = 1 # Weight for backend 0 (only in load balancing mode) backend_data_directory0 = '/var/lib/pgsql/9.6/data' # Data directory for backend 0 backend_flag0 = 'ALLOW_TO_FAILOVER' # Controls various backend behavior # ALLOW_TO_FAILOVER or DISALLOW_TO_FAILOVER backend_hostname1 = 'osspc20' backend_port1 = 5432 backend_weight1 = 1 backend_data_directory1 = '/var/lib/pgsql/9.6/data' backend_flag1 = 'ALLOW_TO_FAILOVER'
7.3.5.2. Failover configuration
Specify failover_command to execute failover.sh script. The special characters %d %P %H %R in failover_command are replcaed with DB node ID of the detached node , Old primary node ID , Hostname of the new master node , Database cluster directory of the new master node .
failover_command = '/etc/pgpool-II/failover.sh %d %P %H %R'
Create /etc/pgpool-II/failover.sh , and set the file permisson to 755 .
# vi /etc/pgpool-II/failover.sh # chmod 755 /etc/pgpool-II/failover.sh
-
/etc/pgpool-II/failover.sh
#! /bin/sh -x # Execute command by failover. # special values: %d = node id # %h = host name # %p = port number # %D = database cluster path # %m = new master node id # %M = old master node id # %H = new master node host name # %P = old primary node id # %R = new master database cluster path # %r = new master port number # %% = '%' character falling_node=$1 # %d old_primary=$2 # %P new_primary=$3 # %H pgdata=$4 # %R pghome=/usr/pgsql-9.6 log=/var/log/pgpool/failover.log date >> $log echo "failed_node_id=$falling_node new_primary=$new_primary" >> $log if [ $falling_node = $old_primary ]; then if [ $UID -eq 0 ] then su postgres -c "ssh -T postgres@$new_primary $pghome/bin/pg_ctl promote -D $pgdata" else ssh -T postgres@$new_primary $pghome/bin/pg_ctl promote -D $pgdata fi exit 0; fi; exit 0;
7.3.5.3. Pgpool-II Online Recovery Configurations
Next, in order to perform online recovery with Pgpool-II we specify the PostgreSQL user name and online recovery command recovery_1st_stage . Then, we create recovery_1st_stage and pgpool_remote_start in database cluster directory of PostgreSQL primary server, and set the file permissions to 755 .
recovery_user = 'postgres' # Online recovery user recovery_password = 'postgres' # Online recovery password recovery_1st_stage_command = 'recovery_1st_stage'
$ vi /var/lib/pgsql/9.6/data/recovery_1st_stage $ vi /var/lib/pgsql/9.6/data/pgpool_remote_start $ chmod 755 /var/lib/pgsql/9.6/data/recovery_1st_stage $ chmod 755 /var/lib/pgsql/9.6/data/pgpool_remote_start
-
/var/lib/pgsql/9.6/data/recovery_1st_stage
#!/bin/bash -x # Recovery script for streaming replication. pgdata=$1 remote_host=$2 remote_pgdata=$3 port=$4 pghome=/usr/pgsql-9.6 archivedir=/var/lib/pgsql/archivedir hostname=$(hostname) ssh -T postgres@$remote_host " rm -rf $remote_pgdata $pghome/bin/pg_basebackup -h $hostname -U repl -D $remote_pgdata -x -c fast rm -rf $archivedir/* cd $remote_pgdata cp postgresql.conf postgresql.conf.bak sed -e 's/#*hot_standby = off/hot_standby = on/' postgresql.conf.bak > postgresql.conf rm -f postgresql.conf.bak cat > recovery.conf << EOT standby_mode = 'on' primary_conninfo = 'host="$hostname" port=$port user=repl' restore_command = 'scp $hostname:$archivedir/%f %p' EOT "
-
/var/lib/pgsql/9.6/data/pgpool_remote_start
#! /bin/sh -x pghome=/usr/pgsql-9.6 remote_host=$1 remote_pgdata=$2 # Start recovery target PostgreSQL server ssh -T $remote_host $pghome/bin/pg_ctl -w -D $remote_pgdata start > /dev/null 2>&1 < /dev/null &
In order to use the online recovery functionality, the functions of
pgpool_recovery
,
pgpool_remote_start
,
pgpool_switch_xlog
are required, so we need install
pgpool_recovery
on template1 of
PostgreSQL
server
osspc19
.
# su - postgres $ psql template1 =# CREATE EXTENSION pgpool_recovery;
7.3.5.4. Client Authentication Configuration
Because in the section Before Starting , we already set PostgreSQL authentication method to md5 , it is necessary to set a client authentication by Pgpool-II to connect to backend nodes. When installing from RPM, the Pgpool-II configuration file pool_hba.conf is in /etc/pgpool-II . By default, pool_hba authentication is disabled, and set enable_pool_hba = on to enable it.
enable_pool_hba = on
The format of pool_hba.conf file follows very closely PostgreSQL's pg_hba.conf format. Set pgpool and postgres user's authentication method to md5 .
host all pgpool 0.0.0.0/0 md5 host all postgres 0.0.0.0/0 md5
To use md5 authentication, we need to register the user name and password in file pool_passwd . Execute command pg_md5 --md5auth --username=<user name> <password> to regist user name and MD5-hashed password in file pool_passwd . If pool_passwd doesn't exist yet, it will be created in the same directory as pgpool.conf .
# pg_md5 --md5auth --username=pgpool <password of pgpool user> # pg_md5 --md5auth --username=postgres <password of postgres user>
7.3.5.5. Watchdog Configuration
Enable watchdog functionality on osspc16 , osspc17 , osspc18 .
use_watchdog = on
Specify virtual IP address that accepts connections from clients on osspc16 , osspc17 , osspc18 . Ensure that the IP address set to virtual IP isn't used yet.
delegate_IP = '133.137.174.153'
Specify the hostname and port number of each Pgpool-II server.
-
osspc16
wd_hostname = 'osspc16' wd_port = 9000
-
osspc17
wd_hostname = 'osspc17' wd_port = 9000
-
osspc18
wd_hostname = 'osspc18' wd_port = 9000
Specify the hostname, Pgpool-II port number, and watchdog port number of monitored Pgpool-II servers on each Pgpool-II server.
-
osspc16
# - Other pgpool Connection Settings - other_pgpool_hostname0 = 'osspc17' # Host name or IP address to connect to for other pgpool 0 # (change requires restart) other_pgpool_port0 = 9999 # Port number for othet pgpool 0 # (change requires restart) other_wd_port0 = 9000 # Port number for othet watchdog 0 # (change requires restart) other_pgpool_hostname1 = 'osspc18' other_pgpool_port1 = 9999 other_wd_port1 = 9000
-
osspc17
# - Other pgpool Connection Settings - other_pgpool_hostname0 = 'osspc16' # Host name or IP address to connect to for other pgpool 0 # (change requires restart) other_pgpool_port0 = 9999 # Port number for othet pgpool 0 # (change requires restart) other_wd_port0 = 9000 # Port number for othet watchdog 0 # (change requires restart) other_pgpool_hostname1 = 'osspc18' other_pgpool_port1 = 9999 other_wd_port1 = 9000
-
osspc18
# - Other pgpool Connection Settings - other_pgpool_hostname0 = 'osspc16' # Host name or IP address to connect to for other pgpool 0 # (change requires restart) other_pgpool_port0 = 9999 # Port number for othet pgpool 0 # (change requires restart) other_wd_port0 = 9000 # Port number for othet watchdog 0 # (change requires restart) other_pgpool_hostname1 = 'osspc17' other_pgpool_port1 = 9999 other_wd_port1 = 9000
Specify the hostname and port number of destination for sending heartbeat signal on osspc16 , osspc17 , osspc18 .
-
osspc16
heartbeat_destination0 = 'osspc17' # Host name or IP address of destination 0 # for sending heartbeat signal. # (change requires restart) heartbeat_destination_port0 = 9694 # Port number of destination 0 for sending # heartbeat signal. Usually this is the # same as wd_heartbeat_port. # (change requires restart) heartbeat_device0 = '' # Name of NIC device (such like 'eth0') # used for sending/receiving heartbeat # signal to/from destination 0. # This works only when this is not empty # and pgpool has root privilege. # (change requires restart) heartbeat_destination1 = 'osspc18' heartbeat_destination_port1 = 9694 heartbeat_device1 = ''
-
osspc17
heartbeat_destination0 = 'osspc16' # Host name or IP address of destination 0 # for sending heartbeat signal. # (change requires restart) heartbeat_destination_port0 = 9694 # Port number of destination 0 for sending # heartbeat signal. Usually this is the # same as wd_heartbeat_port. # (change requires restart) heartbeat_device0 = '' # Name of NIC device (such like 'eth0') # used for sending/receiving heartbeat # signal to/from destination 0. # This works only when this is not empty # and pgpool has root privilege. # (change requires restart) heartbeat_destination1 = 'osspc18' heartbeat_destination_port1 = 9694 heartbeat_device1 = ''
-
osspc18
heartbeat_destination0 = 'osspc16' # Host name or IP address of destination 0 # for sending heartbeat signal. # (change requires restart) heartbeat_destination_port0 = 9694 # Port number of destination 0 for sending # heartbeat signal. Usually this is the # same as wd_heartbeat_port. # (change requires restart) heartbeat_device0 = '' # Name of NIC device (such like 'eth0') # used for sending/receiving heartbeat # signal to/from destination 0. # This works only when this is not empty # and pgpool has root privilege. # (change requires restart) heartbeat_destination1 = 'osspc17' heartbeat_destination_port1 = 9694 heartbeat_device1 = ''
7.3.5.6. PCP Command Configuration
Because user authentication is required to use the PCP command, we specify user name and md5 encrypted password in pcp.conf . Here we create the encrypted password for user postgres , and add <username: encrypted password&gl in /etc/pgpool-II/pcp.conf .
# pg_md5 -p Password: (input password) (paste the md5 encrypted password to pcp.conf) # vi /etc/pgpool-II/pcp.conf (add password entry) user name:md5 encrypted password
The settings of Pgpool-II is completed.
7.3.6. Starting/Stopping Pgpool-II
Next we start Pgpool-II . Before starting Pgpool-II , please start PostgreSQL servers first. Also, when stopping PostgreSQL , it is necessary to stop Pgpool-II first.
-
Starting Pgpool-II
In section Before Starting , we already set the auto-start of Pgpool-II . To start Pgpool-II , restart the whole system or execute the following command.
# systemctl start pgpool.service
-
Stopping Pgpool-II
# systemctl stop pgpool.service
7.3.8. How to use
Let's start to use Pgpool-II . First, let's start Pgpool-II on osspc16 , osspc17 , osspc18 by using the following command.
# systemctl start pgpool.service
7.3.8.1. Set up PostgreSQL standby server
First, we should set up PostgreSQL standby server by using Pgpool-II online recovery functionality. Ensure that recovery_1st_stage and pgpool_remote_start scripts used by pcp_recovery_node command are in database cluster directory of PostgreSQL primary server ( osspc19 ).
# pcp_recovery_node -h 133.137.174.153 -p 9898 -U postgres -n 1
After executing pcp_recovery_node command, vertify that osspc20 is started as a PostgreSQL standby server.
# psql -h 133.137.174.153 -p 9999 -U pgpool postgres postgres=> show pool_nodes; node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay ---------+----------+------+--------+-----------+---------+------------+-------------------+------------------- 0 | osspc19 | 5432 | up | 0.500000 | primary | 0 | true | 0 1 | osspc20 | 5432 | up | 0.500000 | standby | 0 | false | 0
7.3.8.2. Switching active/standby watchdog
Confirm the watchdog status by using pcp_watchdog_info . The Pgpool-II server which is started first run as MASTER .
# pcp_watchdog_info -h 133.137.174.153 -p 9898 -U postgres Password: 3 YES osspc16:9999 Linux osspc16 osspc16 osspc16:9999 Linux osspc16 osspc16 9999 9000 4 MASTER #The Pgpool-II server started first becames "MASTER". osspc17:9999 Linux osspc17 osspc17 9999 9000 7 STANDBY #run as standby osspc18:9999 Linux osspc18 osspc18 9999 9000 7 STANDBY #run as standby
Stop active server osspc16 , then osspc17 or osspc18 will be promoted to active server. To stop osspc16 , we can stop Pgpool-II service or shutdown the whole system. Here, we stop Pgpool-II service.
[root@osspc16 ~]# systemctl stop pgpool.service [root@osspc16 ~]# pcp_watchdog_info -h 133.137.174.153 -p 9898 -U postgres Password: 3 YES osspc17:9999 Linux osspc17 osspc17 osspc17:9999 Linux osspc17 osspc17 9999 9000 4 MASTER #osspc17 is promoted to MASTER osspc16:9999 Linux osspc16 osspc16 9999 9000 10 SHUTDOWN #osspc16 is stopped osspc18:9999 Linux osspc18 osspc18 9999 9000 7 STANDBY #osspc18 runs as STANDBY
Start Pgpool-II ( osspc16 ) which we have stopped again, and vertify that osspc16 runs as a standby.
[root@osspc16 ~]# systemctl start pgpool.service [root@osspc16 ~]# pcp_watchdog_info -h 133.137.174.153 -p 9898 -U postgres Password: 3 YES osspc17:9999 Linux osspc17 osspc17 osspc17:9999 Linux osspc17 osspc17 9999 9000 4 MASTER osspc16:9999 Linux osspc16 osspc16 9999 9000 7 STANDBY osspc18:9999 Linux osspc18 osspc18 9999 9000 7 STANDBY
7.3.8.3. Failover
First, use psql to connect to PostgreSQL via virtual IP, and verify the backend informations.
# psql -h 133.137.174.153 -p 9999 -U pgpool postgres postgres=> show pool_nodes; node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay ---------+----------+------+--------+-----------+---------+------------+-------------------+------------------- 0 | osspc19 | 5432 | up | 0.500000 | primary | 0 | true | 0 1 | osspc20 | 5432 | up | 0.500000 | standby | 0 | false | 0
Next, stop primary PostgreSQL server osspc19 , and verify automatic failover.
$ pg_ctl -D /var/lib/pgsql/9.6/data -m immediate stop
After stopping PostgreSQL on osspc19 , failover occurs and PostgreSQL on osspc20 becomes new primary DB.
# psql -h 133.137.174.153 -p 9999 -U pgpool postgres postgres=> show pool_nodes; node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay ---------+----------+------+--------+-----------+---------+------------+-------------------+------------------- 0 | osspc19 | 5432 | down | 0.500000 | standby | 0 | false | 0 1 | osspc20 | 5432 | up | 0.500000 | primary | 0 | true | 0
7.3.8.4. Online Recovery
Here, we use Pgpool-II online recovery functionality to restore osspc19 (old primary server) as a standby. Before restoring the old primary server, please ensure that recovery_1st_stage and pgpool_remote_start scripts exist in database cluster directory of current primary server osspc20 .
# pcp_recovery_node -h 133.137.174.153 -p 9898 -U postgres -n 0
Then verify that osspc19 is started as a standby.
# psql -h 133.137.174.153 -p 9999 -U pgpool postgres postgres=> show pool_nodes; node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay ---------+----------+------+--------+-----------+---------+------------+-------------------+------------------- 0 | osspc19 | 5432 | up | 0.500000 | standby | 0 | false | 0 1 | osspc20 | 5432 | up | 0.500000 | primary | 0 | true | 0