Restrictions

This section describes current restrictions of Pgpool-II .

Functionality of PostgreSQL

If you use pg_terminate_backend() to stop a backend, this will trigger a failover. The reason why this happens is that PostgreSQL sends exactly the same message for a terminated backend as for a full postmaster shutdown. There is no workaround prior of version 3.6. From version 3.6, this limitation has been mitigated. If the argument to the function (that is a process id) is a constant, you can safely use the function. In extended protocol mode, you cannot use the function though.

Load Balancing

Multi-statement queries (multiple SQL commands on single line) are always sent to primary node (in streaming replication mode) or main node (in other modes). Usually Pgpool-II dispatch query to appropriate node, but it's not applied to multi-statement queries.

Authentication/Access Controls

In the replication mode or native replication mode, trust and pam methods are supported. md5 is also supported since Pgpool-II 3.0. md5 is supported by using an authentication file pool_passwd . scram-sha-256, cert, and clear text password is also supported since Pgpool-II 4.0. pool_passwd is default name of the authentication file. Here are the steps to enable md5 authentication:

  1. Login as the database's operating system user and type:

    	 pg_md5 --md5auth --username=your_username your_passwd
    	

    user name and md5 encrypted password are registered into pool_passwd. If pool_passwd does not exist yet, pg_md5 command will automatically create it for you. The format of pool_passwd is username:encrypted_passwd .

  2. You also need to add an appropriate md5 entry to pool_hba.conf. See Section 6.1 for more details.

  3. Please note that the user name and password must be identical to those registered in PostgreSQL .

  4. After changing md5 password (in both pool_passwd and PostgreSQL of course), you need to execute pgpool reload .

Large objects

In streaming replication mode, Pgpool-II supports large objects.

In native replication mode, Pgpool-II supports large objects if the backend is PostgreSQL 8.1 or later. For this, you need to enable lobj_lock_table directive in pgpool.conf . Large object replication using backend function lo_import is not supported, however.

In other mode, including Slony mode, large objects are not supported.

Temporary tables

Creating/inserting/updating/deleting temporary tables are always executed on the primary in native replication mode. SELECT on these tables is executed on primary as well. However if the temporary table name is used as a literal in SELECT, there's no way to detect it, and the SELECT will be load balanced. That will trigger a "not found the table" error or will find another table having same name. To avoid the problem, use /*NO LOAD BALANCE*/ SQL comment.

Note that such literal table names used in queries to access system catalogs do cause problems described above. psql's \d command produces such that query:

       SELECT 't1'::regclass::oid;
      

In such that case Pgpool-II always sends the query to primary and will not cause the problem.

Tables created by CREATE TEMP TABLE will be deleted at the end of the session by specifying DISCARD ALL in reset_query_list if you are using PostgreSQL 8.3 or later.

For 8.2.x or earlier, tables created by CREATE TEMP TABLE will not be deleted after exiting a session. It is because of the connection pooling which, from PostgreSQL's backend point of view, keeps the session alive. To avoid this, you must explicitly drop the temporary tables by issuing DROP TABLE , or use CREATE TEMP TABLE ... ON COMMIT DROP inside the transaction block.

Functions, etc. In Native Replication mode

There is no guarantee that any data provided using a context-dependent mechanism (e.g. random number, transaction ID, OID, SERIAL, sequence), will be replicated correctly on multiple backends. For SERIAL, enabling insert_lock will help replicating data. insert_lock also helps SELECT setval() and SELECT nextval().

INSERT/UPDATE using CURRENT_TIMESTAMP , CURRENT_DATE , now() will be replicated correctly. INSERT/UPDATE for tables using CURRENT_TIMESTAMP , CURRENT_DATE , now() as their DEFAULT values will also be replicated correctly. This is done by replacing those functions by constants fetched from primary at query execution time. There are a few limitations however:

In Pgpool-II 3.0 or before, the calculation of temporal data in table default value is not accurate in some cases. For example, the following table definition:

       CREATE TABLE rel1(
       d1 date DEFAULT CURRENT_DATE + 1
       )
      

is treated the same as:

       CREATE TABLE rel1(
       d1 date DEFAULT CURRENT_DATE
       )
      

Pgpool-II 3.1 or later handles these cases correctly. Thus the column "d1" will have tomorrow as the default value. However this enhancement does not apply if extended protocols (used in JDBC, PHP PDO for example) or PREPARE are used.

Please note that if the column type is not a temporal one, rewriting is not performed. Such example:

       foo bigint default (date_part('epoch'::text,('now'::text)::timestamp(3) with time zone) * (1000)::double precision)
      

Suppose we have the following table:

       CREATE TABLE rel1(
       c1 int,
       c2 timestamp default now()
       )
      

We can replicate

       INSERT INTO rel1(c1) VALUES(1)
      

since this turn into

       INSERT INTO rel1(c1, c2) VALUES(1, '2009-01-01 23:59:59.123456+09')
      

However,

       INSERT INTO rel1(c1) SELECT 1
      

cannot to be transformed, thus cannot be properly replicated in the current implementation. Values will still be inserted, with no transformation at all.

SQL type commands

SQL type commands cannot be used in extended query mode.

Multi-byte Characters

Pgpool-II does not perform encoding conversion between client and PostgreSQL for multi-byte characters. The encoding for the client and backends must be the same.

Multi-statement Query

Pgpool-II cannot process multi-statement queries. However, when Pgpool-II is connected by psql , It is no problem. psql parse multi-statement, send a statement one by one.

libpq

libpq is linked while building Pgpool-II . libpq version must be 3.0 or later. Building Pgpool-II with libpq version 2.0 will fail.

ParameterStatus

When a client connects to PostgreSQL , PostgreSQL sends back some parameter/value pairs to clients. This protocol is called ParameterStatus . The parameter/value pairs can be extracted by using some APIs such as PQParameterStatus of libpq. The actual parameter names can be found here . Pgpool-II collects ParameterStatus values from multiple PostgreSQL servers and it is possible that the values vary among the servers. A typical example is in_hot_standby , which is introduced in PostgreSQL 14. The value for the variable is off on primary server and on on standby servers. Problem is, Pgpool-II has to return client only one of them. In this case it chooses the value reported by the primary server. So PQParameterStatus will return off . On the other hand, when the client issues show in_hot_standby , the returned value can either on or off depending on which is the load balance node for the session.

Note that if the values differ among servers, Pgpool-II will emit a log message except in_hot_standby . This is to prevent the log file from being flooded since in_hot_standby always differs.

set_config

PostgreSQL has set_config function which allows to change parameter values within current session like SET command (actually set_config has more feature than SET. See PostgreSQL manual for more details). When Pgpool-II is operated with clustering mode being set to streaming_replication , it sends the function only to the primary server. As the function is not sent to the standby servers, the parameter values are different among each servers. To avoid the problem, you can use SET command instead of set_config . Since SET command is sent to all servers used for this session, the issue will not happen. However, if you use more than 2 PostgreSQL servers, you need to disable statement_level_load_balance and use SET command. This is because, if statement_level_load_balance enabled, queries might be sent to the third server in addition to the primary server and the server which is assigned to the load balance node.

If you need to use set_config , turn off load balancing for the session (not only for set_config , load balancing should be disabled in the whole session). You can avoid the issue by sacrificing performance.