Friday, June 15, 2018

Even more fine load balancing control with Pgpool-II 4.0

While back go, I posted an article titled "More load balancing fine control" to explain a new feature of upcoming Pgpool-II 4.0. Today I would like talk about yet another new feature of load balancing in Pgpool-II.

Pgpool-II is already able to control read query load balancing in several granularity:
(See the document for more details).
  •  Whether enable load balancing or not (load_balance_mode)
  • By database (database_redirect_preference_list)
  • By application name (app_name_redirect_preference_list)
  • By functions used in the query (white_function_list, black_function_list)
  • Statement level (/*NO LOAD BALANCE*/ comment)
 The last method allows the statement level control on load balancing but it needs to rewrite a query which is often impossible if you are using commercial software.

A new configuration parameter called "black_query_pattern_list" allows you to disable load balancing for queries specified by the parameter. The parameter takes a regular expression string. If a query matched with the expression, load balancing is disabled for the query: the query is sent to the primary (master) PostgreSQL server.

Here are some examples.

We have:
black_query_pattern_list = 'SELECT \* FROM t1\;;'
in pgpool.conf. Note that some special characters must be qualified by using '\' (back slash) characters.

We have no SELECTs issued to both primary and standby.

test=# show pool_nodes;
 node_id | hostname | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | last_status_change 
---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------+---------------------
 0       | /tmp     | 11002 | up     | 0.500000  | primary | 0          | false             | 0                 | 2018-06-15 11:57:25
 1       | /tmp     | 11003 | up     | 0.500000  | standby | 0          | true              | 0                 | 2018-06-15 11:57:25
(2 rows)


If following query is issued, then the "select_cnt" column of standby should be incremented since standby is the load balance node (notice the "role" column").

test=# SELECT * FROM t1;

But because of the effect of  black_query_pattern_list, the SELECT is redirected to primary.

test=# show pool_nodes;
 node_id | hostname | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | last_status_change 
---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------+---------------------
 0       | /tmp     | 11002 | up     | 0.500000  | primary | 1          | false             | 0                 | 2018-06-15 11:57:25
 1       | /tmp     | 11003 | up     | 0.500000  | standby | 0          | true              | 0                 | 2018-06-15 11:57:25
(2 rows)



However, "SELECT * FROM t1 WHERE i = 0" will be sent to standby since the expression specied in black_query_pattern_list does not match with the query.

test=# select * from t1 where i = 1;
 i
---
(0 rows)


test=# show pool_nodes;
 node_id | hostname | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | last_status_change 
---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------+---------------------
 0       | /tmp     | 11002 | up     | 0.500000  | primary | 1          | false             | 0                 | 2018-06-15 11:57:25
 1       | /tmp     | 11003 | up     | 0.500000  | standby | 1          | true              | 0                 | 2018-06-15 11:57:25
(2 rows)


If you want to match with any query using table t2, you want to specify something like:

  .* t2;;.* t2 .*;

The first part ".* t2;" matches any query without any qualification (example: "SELECT * FROM t2"), while latter matches any query having space after t2 (example: "SELECT * FROM t2 WHERE i = 1).

Please note that Pgpool-II does not do any syntax analysis on the query here. Please be careful when specifying a regular expression. For instance, "SELECT * FROM t1 WHERE EXISTS (SELECT * FROM t2)" will not match with any of the regular expression pattern above. So the query will be sent to standby. This may or may not follow your expectations.

By the way, you may already noticed the extra column "last_status_change" of "show pool_nodes" command. This indicates the times stamp when any of "status" or "role" has been changed. This is useful when you want to see log lines when failover happend last time.

Dynamic spare process management in Pgpool-II

Pre-fork architecture in Pgpool-II Pgpool-II uses fixed number of pre-forked child process which is responsible for accepting and handling e...