This time I would like to introduce "statement level load balancing" feature of 4.1.
Pgpool-II can distribute read queries among PostgreSQL backend nodes. This allows to design a scale out cluster using PostgreSQL. The particular database node used for distributing read query is determined at the session level: when a client connects to Pgpool-II. This is so called "session level load balancing".
For example, if a client connects to Pgpool-II and the load balance node is node 1 (we assume that this is a streaming replication standby node), then any read query will be sent to the load balance node (in this case node1, the standby node). Different session may choose master (primary) as the load balance node. The distribution ratio of the load balancing node selection in different sessions is determined by "backend weight" parameter in the Pgpool-II configuration file (usually named "pgpool.conf"), typically "backend_weight0" or "backend_weight1", corresponding to node 0 and node 1 respectively.
This is good as long as clients connects to Pgpool-II, issue some queries, and disconnect, since next time a client connects to Pgpool-II, different load balance node may be chosen according to the backend weight parameters.
However, if your client already has a connection pooling feature, this way (session level load balancing) might be a problem, since the selection of load balance node is performed only once when the connection pooling from client to Pgpool-II is created.
The statement level load balancing feature is created to solve the problem. Unlike the session level load balancing, the load balancing node is determined when a new query is issued. The new parameter for this is "statement_level_load_balance". If this is set to on, the feature is enabled (the parameter can be changed by reloading the pgpool.conf).
At first "select_cnt" is 0, which means no SELECTs were issued.
test=# show pool_nodes;
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | last_status_change
---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------+---------------------
0 | /tmp | 11002 | up | 0.500000 | primary | 0 | true | 0 | 2019-04-02 15:36:58
1 | /tmp | 11003 | up | 0.500000 | standby | 0 | false | 0 | 2019-04-02 15:36:58
(2 rows)
Let's issue a SELECT.
test=# select 1;
?column?
----------
1
(1 row)
test=# show pool_nodes;
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | last_status_change
---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------+---------------------
0 | /tmp | 11002 | up | 0.500000 | primary | 1 | true | 0 | 2019-04-02 15:36:58
1 | /tmp | 11003 | up | 0.500000 | standby | 0 | false | 0 | 2019-04-02 15:36:58
(2 rows)
Now the select_cnt of node 0 is 1, which means the SELECT was sent to node 0. Also please note that "load_balance_node" colum of node 0 is "true", which means node 0 is chosen as the load balance node in the last query.
Ok, let's issue another SELECT:
test=# select 2;
?column?
----------
2
(1 row)
test=# show pool_nodes;
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | last_status_change
---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------+---------------------
0 | /tmp | 11002 | up | 0.500000 | primary | 1 | false | 0 | 2019-04-02 15:36:58
1 | /tmp | 11003 | up | 0.500000 | standby | 1 | true | 0 | 2019-04-02 15:36:58
(2 rows)
Now the load_balance_node is changed to node 1, and the select_cnt of node 1 becomes 1. This is how the statement level load balancing works.
This comment has been removed by a blog administrator.
ReplyDeleteThis comment has been removed by a blog administrator.
DeleteThis comment has been removed by a blog administrator.
ReplyDeleteThis comment has been removed by a blog administrator.
ReplyDeleteThis comment has been removed by a blog administrator.
ReplyDeleteThis comment has been removed by a blog administrator.
ReplyDeleteThis comment has been removed by a blog administrator.
ReplyDeleteThis comment has been removed by a blog administrator.
ReplyDeleteThis comment has been removed by a blog administrator.
ReplyDeleteThis comment has been removed by a blog administrator.
ReplyDeleteHi,
ReplyDeleteIf Client calls one PostgreSQL Function which contains both select and update statements, In this case Select will redirect to Read-only Database node and update statements will redirect to Read Write Database node? Kindly help on this as my client modules will call many functions which contains select/update/delete statements.
If a function includes any writes (update, delete or insert), this should be marked as "volatile" (https://www.postgresql.org/docs/13/xfunc-volatility.html).
ReplyDeleteIf pgpool finds a volatile function in an SQL statement, pgpool will redirect the whole SQL statement to primary node.
i.e. pgpool will not redirect each statement in a function to both primary and standby.
Thank you very much for your support. It helps me to take decision on ReadWrite/ReadOnly DB's load balancing.
Delete