Tuesday, December 4, 2018

log_client_messages in Pgpool-II 4.0

Pgpool-II 4.0 adds a new logging feature called "log_client_messages". This allows logging of messages coming from the frontend. Up to 3.7, the only way to log frontend messages was to enable the debugging log, which produced a tremendous amount of output.
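The feature can be switched on in pgpool.conf (a minimal configuration sketch; the parameter is off by default):

```
log_client_messages = on
```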

For example, with log_client_messages enabled, "pgbench -S -M prepared -t 2" produces the frontend logs below:

2018-12-04 16:43:45: pid 6522: LOG:  Parse message from frontend.
2018-12-04 16:43:45: pid 6522: DETAIL:  statement: "P0_1", query: "SELECT abalance FROM pgbench_accounts WHERE aid = $1;"
2018-12-04 16:43:45: pid 6522: LOG:  Sync message from frontend.
2018-12-04 16:43:45: pid 6522: LOG:  Bind message from frontend.
2018-12-04 16:43:45: pid 6522: DETAIL:  portal: "", statement: "P0_1"
2018-12-04 16:43:45: pid 6522: LOG:  Describe message from frontend.
2018-12-04 16:43:45: pid 6522: DETAIL:  portal: ""
2018-12-04 16:43:45: pid 6522: LOG:  Execute message from frontend.
2018-12-04 16:43:45: pid 6522: DETAIL:  portal: ""
2018-12-04 16:43:45: pid 6522: LOG:  Sync message from frontend.


As you can see, pgbench sends the query "SELECT abalance FROM pgbench_accounts WHERE aid = $1;" using the prepared statement "P0_1", then a Bind message to bind the parameter value to "$1".
It then sends a Describe message to obtain metadata, and finally an Execute message to run the query.

Below is the second execution of the query (remember that we added the "-t 2" option to execute two transactions).

2018-12-04 16:43:45: pid 6522: LOG:  Bind message from frontend.
2018-12-04 16:43:45: pid 6522: DETAIL:  portal: "", statement: "P0_1"
2018-12-04 16:43:45: pid 6522: LOG:  Describe message from frontend.
2018-12-04 16:43:45: pid 6522: DETAIL:  portal: ""
2018-12-04 16:43:45: pid 6522: LOG:  Execute message from frontend.
2018-12-04 16:43:45: pid 6522: DETAIL:  portal: ""
2018-12-04 16:43:45: pid 6522: LOG:  Sync message from frontend.
2018-12-04 16:43:45: pid 6522: LOG:  Terminate message from frontend.


This time no Parse message is sent because pgbench reuses the named statement "P0_1", which eliminates the parse/analysis step. As a result, pgbench runs very fast in this mode compared with the other modes.

In summary, log_client_messages is useful when you want to extract detailed information about what a client is doing.

Friday, October 19, 2018

Pgpool-II 4.0 released! (SCRAM authentication)

After a year of hard work by the Pgpool-II developers, Pgpool-II 4.0 is finally out.
This version celebrates the 15th anniversary of Pgpool-II and introduces rich new features. I will explain them one by one in a series of blog posts.

Support for SCRAM authentication

SCRAM authentication is a modern, secure authentication method supported by PostgreSQL 10 or later. Pgpool-II 4.0 supports the method by providing multiple new functionalities.

  • pool_hba.conf now accepts "scram-sha-256" as an authentication method.
  • If this method is specified, users connecting to Pgpool-II are required to respond properly to the SCRAM authentication challenge.
  • Passwords stored in pool_passwd can be used to verify a user's password before a session starts.
  • Passwords can be stored in pool_passwd in multiple formats: clear text or AES256-encrypted (md5-hashed passwords can also be stored in pool_passwd, but in that case SCRAM cannot use them).
  • AES256 encryption is preferred. The decryption key is stored in .pgpoolkey under the home directory of the user who started Pgpool-II, or in a file specified by the PGPOOLKEYFILE environment variable.
  • For AES256 encryption, the new pg_enc tool can be used.
Of course, Pgpool-II can authenticate itself using SCRAM while connecting to PostgreSQL, if PostgreSQL requires SCRAM authentication from Pgpool-II.
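Putting the pieces together, a minimal configuration sketch might look like the following (the host range, user name, and file paths are illustrative; please check the pg_enc manual page for the exact options):

```
# pool_hba.conf: require SCRAM from clients connecting to Pgpool-II
host    all    all    0.0.0.0/0    scram-sha-256

# Shell: register an AES256-encrypted password for user "alice" in
# pool_passwd, using the key stored in ~/.pgpoolkey ("-m" updates
# pool_passwd, "-p" prompts for the password)
pg_enc -m -k ~/.pgpoolkey -u alice -p
```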

A nice thing about Pgpool-II 4.0 is that it allows users to choose different authentication methods between clients and Pgpool-II versus between Pgpool-II and PostgreSQL: for example, SCRAM between clients and Pgpool-II, while using md5 between Pgpool-II and PostgreSQL.

SCRAM authentication can also be used for health checks and replication delay checks. Before 4.0, the user name and password for these were stored in clear text in pgpool.conf, which is not the most secure approach. Pgpool-II 4.0 allows storing AES256-encrypted passwords in pgpool.conf; .pgpoolkey is also used to decrypt these passwords.
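For example, a pgpool.conf sketch with AES256-encrypted check passwords (the encrypted strings below are illustrative placeholders of the kind pg_enc produces, not real values):

```
# pgpool.conf sketch: encrypted credentials for health check and
# replication delay check
health_check_user = 'pgpool'
health_check_password = 'AES6Slz+hoMYIAM8/h9IMSl2g=='
sr_check_user = 'pgpool'
sr_check_password = 'AES6Slz+hoMYIAM8/h9IMSl2g=='
```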

Next time I will explain other authentication-related features of Pgpool-II 4.0.

Friday, June 15, 2018

Even more fine load balancing control with Pgpool-II 4.0

A while back, I posted an article titled "More load balancing fine control" to explain a new feature of the upcoming Pgpool-II 4.0. Today I would like to talk about yet another new load balancing feature in Pgpool-II.

Pgpool-II can already control read query load balancing at several levels of granularity (see the documentation for more details):
  • Whether to enable load balancing at all (load_balance_mode)
  • By database (database_redirect_preference_list)
  • By application name (app_name_redirect_preference_list)
  • By functions used in the query (white_function_list, black_function_list)
  • At the statement level (the /*NO LOAD BALANCE*/ comment)
The last method allows statement-level control of load balancing, but it requires rewriting queries, which is often impossible if you are using commercial software.

A new configuration parameter called "black_query_pattern_list" allows you to disable load balancing for queries specified by the parameter. The parameter takes a regular expression string. If a query matches the expression, load balancing is disabled for that query: it is sent to the primary (master) PostgreSQL server.

Here are some examples.

Suppose we have:
black_query_pattern_list = 'SELECT \* FROM t1\;;'
in pgpool.conf. Note that some special characters must be escaped with '\' (backslash) characters.

No SELECTs have been issued to either the primary or the standby yet.

test=# show pool_nodes;
 node_id | hostname | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | last_status_change 
---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------+---------------------
 0       | /tmp     | 11002 | up     | 0.500000  | primary | 0          | false             | 0                 | 2018-06-15 11:57:25
 1       | /tmp     | 11003 | up     | 0.500000  | standby | 0          | true              | 0                 | 2018-06-15 11:57:25
(2 rows)


If the following query is issued, the "select_cnt" column of the standby should normally be incremented, since the standby is the load balance node (notice the "role" column).

test=# SELECT * FROM t1;

But because of black_query_pattern_list, the SELECT is redirected to the primary.

test=# show pool_nodes;
 node_id | hostname | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | last_status_change 
---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------+---------------------
 0       | /tmp     | 11002 | up     | 0.500000  | primary | 1          | false             | 0                 | 2018-06-15 11:57:25
 1       | /tmp     | 11003 | up     | 0.500000  | standby | 0          | true              | 0                 | 2018-06-15 11:57:25
(2 rows)



However, "SELECT * FROM t1 WHERE i = 1" will be sent to the standby, since the expression specified in black_query_pattern_list does not match the query.

test=# select * from t1 where i = 1;
 i
---
(0 rows)


test=# show pool_nodes;
 node_id | hostname | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | last_status_change 
---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------+---------------------
 0       | /tmp     | 11002 | up     | 0.500000  | primary | 1          | false             | 0                 | 2018-06-15 11:57:25
 1       | /tmp     | 11003 | up     | 0.500000  | standby | 1          | true              | 0                 | 2018-06-15 11:57:25
(2 rows)


If you want to match any query that uses table t2, you can specify something like:

  .* t2;;.* t2 .*;

The first part ".* t2;" matches any query without any qualification (for example, "SELECT * FROM t2"), while the latter matches any query having a space after t2 (for example, "SELECT * FROM t2 WHERE i = 1").

Please note that Pgpool-II does not perform any syntax analysis on the query here, so be careful when specifying regular expressions. For instance, "SELECT * FROM t1 WHERE EXISTS (SELECT * FROM t2)" will not match either of the regular expression patterns above, so the query will be sent to the standby. This may or may not match your expectations.
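If you want to sanity-check a pattern before putting it into pgpool.conf, a rough approximation of the matching can be sketched in Python. This is not Pgpool-II's actual code (Pgpool-II uses POSIX regular expressions in C), and the helper below simply assumes the query text is compared with a trailing semicolon:

```python
import re

def matches_pattern(pattern: str, query: str) -> bool:
    # Assumption: the query is compared in semicolon-terminated form,
    # consistent with the examples in the text above.
    q = query.rstrip()
    if not q.endswith(";"):
        q += ";"
    return re.fullmatch(pattern, q) is not None

# The two patterns from the ".* t2;;.* t2 .*;" example, split on ';;'
PATTERNS = [r".* t2;", r".* t2 .*;"]

def blacklisted(query: str) -> bool:
    """True if any pattern matches, i.e. the query would go to the primary."""
    return any(matches_pattern(p, query) for p in PATTERNS)

print(blacklisted("SELECT * FROM t2"))              # True (first pattern)
print(blacklisted("SELECT * FROM t2 WHERE i = 1"))  # True (second pattern)
# The subquery case: " t2" is followed by ")", so neither pattern matches
print(blacklisted("SELECT * FROM t1 WHERE EXISTS (SELECT * FROM t2)"))  # False
```

This quickly shows why the EXISTS query above slips past both patterns: the token after "t2" is a closing parenthesis, not a semicolon or a space.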

By the way, you may have already noticed the extra column "last_status_change" in the "show pool_nodes" output. It indicates the timestamp when "status" or "role" last changed, which is useful when you want to find the log lines from the last failover.

Friday, May 18, 2018

statement_timeout changed in PostgreSQL 11

PostgreSQL allows you to set a time limit for each SQL statement. For example,

SET statement_timeout to '4s';

will set the upper limit of execution time for each subsequent query to 4 seconds. So,

SELECT pg_sleep(5);

will be canceled after 4 seconds:

ERROR:  canceling statement due to statement timeout

Simple and pretty easy to understand.

But when it is applied to extended queries, things are not so simple any more. Each extended query is divided into multiple phases:

  1. Parse: parse the SQL statement
  2. Bind: bind parameters to the parsed SQL statement
  3. Execute: run the query
  4. Sync: ask PostgreSQL to return the query results
So when does the statement timeout fire? The answer is 4, not 3. Until a Sync message is received, the statement timeout is not checked for extended queries.

OK, consider the next example. We assume that statement_timeout has been set to 4 seconds.
  1. Parse: SELECT pg_sleep(2) (query A)
  2. Bind: bind parameters to the parsed SQL statement A
  3. Execute: run query A
  4. Parse: SELECT pg_sleep(3) (query B)
  5. Bind B
  6. Execute B
  7. Sync
This time, the statement timeout fires even though queries A and B each finish within 4 seconds, because their execution times accumulate until the Sync message. This is not very intuitive behavior, I think. Moreover, if duration logging is enabled, the reported times for queries A and B will of course be 2 seconds and 3 seconds. So users will be confused: despite the fact that each query definitely finishes within 4 seconds, the statement timer fires.

Starting with PostgreSQL 11, this behavior changes. The statement timeout is checked when each Execute message is processed (steps 3 and 6 above), so the statement timer will not fire in the example.
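The difference between the two checking policies can be sketched with a toy model. This is illustrative Python, not PostgreSQL code, and it assumes only the Execute phases consume significant time:

```python
# Toy model: when does statement_timeout fire for a pipelined
# extended-query sequence? Durations are in seconds.
STATEMENT_TIMEOUT = 4

# Execute phases for query A (2s) and query B (3s), followed by one Sync.
execute_durations = [2, 3]

def fires_pre_11(durations, timeout):
    # Before PostgreSQL 11: the timer keeps running until Sync,
    # so all Execute phases accumulate against a single timeout.
    return sum(durations) > timeout

def fires_11(durations, timeout):
    # PostgreSQL 11: the timer is checked per Execute message,
    # so each query is measured on its own.
    return any(d > timeout for d in durations)

print(fires_pre_11(execute_durations, STATEMENT_TIMEOUT))  # True: 2+3 > 4
print(fires_11(execute_durations, STATEMENT_TIMEOUT))      # False: both < 4
```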

In summary, statement_timeout in PostgreSQL 11 will behave more intuitively than in previous releases.

Tuesday, April 24, 2018

More load balancing fine control

Pgpool-II is known for providing load balancing of read queries across multiple PostgreSQL servers.

Consider a cluster consisting of Pgpool-II and two PostgreSQL servers configured with streaming replication: one is the primary server, the other is a standby. Also, the load balancing weight is 0:1, which means read queries will basically be sent to the standby server.

If a read query is sent from a client inside an explicit transaction, Pgpool-II sends it to the standby server (as long as the replication delay is not too large). Next, a write query arrives; of course it is sent to the primary server. An interesting thing happens if another read query is then sent: unlike the previous read query, it is sent to the primary, to avoid reading data that has not yet been replicated.


Pgpool-II 4.0 will have a new configuration parameter called "disable_load_balance_on_write" to give finer control over the behavior above.

If disable_load_balance_on_write = 'transaction', the behavior is exactly the same as above. This is the default.

If disable_load_balance_on_write = 'off', Pgpool-II no longer takes the write query into account and feels free to load balance read queries as much as possible. This choice is good if you want the maximum benefit of load balancing.

A variant of 'transaction' is 'trans_transaction'. Unlike 'transaction', the effect of disabling load balancing after a write query continues even after the transaction closes and the next transaction begins.

So 'trans_transaction' is good for those who value data consistency over load balancing. Please note that a read query issued between explicit transactions is not affected by the parameter.

Finally there's 'always'. In this mode, once a write query is sent, any subsequent read query, whether inside an explicit transaction or not, is no longer load balanced and is sent to the primary server until the session ends.

In summary, the degree of data consistency protection is, in decreasing order: always, trans_transaction, transaction, off. The degree of performance is the reverse: off, transaction, trans_transaction, always. If you are not sure which is the best choice for you, I recommend starting with 'transaction', because it strikes a good balance between data consistency and performance.
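The four modes can be sketched with a toy routing model. This is illustrative Python, not Pgpool-II's implementation; it assumes reads outside any write-affected scope go to the standby:

```python
# Toy model of where disable_load_balance_on_write sends read queries
# within one session. events: 'begin' | 'commit' | 'read' | 'write'.
def route_reads(mode, events):
    """Return the target ('standby' or 'primary') for each 'read' event."""
    targets = []
    in_tx = False
    wrote_in_tx = False        # write seen in the current transaction
    wrote_in_some_tx = False   # write seen in any transaction this session
    wrote_ever = False         # any write at all this session
    for ev in events:
        if ev == 'begin':
            in_tx, wrote_in_tx = True, False
        elif ev == 'commit':
            in_tx = False
        elif ev == 'write':
            wrote_ever = True
            if in_tx:
                wrote_in_tx = True
                wrote_in_some_tx = True
        elif ev == 'read':
            if mode == 'off':
                primary = False
            elif mode == 'transaction':
                primary = in_tx and wrote_in_tx
            elif mode == 'trans_transaction':
                primary = in_tx and (wrote_in_tx or wrote_in_some_tx)
            else:  # 'always'
                primary = wrote_ever
            targets.append('primary' if primary else 'standby')
    return targets

# read, write, read in a transaction; a read between transactions;
# then a read-only transaction
session = ['begin', 'read', 'write', 'read', 'commit',
           'read',
           'begin', 'read', 'commit']
print(route_reads('transaction', session))
print(route_reads('trans_transaction', session))
print(route_reads('always', session))
```

Running the model shows the ordering described above: 'transaction' pins only the rest of the writing transaction to the primary, 'trans_transaction' also pins later transactions (but not the read between them), and 'always' pins every read after the write.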

Wednesday, April 18, 2018

Detecting "false" primary server of PostgreSQL

One of the interesting features of the upcoming Pgpool-II 4.0 is the detection of a "false" primary server in a streaming replication environment.

Suppose we have one primary server and two standby servers connecting to the primary server.

test=# show pool_nodes;
 node_id | hostname | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay
---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------
 0       | /tmp     | 11002 | up     | 0.000000  | primary | 1          | false             | 0
 1       | /tmp     | 11003 | up     | 0.000000  | standby | 0          | false             | 0
 2       | /tmp     | 11004 | up     | 1.000000  | standby | 0          | true              | 0
(3 rows)


What will happen if the node 2 standby server is promoted to primary?

test=# show pool_nodes;
 node_id | hostname | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay
---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------
 0       | /tmp     | 11002 | up     | 0.000000  | primary | 1          | false             | 0
 1       | /tmp     | 11003 | up     | 0.000000  | standby | 0          | false             | 0
 2       | /tmp     | 11004 | up     | 1.000000  | standby | 0          | true              | 0
(3 rows)


As you can see, nothing has changed as far as the show pool_nodes command is concerned.
But node 2 is actually no longer a standby connected to the primary. So if large updates are sent to the primary, node 2 will fall far behind the primary server, since data is no longer replicated to it.

t-ishii@localhost: pgbench -i -p 11000 test
NOTICE:  table "pgbench_history" does not exist, skipping
NOTICE:  table "pgbench_tellers" does not exist, skipping
NOTICE:  table "pgbench_accounts" does not exist, skipping
NOTICE:  table "pgbench_branches" does not exist, skipping
creating tables...
100000 of 100000 tuples (100%) done (elapsed 0.19 s, remaining 0.00 s)
vacuum...
set primary keys...
done.


 test=# show pool_nodes;
 node_id | hostname | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay
---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------
 0       | /tmp     | 11002 | up     | 0.000000  | primary | 1          | false             | 0
 1       | /tmp     | 11003 | up     | 0.000000  | standby | 0          | false             | 0
 2       | /tmp     | 11004 | up     | 1.000000  | standby | 0          | true              | 13100296
(3 rows)


How can we detect the situation and fix it?

Pgpool-II 4.0 will help you. It has a new parameter called "detach_false_primary". If the parameter is enabled, Pgpool-II automatically detects the situation and detaches node 2, because it is a "false" primary.

 test=# show pool_nodes;
 node_id | hostname | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay
---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------
 0       | /tmp     | 11002 | up     | 0.000000  | primary | 1          | false             | 0
 1       | /tmp     | 11003 | up     | 0.000000  | standby | 0          | false             | 0
 2       | /tmp     | 11004 | down   | 1.000000  | standby | 0          | true              | 13100296
(3 rows)


Note that the feature has a limitation. For example, suppose there are four nodes: primary 1 connects to standby 1, while primary 2 connects to standby 2. In this case there is no reliable way to decide which node is the "true" primary, so false primary detection does not work.

The figure below explains what Pgpool-II 4.0 can and cannot do.
You can read more details about the feature in the manual.
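To try the feature, a pgpool.conf sketch might look like the following (detach_false_primary relies on the streaming replication check, so the sr_check settings must also be configured; the user name and period here are illustrative):

```
detach_false_primary = on
sr_check_period = 10
sr_check_user = 'pgpool'
```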


Monday, April 16, 2018

Celebrating 15th anniversary of Pgpool and next version number

This coming July, it will be 15 years since Pgpool was born. At that time Pgpool was just a simple connection pooling program: no replication, no load balancing. Many contributors have been working hard on enhancing Pgpool since then.

The Pgpool-II developers decided last week that the upcoming major version will be 4.0, rather than 3.8, to celebrate the anniversary. Of course that's not the only reason for the version number: as usual, Pgpool-II 4.0 will have exciting new features, including an enhanced authentication mechanism and more. Stay tuned.

Friday, February 16, 2018

Let's start Pgpool-II 3.8 development!

We have started the Pgpool-II 3.8 development cycle.

https://git.postgresql.org/gitweb/?p=pgpool2.git;a=commit;h=fb831287a2ed40505cb9a37136d1f0d0c5d5b22f

New items for 3.8 are summarized on the Pgpool-II development wiki page.

On the wiki page, pinkish items are candidate new features for 3.8. As usual, discussions take place on the pgpool-hackers mailing list.

There are some interesting proposals, including SCRAM authentication in Pgpool-II. So please join the discussion!
