Tuesday, August 16, 2016

Importing PostgreSQL 9.6's SQL parser

In almost every releases of Pgpool-II, we import the latest version of PostgreSQL's SQL parser (more precisely, the raw parser). This time, our new Pgpool-II developer faced with an interesting problem while importing PostgreSQL 9.6's parser.

In PostgreSQL, the SQL parser is written in bion, a general-purpose parser generator. The particular file including the SQL grammar rules is named "gram.y". gram.y used to include "scan.l", which is a lexical scanner written in flex.

 In reality, gram.y is translated into a C source file by bison, then compiled. Same thing can be said to scan.l, which is translated by flex though.

So the main part of SQL parser  source file was single big file consisted of gram.y and scan.l.

From PostgreSQL 9.6, however, PostgreSQL developers decided to keep gram.y and flex.l separated.

Build backend/parser/scan.l and interfaces/ecpg/preproc/pgc.l standalone.

This gives enough confusion to the developer in charge of the work and took some time before realize the change. I would say it's a fun part of the task when we work on an OSS project:-)  However I cannot stop saying that it would be nice if the SQL parser is exported as a separate library so that we do not need this kind of work in every releases of Pgpool-II.

Monday, July 25, 2016

New Pgpool-II mascot logo!

We now have new  Pgpool-II mascot logo!

This image was created by a professional designer and donated by SRA OSS, Inc. Japan.

You can find a few variations of the image at the Pgpool-II's official site.
The images are licensed under CCL. So you are allowed to have those images attached to your site if you like. Even you could customize or arrange the images!


Sunday, July 24, 2016

Avoiding session disconnection while fail over

Your client session to Pgpool-II will be disconnected once fail over or switch over happens. Pretty annoying. This is because Pgpool-II kills all child process that are responsible for each client session. Pgpool-II 3.6 will mitigate this under certain conditions:

  1. Pgpool-II operates in streaming replication mode
  2. The failed DB node is not the primary (master) node
  3. Your "load balance node" is not the failed node
1 & 2 are easy to understand. What about #3?

The load balance node is chosen when you connect to Pgpool-II. Pgpool-II assigns one of the DB nodes to send read only  queries to. It is decided according some of pgpool.conf settings:

  •  load_balance_mode (of course this should be "on")
  • "weight" parameter
  • database_redirect_preference_list
  • app_name_redirect_preference_list

The decision which DB node to choose is done at the early stage of session connection and the assignment will not be changed until you exit the session. From Pgpool-II 3.6, you can check your load balance node by using "show pool_nodes" command.

$  psql -p 11000 test
test=# show pool_nodes;
  node_id | hostname | port  | status | lb_weight |  role   | select_cnt | load_balance_node
 0       | /tmp     | 11002 | 2      | 0.333333  | primary | 0          | false
 1       | /tmp     | 11003 | 2      | 0.333333  | standby | 0          | false
 2       | /tmp     | 11004 | 2      | 0.333333  | standby | 0          | true
(3 rows)

Here "load_balance_node" is the DB node chosen for the "load balance node".

If other than node 2 is going down and the node is not primary, this session will not be disconnected. In this case the session will not be disconnected if node 1 goes down. Let's try that using another terminal:

$ pg_ctl -D data1 -m f stop
waiting for server to shut down.... done
server stopped

Ok, let's input something in the previous psql session:

test=# show pool_nodes;
 node_id | hostname | port  | status | lb_weight |  role   | select_cnt | load_balance_node
 0       | /tmp     | 11002 | 2      | 0.333333  | primary | 0          | false
 1       | /tmp     | 11003 | 3      | 0.333333  | standby | 0          | false
 2       | /tmp     | 11004 | 2      | 0.333333  | standby | 0          | true
(3 rows)

As you can see, the session was not disconnected and you see the "status" column of node 1 is now changed to "3", which means the node is in down status.

Now let's suppose you want to maintain one of DB nodes. In this case, you could apply following procedure:

  1.  Edit pgpool.conf to change "backend_weight1" parameter to 0. This will prevent new sessions to choose the node 1 as the load balance node.
  2.  Wait until all users who are using node 1 as the load balance node exit session

Monday, June 20, 2016

Avoiding connection failure while health checking

Hi pgpool-II users,

I've managed to come back to this blog after 3 years.

Last week pgpool-II developer team released minor versions from 3.1 to 3.5. This release includes a special gift for users: enhancement for health checking.

You might notice an annoying behavior of pgpool-II.

Example: suppose we have three PostgreSQL backends managed by pgpool-II. pgpool-II occasionally checks the health of each backend if "health check" is enabled. If backend #2 goes down, a fail over is triggered and after that, users can use the DB server cluster without backend #2. This is great. However, if you set up the retrying of health checking, clients cannot connect to pgpool-II while it is retrying the health check. For instance,

health_check_max_retries = 10
health_check_retry_delay = 6

will continue the health check retry for 10*6 = 60 seconds at least. This is very annoying for users because they cannot initiate new connections to pgpool-II for 1 minute. Making these parameters shorter might mitigate the situation a little bit but this may not be useful if the network is not stable and longer retries are desirable.

These new releases significantly enhance the situation. By setting:

fail_over_on_backend_error = off

when a user connects to pgpool-II while it is health checking, it starts to connect to all backends including #2. Before this release, pgpool-II gave up initiating session if one of backend is not available (in this case #2). With this release, pgpool-II skips the broken backend and continues to connect to the rest of the backends. Please note, however, this feature is only available when all of the conditions below are met:

  • streaming replication mode
  • the broken backend is not primary server
  • fail_over_on_backend_error  is off

This enhancement is  available in all of the new releases: 3.5.3, 3.4.7, 3.3.11, 3.2.16, 3.1.19.