Saturday, September 21, 2013

Health check parameters

Recently I received some questions about pgpool-II's health check parameters. In this article I will try to explain them.

"Health check" is a term used in pgpool-II. Pgpool-II periodically checks whether PostgreSQL is alive by connecting to it, and we call this the "health check".
There are four parameters to control the behavior of the health check.

health_check_period

This parameter defines the interval between health checks, in seconds. If set to 0, the health check is disabled. The default is 0.

health_check_timeout

This parameter sets how long, in seconds, pgpool-II waits before giving up on a connection attempt to PostgreSQL. The default is 20. Pgpool-II uses socket system calls such as connect(), read(), write() and close(). These calls can hang if the network connection between pgpool-II and PostgreSQL is broken, and the hang can last until the TCP stack in the kernel gives up, which could be as long as two hours in most operating systems. Obviously this is not good. The solution is to set a timeout before calling those system calls: health_check_timeout. Please note that health_check_timeout must be sufficiently shorter than health_check_period. For example, if health_check_timeout is 20, health_check_period should be 30 or more.
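To illustrate why a timeout matters, here is a minimal Python sketch of a connection attempt that gives up after a fixed number of seconds. It is not how pgpool-II itself is implemented (pgpool-II is written in C and talks the PostgreSQL protocol); the function name backend_is_alive and its arguments are made up for this example, and it only tests whether a TCP connection can be opened.

```python
import socket


def backend_is_alive(host, port, timeout_sec):
    """Return True if a TCP connection to (host, port) opens within timeout_sec.

    Hypothetical helper: it only checks TCP reachability, which is a much
    weaker test than a real PostgreSQL login.
    """
    try:
        # The timeout bounds the connect attempt, so a broken network
        # cannot make this call hang until the kernel gives up.
        with socket.create_connection((host, port), timeout=timeout_sec):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
```

Calling backend_is_alive("127.0.0.1", 5432, 20) would wait at most about 20 seconds, which plays the role of health_check_timeout in this sketch.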

health_check_max_retries

health_check_retry_delay

Sometimes network connections can be temporarily unstable for various reasons. If health_check_max_retries is greater than 0, pgpool-II repeats the health check up to health_check_max_retries times or until the health check succeeds. The interval between retries is defined by health_check_retry_delay. The default for health_check_max_retries is 0, which disables the retry. The default for health_check_retry_delay is 1 (second).
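The retry behavior can be sketched in a few lines of Python. This is only an illustration, not pgpool-II's actual code. The function name, the check callback and the sleep argument are hypothetical, and the sketch assumes one initial attempt followed by up to max_retries retries.

```python
import time


def health_check_with_retries(check, max_retries, retry_delay, sleep=time.sleep):
    """Run check() until it succeeds or the retries are used up.

    check       -- callable returning True if the backend looks alive
    max_retries -- number of retries after the first failed attempt
    retry_delay -- seconds to wait between attempts
    Returns (alive, attempts_made).
    """
    attempts = 0
    while True:
        attempts += 1
        if check():
            return True, attempts
        if attempts > max_retries:
            # The first attempt and all retries have failed.
            return False, attempts
        sleep(retry_delay)
```

With max_retries set to 0 the loop runs check() exactly once, which matches the "retry disabled" default described above.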

Please note that "health_check_max_retries * (health_check_timeout+health_check_retry_delay)" should be smaller than health_check_period.

The following settings satisfy the formula.

health_check_period = 40
health_check_timeout = 10
health_check_max_retries = 3
health_check_retry_delay = 1
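For these values the arithmetic is 3 * (10 + 1) = 33, which is smaller than 40. A small Python sketch of the same check follows; the function names are invented for this example and are not part of pgpool-II.

```python
def worst_case_retry_time(timeout, max_retries, retry_delay):
    """Seconds spent if every retry runs into the timeout, per the formula above."""
    return max_retries * (timeout + retry_delay)


def settings_are_consistent(period, timeout, max_retries, retry_delay):
    """True if the retry time is smaller than health_check_period."""
    return worst_case_retry_time(timeout, max_retries, retry_delay) < period


# Values taken from the settings above.
print(worst_case_retry_time(10, 3, 1))        # 33
print(settings_are_consistent(40, 10, 3, 1))  # True
```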

Please refer to the pgpool-II documentation for more details.
http://www.pgpool.net/mediawiki/index.php/Documentation#Official_documentation

Pgpool-II new minor versions are out

pgpool-II new minor versions are out. Pgpool-II is a connection pooling/clustering tool for PostgreSQL ONLY. This time we released:
  • 3.3.1 (the latest stable version)
  • 3.2.6
  • 3.1.9
  • 3.0.13
These versions fix many bugs in previous releases of pgpool-II. Please visit http://www.pgpool.net/mediawiki/index.php/Main_Page#pgpool-II_3.3.1.2C_3.2.6.2C_3.1.9.2C_3.0.13_officially_released_.282013.2F09.2F06.29 to find more info.

Pgpool-II 3.3 released

Finally we released a new major version: pgpool-II 3.3.

This version focuses on enhancing the "watchdog" module of pgpool-II, which is in charge of avoiding the single point of failure caused by pgpool-II itself. Because pgpool-II is a proxy program for PostgreSQL, if pgpool-II dies the entire database system immediately goes down. Traditionally we dealt with the problem by using two pgpool-II instances and "pgpool-HA", a heartbeat script. Watchdog is a replacement for pgpool-HA. Users do not need to install pgpool-HA anymore. Just install two (or more) pgpool-II instances and turn on watchdog. Watchdog first appeared in pgpool-II 3.2 and is greatly enhanced in 3.3.

Enhancements for watchdog include:
  • New monitoring method of watchdog lifecheck using heartbeat signal
  • Interlocking of failover/failback script
  • Secure watchdog communication
  • PCP command for retrieving the watchdog status
Other enhancements in 3.3 include:
  • Import of the PostgreSQL 9.2 raw parser
  • New pgpool_setup tool
  • Support for using CREATE EXTENSION to install pgpool specific extensions
  • Regression test suite
  • New simple installer
Please visit  http://www.pgpool.net/mediawiki/index.php/Main_Page#pgpool-II_3.3_and_pgpoolAdmin_3.3_officially_released_.282013.2F07 to find more info.

Dynamic spare process management in Pgpool-II

Pre-fork architecture in Pgpool-II Pgpool-II uses fixed number of pre-forked child process which is responsible for accepting and handling e...