Playing with PostgreSQL and Pgpool: December 2020

Sunday, December 27, 2020

Timeouts in Pgpool-II connections

There are several timeout parameters in Pgpool-II. In this blog entry I will explain the ones used for connection management. Below is a figure describing a few of these parameters. Pgpool-II spawns child processes at startup (the number of processes is defined by the num_init_children parameter). Each child process takes care of client connections and backend connections to the PostgreSQL database.

First of all, there are two kinds of connections: one between clients and Pgpool-II, and the other between Pgpool-II and PostgreSQL. So you should be aware of which parameter affects which connection.

child_life_time

This parameter controls the lifetime of a Pgpool-II child process. Since the connections to clients and to PostgreSQL live inside the process, all of them disappear when the process exits. A benefit of enabling the parameter is that it releases the memory used by the Pgpool-II process. For example, Pgpool-II needs to allocate memory to hold message data sent from the client, which can be large if table rows are wide. Of course, the memory used by a message is freed after the communication is done, but that does not actually return the memory from the process space to the system. So an occasional process exit and restart may be good for applications that handle large messages. Note that when a Pgpool-II process exits, the corresponding PostgreSQL backend process also exits.

The timer for this parameter starts when no client is connected to the process. On a busy system the timer may never expire, because the next client connection arrives as soon as the previous client exits. In this case you may want to use the child_max_connections parameter.

If you want to know how old a process is, you can use the pcp_proc_info command or the show pool_processes SQL command and examine the "start time" field.

child_max_connections

If the number of connections from clients exceeds this parameter, the Pgpool-II process exits and restarts. Even if your system is too busy to trigger child_life_time, this parameter should work because it only counts the number of connections from clients.
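The difference between the idle timer and the connection counter can be illustrated with a toy model. The Python sketch below is not Pgpool-II code: the function simulate_child and its arguments are hypothetical names, and the model assumes that the process exits as soon as the counter reaches the limit (the exact boundary in Pgpool-II may differ by one).

```python
def simulate_child(sessions, child_life_time, child_max_connections):
    """Return (exit_reason, connections_served) for one toy child process.

    sessions: list of (connect_time, disconnect_time) in seconds, in order.
    A value of 0 disables the corresponding parameter.
    """
    idle_since = 0.0  # the process starts out idle, waiting for a client
    served = 0
    for connect_time, disconnect_time in sessions:
        # The lifetime timer only runs while no client is connected.
        if child_life_time and connect_time - idle_since >= child_life_time:
            return "child_life_time", served
        served += 1
        # The connection counter ignores idle time entirely.
        # Assumption: exit once the count reaches the limit.
        if child_max_connections and served >= child_max_connections:
            return "child_max_connections", served
        idle_since = disconnect_time
    return None, served


# A busy system: a new client every second, each staying for 0.9 seconds.
busy = [(float(i), i + 0.9) for i in range(1000)]
print(simulate_child(busy, 300, 0))    # (None, 1000): the idle timer never fires
print(simulate_child(busy, 300, 100))  # ('child_max_connections', 100)
```

In this model the idle gaps are only 0.1 seconds long, so a child_life_time of 300 seconds is never reached, while the counter ends the process after 100 clients regardless of how busy it is.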

Unfortunately, there is currently no metric that tells how many client connections have been made to a Pgpool-II process.

client_idle_limit

If a client does not send the next query within the period specified by this parameter, the connection to the client is closed. Note that the timer starts after the result of the previous query has been received from PostgreSQL. Therefore, the query below does not trigger a disconnection while it is running, even if client_idle_limit = 10 (seconds); the timer only starts once the client has received the result.

SELECT pg_sleep(60);
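The timing rule can be made concrete with a small model. The Python sketch below is an illustration only, not Pgpool-II code; the function idle_disconnect_time is a hypothetical name, and the model assumes the idle timer restarts at the moment each result is delivered.

```python
def idle_disconnect_time(queries, client_idle_limit):
    """Return the time at which a toy client would be disconnected, or None.

    queries: list of (sent_at, duration) pairs in seconds, ordered by sent_at.
    The time a query spends running does not count as idle time.
    """
    if client_idle_limit <= 0 or not queries:  # 0 disables the limit
        return None
    idle_since = None
    for sent_at, duration in queries:
        # Too long a pause before this query: the client was already dropped.
        if idle_since is not None and sent_at - idle_since > client_idle_limit:
            return idle_since + client_idle_limit
        idle_since = sent_at + duration  # result delivered; timer restarts
    # No further query arrives after the last result.
    return idle_since + client_idle_limit


# SELECT pg_sleep(60) sent at t=1 with client_idle_limit = 10:
print(idle_disconnect_time([(1, 60)], 10))  # 71
```

In this model the client is dropped at 71 seconds (10 seconds after the result arrives at 61 seconds), not at 11 seconds while the query is still sleeping.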

connection_life_time

Unlike the parameters above, this one controls the lifetime of cached connections to the PostgreSQL backend. Note that Pgpool-II keeps up to max_pool connections to the PostgreSQL backend. The timer is checked against those connections when the Pgpool-II process returns to the idle state (waiting for connections from clients). So if a client occupies the connection for a long time, the cached connections remain longer than the period specified by the parameter.
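Because the check only happens when the process becomes idle, a cached connection can outlive the configured period. The Python sketch below models just that point; it is not Pgpool-II code, and the function cache_close_time and the notion of a single "cached_at" timestamp are simplifying assumptions.

```python
def cache_close_time(cached_at, idle_checks, connection_life_time):
    """Return the time at which a cached backend connection would be closed.

    cached_at: time (seconds) at which the connection entered the cache.
    idle_checks: times at which the process returns to the idle state.
    Returns None if the connection is never closed in this model.
    """
    if connection_life_time <= 0:  # 0 disables the timer
        return None
    for check_time in sorted(idle_checks):
        # Expiry is only noticed at the moments the process is idle.
        if check_time - cached_at >= connection_life_time:
            return check_time
    return None


# connection_life_time = 60; a client occupies the process from t=5 to t=500.
print(cache_close_time(0, [5, 500], 60))  # 500
```

Here the connection becomes older than 60 seconds while the client is still connected, but it is only closed at 500 seconds, the next time the process is idle.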

If you want to know how old the cached connections are, you can use the pcp_proc_info command or the show pool_processes SQL command and examine the "Creation time" field.

Conclusion

You can tune the connection lifetime parameters to save system resources. For systems that are not so busy, you can leave those parameters at their defaults. However, if your system is getting busy, you might want to tweak them. I hope this blog entry is useful for busy systems.

Wednesday, December 2, 2020

Using IVM in a docker container

Real time, fast update of materialized views

In an earlier blog entry I introduced the ongoing project Incremental View Maintenance (IVM), which allows real-time, fast updates of materialized views. Since that blog entry, IVM has gained support for outer joins and self joins. The outcome of the project is publicly available as a set of patches against the git master branch of PostgreSQL. A GitHub repository for the project is public as well.

Docker image for IVM

So skilled PostgreSQL developers can test IVM today. However, it is not that easy for people who do not regularly compile and build PostgreSQL and IVM from source code. Fortunately, one of the IVM developers, Yugo Nagata, has started to provide a complete docker image of IVM. I am going to demonstrate how to use it step by step.

# Pull docker image

docker pull yugonagata/postgresql-ivm

# Run docker container using port mapping 15432:5432

docker run -d --name pgsql-ivm -e POSTGRES_PASSWORD=postgres -p 15432:5432 yugonagata/postgresql-ivm

# Run psql to connect to PostgreSQL and create a database named "test"

docker run -it --rm --link pgsql-ivm yugonagata/postgresql-ivm psql -h pgsql-ivm -U postgres

postgres=# create database test;

# Run pgbench to initialize the test data
docker run -it --rm --link pgsql-ivm yugonagata/postgresql-ivm pgbench -i -h pgsql-ivm -U postgres test

# Run psql to try IVM
docker run -it --rm --link pgsql-ivm yugonagata/postgresql-ivm psql -h pgsql-ivm -U postgres test

# Create a materialized view "mv1" for the query "select count(*) from pgbench_accounts".
# Note that "pgbench_accounts" was created by pgbench
test=# create incremental materialized view mv1 as select count(*) from pgbench_accounts;
SELECT 1

# select the result from the view
test=# select * from mv1;
count
--------
100000
(1 row)

# Start a transaction

test=# begin;
BEGIN

# delete 1 row from pgbench_accounts table
test=*# delete from pgbench_accounts where aid = 1;
DELETE 1

# Check to see if mv1 reflects the deletion
test=*# select * from mv1;
count
-------
99999
(1 row)

# Abort the transaction
test=*# abort;
ROLLBACK

# The deletion was cancelled in mv1 as well
test=# select * from mv1;
count
--------
100000
(1 row)
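The behaviour seen in the session above, a count that follows a delete immediately and returns to its old value after an abort, can be pictured with a toy model. The Python sketch below is not how IVM is implemented (IVM works inside PostgreSQL and relies on its transaction machinery); the class CountView and its methods are hypothetical names used only to mirror the session.

```python
class CountView:
    """Toy model of an incrementally maintained count(*) view."""

    def __init__(self, initial_rows):
        self.committed = initial_rows  # the table is counted once, at creation
        self.pending = 0               # changes made inside the open transaction

    def insert(self, n=1):
        self.pending += n  # apply only the delta, no rescan of the table

    def delete(self, n=1):
        self.pending -= n

    def select(self):
        # The open transaction sees its own uncommitted changes.
        return self.committed + self.pending

    def commit(self):
        self.committed += self.pending
        self.pending = 0

    def abort(self):
        self.pending = 0  # discard the uncommitted delta


mv1 = CountView(100000)
mv1.delete()         # delete from pgbench_accounts where aid = 1
print(mv1.select())  # 99999
mv1.abort()          # abort the transaction
print(mv1.select())  # 100000
```

The point of the sketch is that each change adjusts the stored result by a small delta instead of recomputing the whole query, which is what makes the view update fast.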

Summary

IVM is under continuous development, and it can now be easily tested in a docker container. Please try it out!
