Wednesday, August 7, 2019

Automatically updating materialized views

Materialized views are convenient and efficient way to retrieve information from database. Unlike ordinary views, materialized views save the query result and provide faster access to the data. For example, user can create a simple materialized view containing the number of rows in a table:

 CREATE MATERIALIZED VIEW mv1 AS SELECT count(*) FROM pgbench_accounts;

Obtaining the number of rows using the materialized view mv1 is much faster than  directly accessing number of rows in pgbench_accounts.

test=# SELECT * from v1;
 count 
--------
 100000
(1 row)

Time: 0.810 ms
test=# SELECT count(*) FROM pgbench_accounts;
 count 
--------
 100000
(1 row)

Time: 13.945 ms


However if you delete a row from pgbench_accounts, the number of rows in mv1 is still unchanged. To reflect the change of the base table (in this case pgbench_accounts) , you need to recreate or refresh (this actually recreate the contents of materialize views from scratch), which may take long time.

To overcome the problem, SRA OSS is proposing to add a new feature to existing materialized view  "incremental materialized view maintenance". This works like this.

  • add new syntax to allow incrementally update the materialized view when it is created.
  • install triggers automatically to be attached to the base table(s).
  • when UPDATE/DELETE/INSERT applies on the base tables, the trigger calculate the difference to the rows in the materialized view and update the rows.
These should update the materialized view in a real time manner. The price is the over head of trigger invocation. So frequently updating tables are not best suite for the incremental materialized view maintenance. Here are demonstrations how to use the incrementally updating materialized view.

First create incrementally updating materialized view.

CREATE INCREMENTAL MATERIALIZED view mv2 AS SELECT count(*) FROM pgbench_accounts;

The view initially says there are 100,000 rows.

SELECT * FROM mv2;
 count 
--------
 100000
(1 row)


Delete a row from pgbench_accounts. Now the number of rows is 99999.

DELETE FROM pgbench_accounts WHERE aid = 10;
DELETE 1
SELECT count(*) FROM pgbench_accounts;
 count
-------
 99999
(1 row)


Make sure that mv2 reports the same number of rows.

SELECT * FROM mv2;
 count
-------
 99999
(1 row)


It works!

This is a very simple materialized view. More complex queries, for example inner joins also work.

CREATE INCREMENTAL MATERIALIZED VIEW mv3 AS SELECT a.aid, b.bid, t.tid FROM pgbench_accounts a INNER JOIN pgbench_branches b ON (a.bid = b.bid ) INNER JOIN pgbench_tellers t ON (b.bid = t.bid) WHERE a.aid BETWEEN 1 AND 5;
SELECT 50


SELECT * FROM mv3 LIMIT 5;
 aid | bid | tid
-----+-----+-----
   3 |   1 |  10
   2 |   1 |   1
   2 |   1 |   5
   2 |   1 |   4
   5 |   1 |   3

 (5 rows)

DELETE FROM pgbench_accounts WHERE aid = 2;
DELETE 1


SELECT * FROM mv3 LIMIT 5;
 aid | bid | tid
-----+-----+-----
   3 |   1 |  10
   5 |   1 |   3
   1 |   1 |   1
   5 |   1 |   8
   4 |   1 |   3
(5 rows)


Implementing other types of queries such as outer joins, self joins are in our plan, possibly toward PostgreSQL 13.

Here are some pointers to information regarding our incremental materialized view maintenance project:

Dynamic spare process management in Pgpool-II

Pre-fork architecture in Pgpool-II Pgpool-II uses fixed number of pre-forked child process which is responsible for accepting and handling e...