Wednesday, April 18, 2018

Detecting "false" primary server of PostgreSQL

One of the interesting features of the upcoming Pgpool-II 4.0 will be detection of a "false" primary server in a streaming replication environment.

Suppose we have one primary server and two standby servers connected to the primary server.

test=# show pool_nodes;
 node_id | hostname | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay
---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------
 0       | /tmp     | 11002 | up     | 0.000000  | primary | 1          | false             | 0
 1       | /tmp     | 11003 | up     | 0.000000  | standby | 0          | false             | 0
 2       | /tmp     | 11004 | up     | 1.000000  | standby | 0          | true              | 0
(3 rows)


What will happen if the node 2 standby server is promoted to primary?

test=# show pool_nodes;
 node_id | hostname | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay
---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------
 0       | /tmp     | 11002 | up     | 0.000000  | primary | 1          | false             | 0
 1       | /tmp     | 11003 | up     | 0.000000  | standby | 0          | false             | 0
 2       | /tmp     | 11004 | up     | 1.000000  | standby | 0          | true              | 0
(3 rows)


As you can see, nothing has changed as far as the show pool_nodes command goes.
In fact, however, node 2 is no longer a standby connected to the primary. So if large updates are sent to the primary, node 2 falls far behind the primary server, because data is no longer replicated to that node. For example, after running pgbench -i, the replication_delay column for node 2 becomes large:

t-ishii@localhost: pgbench -i -p 11000 test
NOTICE:  table "pgbench_history" does not exist, skipping
NOTICE:  table "pgbench_tellers" does not exist, skipping
NOTICE:  table "pgbench_accounts" does not exist, skipping
NOTICE:  table "pgbench_branches" does not exist, skipping
creating tables...
100000 of 100000 tuples (100%) done (elapsed 0.19 s, remaining 0.00 s)
vacuum...
set primary keys...
done.


test=# show pool_nodes;
 node_id | hostname | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay
---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------
 0       | /tmp     | 11002 | up     | 0.000000  | primary | 1          | false             | 0
 1       | /tmp     | 11003 | up     | 0.000000  | standby | 0          | false             | 0
 2       | /tmp     | 11004 | up     | 1.000000  | standby | 0          | true              | 13100296
(3 rows)


How can we detect the situation and fix it?

Pgpool-II 4.0 will help you. It will have a new parameter called "detach_false_primary". If the parameter is enabled, Pgpool-II will automatically detect the situation and detach node 2, because it is a "false" primary.
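
As a minimal sketch, enabling the parameter in pgpool.conf might look like the fragment below. The value "on" assumes the usual boolean syntax of pgpool.conf; please check the manual for the default value and for any prerequisites of the feature.

```
# pgpool.conf (fragment)
# Detach a node that claims to be a primary but is not the "true" primary.
detach_false_primary = on
```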

test=# show pool_nodes;
 node_id | hostname | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay
---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------
 0       | /tmp     | 11002 | up     | 0.000000  | primary | 1          | false             | 0
 1       | /tmp     | 11003 | up     | 0.000000  | standby | 0          | false             | 0
 2       | /tmp     | 11004 | down   | 1.000000  | standby | 0          | true              | 13100296
(3 rows)


Note that the feature has a limitation. For example, suppose there are four nodes: primary 1 is connected to standby 1, while primary 2 is connected to standby 2. In this case there is no reliable way to decide which node is the "true" primary, so false primary detection does not work.
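
To make the rule and its limitation more concrete, here is a small Python sketch of the decision. It is only a simplified model written for this explanation, not Pgpool-II's actual implementation. The function names and the two input dictionaries are hypothetical: "roles" is assumed to hold what each node itself reports (for example via pg_is_in_recovery()), and "upstream" is assumed to hold, for each standby, the node it is streaming from.

```python
from typing import Dict, List, Optional


def find_true_primary(roles: Dict[int, str],
                      upstream: Dict[int, int]) -> Optional[int]:
    """Return the node id of the "true" primary, or None if undecidable.

    roles:    node id -> "primary" or "standby", as reported by the node itself
              (hypothetical input for this sketch).
    upstream: standby node id -> node id of the primary it streams from
              (hypothetical input for this sketch).
    """
    primaries = [n for n, r in roles.items() if r == "primary"]
    standbys = [n for n, r in roles.items() if r == "standby"]

    # Without at least one primary and one standby there is nothing to compare.
    if not primaries or not standbys:
        return None

    # A primary is regarded as "true" only if every standby streams from it.
    for p in primaries:
        if all(upstream.get(s) == p for s in standbys):
            return p
    return None


def false_primaries(roles: Dict[int, str],
                    upstream: Dict[int, int]) -> List[int]:
    """Return the primaries that should be detached as "false" primaries."""
    true_primary = find_true_primary(roles, upstream)
    if true_primary is None:
        # No reliable decision is possible, so nothing is detached.
        return []
    return sorted(n for n, r in roles.items()
                  if r == "primary" and n != true_primary)


if __name__ == "__main__":
    # Three nodes: node 2 was promoted, node 1 still streams from node 0.
    print(false_primaries({0: "primary", 1: "standby", 2: "primary"}, {1: 0}))
    # prints [2]

    # Four nodes: each primary has its own standby, so no "true" primary.
    print(false_primaries({0: "primary", 1: "standby",
                           2: "primary", 3: "standby"}, {1: 0, 3: 2}))
    # prints []
```

In the three-node case the only standby streams from node 0, so node 2 is reported as a false primary, which matches the show pool_nodes result above. In the four-node case neither primary owns all the standbys, so the sketch gives up and detaches nothing.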

The figure below explains what Pgpool-II 4.0 can and cannot do.
You can read more about the feature in the manual.

