- Pgpool-II operates in streaming replication mode
- The failed DB node is not the primary (master) node
- Your "load balance node" is not the failed node
The load balance node is chosen when you connect to Pgpool-II. Pgpool-II assigns one of the DB nodes to send read only queries to. It is decided according some of pgpool.conf settings:
- load_balance_mode (of course this should be "on")
- "weight" parameter
- database_redirect_preference_list
- app_name_redirect_preference_list
The decision which DB node to choose is done at the early stage of session connection and the assignment will not be changed until you exit the session. From Pgpool-II 3.6, you can check your load balance node by using "show pool_nodes" command.
$ psql -p 11000 test
test=# show pool_nodes;
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node
---------+----------+-------+--------+-----------+---------+------------+-------------------
0 | /tmp | 11002 | 2 | 0.333333 | primary | 0 | false
1 | /tmp | 11003 | 2 | 0.333333 | standby | 0 | false
2 | /tmp | 11004 | 2 | 0.333333 | standby | 0 | true
(3 rows)
Here "load_balance_node" is the DB node chosen for the "load balance node".
If other than node 2 is going down and the node is not primary, this session will not be disconnected. In this case the session will not be disconnected if node 1 goes down. Let's try that using another terminal:
$ pg_ctl -D data1 -m f stop
waiting for server to shut down.... done
server stopped
Ok, let's input something in the previous psql session:
test=# show pool_nodes;
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node
---------+----------+-------+--------+-----------+---------+------------+-------------------
0 | /tmp | 11002 | 2 | 0.333333 | primary | 0 | false
1 | /tmp | 11003 | 3 | 0.333333 | standby | 0 | false
2 | /tmp | 11004 | 2 | 0.333333 | standby | 0 | true
(3 rows)
As you can see, the session was not disconnected and you see the "status" column of node 1 is now changed to "3", which means the node is in down status.
Now let's suppose you want to maintain one of DB nodes. In this case, you could apply following procedure:
- Edit pgpool.conf to change "backend_weight1" parameter to 0. This will prevent new sessions to choose the node 1 as the load balance node.
- Wait until all users who are using node 1 as the load balance node exit session
This is why I always suggest putting PgBouncer in front of pgpool2. If pgpool acted as an actual pooling service, it would proxy and hide the back-end connections. Connections to pgpool are separate from connections to Postgres, so if a failover occurs, it should simply re-bind all connections to the new master, or make new connections as necessary without breaking connections to itself. The fact it doesn't do this pretty much requires the use of a secondary proxy like PgBouncer.
ReplyDeleteWith PgBouncer in place, it acts as a connection aggregator that hides the fact pgpool is throwing away connections during a failover. This makes the process much more transparent. There may be a query delay during a failover, but no disconnection.
Of course, you can get the same effect with HAProxy, etcd, and Governor or Patroni. Pgpool is perfectly positioned to handle all of these roles, so the fact that it doesn't is somewhat frustrating. :(
Hey Shaun,
DeleteThanks for your insight into this, I'm working on a project right now using pgpool2 for load balancing, failover with watchdog, and connection pooling. And i'm running into this particular issue when testing failover.
Do you have pgbouncer just pointing to the pgpool2 service/port and then having pgpool connect into postgresql?
Thanks!
I wonder how PgBouncer re-binds connections without breaking some session local properties. I mean, for example, if a user issued "SET seq_page_cost = 2.0" in a session, and PgBouncer re-binds to another session, then seq_page_cost would be changed to the default value, probably 1.0?
DeleteI have tried PgBouncer on top of PgPool-II but still connections are getting diconnected when the failover occurred on master node. So i don't think PgBouncer will help here or i am missing any configuration which can help.
DeleteThis comment has been removed by a blog administrator.
ReplyDeleteHi All,
ReplyDeleteI have tried PgBouncer on top of PgPool-II but still connections are getting diconnected when the failover occurred on master node. So i don't think PgBouncer will help here or i am missing any configuration which can help.
Any other ideas on it?
regards,
Nabeel
This comment has been removed by the author.
DeleteI think you need to put PgBouncer behind PgPool-II, not in front of it, i.e.:
ReplyDeleteclient --> pgpool-ii --> pgbouncer --> cluster