Tuesday, June 15, 2010

pgpool-II and Hot Standby

PostgreSQL 9.0 will have a great feature called "Hot Standby". Unlike warm standby, which is just waiting for in case of master server goes down, it gives an access to the standby server for lots of useful purposes such as complex reporting query. However only read queries and other several commands are allowed to be sent to the standby server. According to the PostgreSQL 9.0 beta2 document, this is the list of allowed queries:
  • Query access - SELECT, COPY TO including views and SELECT rules
  • Cursor commands - DECLARE, FETCH, CLOSE
  • Parameters - SHOW, SET, RESET
  • Transaction management commands
  • BEGIN, END, ABORT, START TRANSACTION
  • SAVEPOINT, RELEASE, ROLLBACK TO SAVEPOINT
  • EXCEPTION blocks and other internal subtransactions
  • LOCK TABLE, though only when explicitly in one of these modes: ACCESS SHARE, ROW SHARE or ROW EXCLUSIVE.
  • Plans and resources - PREPARE, EXECUTE, DEALLOCATE, DISCARD
  • Plugins and extensions - LOAD
No problem for PostgreSQL experts of course. But I doubt it is super easy for ordinary PostgreSQL users to know which query is allowed and which is not just looking at the list.

pgpool-II 3.0 (yes, that will be the new version number) will give you small help. Pgpool parses the query and automatically redirect it to the standby server if needed. If you have more than one standby servers, pgpool will choose one of them and send the query to it. The queries not allowed for the standby server, will be sent to the primary server. In short, you do not need to worry about the complex rule above if you use pgpool-II 3.0 + Hot Standby.

Please note that there is a limitation here. Pgpool parses the query by using "raw parser" which understands the syntax of the query but does not understand the "depth psychology" aspect of the query. For example, if a SELECT calls a custom C function which modifies the database, pgpool will send it to the standby server because it does not know that the SELECT actually does write.

It will be great that PostgeSQL's function does have a signature that confess it is going to write...

6 comments:

  1. Couldn't it use the optimization flags IMMUTABLE/VOLATILE ? This, of course, would assume the developer uses this flag, and as the default is VOLATILE I think it would be pretty safe.

    ReplyDelete
  2. Tom Lane said no. See:
    http://archives.postgresql.org/pgsql-hackers/2010-02/msg01872.php

    ReplyDelete
  3. In our use-case, we are practically never using UPDATE statements directly, all data changes are wrapped into the stored procedures, and in this case this would make pgpool unusable with hot-standby if there is no way to mark the statement as a 'read/write'.

    Maybe you could add some kind of additional comment PRAGMA (/* READ-ONLY */ and /* READ-WRITE */ ) like you have for "NO INSERT LOCK"?

    ReplyDelete
  4. We have some write queries in a function/stored procedure and facing the same issue. pgpool send them to Slave because we call the function using "Select". Here, we have one Master and one Slave. In pgpool.conf, backend_weight0 = 0 and backend_weight1 = 1. Is this fixed in pgpool now? Or is there any solution to this? When I set backend_weight0 = 1, pgpool sometimes sends it to Master and sometimes to Slave.

    ReplyDelete
  5. Thanks Tatsuo to point me to black_function_list directive in pgpool.conf through the forums.It works like a charm.

    ReplyDelete

Row pattern recognition feature for PostgreSQL

What is row pattern recognition feature? Row pattern recognition (RPR) is a feature defined in the SQL standard. It allows to search for a s...