[pgpool-hackers: 4355] Re: Proposal: mitigating session disconnection issue while failover

Tue Jul 18 10:39:00 JST 2023

> Background:
> 
> Currently Pgpool-II disconnects client sessions on failover or backend
> error. This is fine because the client needs to access the PostgreSQL
> backend anyway. But even when the client does not use a particular
> backend, the client session is disconnected on failover. This is not
> good. Suppose we have a streaming replication cluster of 3 PostgreSQL
> nodes and the client uses the primary (node 0) and standby 1 (node 1),
> but does not use standby 2 (node 2). In this case, ideally, shutting
> down node 2 should not disconnect the session. However, the session is
> disconnected if it sends a query to Pgpool-II during failover. This is
> necessary because there are a bunch of places in the source code that
> look something like this:
> 
> for (i = 0; i < NUM_BACKENDS; i++)
> {
> 	if (!VALID_BACKEND(i))
> 	   continue;
> 	   :
> 	   :
> 
> Here, NUM_BACKENDS represents the number of PostgreSQL backends (in
> the case above it's 3). VALID_BACKEND returns true if the backend is
> not in down status. If this code is executed during failover, it may
> access a backend socket which is no longer available, causing trouble
> up to and including a segfault. So inside VALID_BACKEND we check
> whether failover is being performed, and if so, the pgpool child
> process exits and the session disconnects.
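
For readers without the source at hand, the loop pattern quoted above can
be sketched as a small self-contained fragment. Everything prefixed
sketch_ or SKETCH_ is hypothetical and only stands in for the real macros.
In particular, the real VALID_BACKEND makes the child process exit during
failover, whereas this sketch merely reports that condition to the caller.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NUM_BACKENDS 3   /* hypothetical stand-in for NUM_BACKENDS */

enum sketch_result { SKETCH_USE, SKETCH_SKIP, SKETCH_END_SESSION };

/*
 * Hypothetical stand-in for VALID_BACKEND. While failover is running it
 * reports that the session would have to end; otherwise it tells the
 * caller whether backend i may be used or must be skipped.
 */
enum sketch_result sketch_check_backend(const bool *is_down,
                                        bool failover_running, int i)
{
    assert(i >= 0 && i < SKETCH_NUM_BACKENDS);
    if (failover_running)
        return SKETCH_END_SESSION;
    return is_down[i] ? SKETCH_SKIP : SKETCH_USE;
}

/*
 * Visits every backend that is not down. Returns the number of backends
 * visited, or -1 if the session would have been disconnected.
 */
int sketch_visit_backends(const bool *is_down, bool failover_running)
{
    int visited = 0;

    for (int i = 0; i < SKETCH_NUM_BACKENDS; i++)
    {
        enum sketch_result r = sketch_check_backend(is_down, failover_running, i);

        if (r == SKETCH_END_SESSION)
            return -1;
        if (r == SKETCH_SKIP)
            continue;
        visited++;              /* real code would use the backend socket here */
    }
    return visited;
}
```

With node 2 marked down and no failover running, the loop visits nodes 0
and 1 only. With failover running, it gives up at the first check, which
corresponds to the disconnection described above.
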
> 
> Proposal:
> 
> In this proposal I would like to mitigate the issue above in certain
> cases.  This proposal does not resolve all cases. Still some session
> disconnection cases will remain. In all cases the precondition is that
> the client does not use the backend which is the target of failover. To
> make the proposal easier to understand, I suppose the session uses
> only node 0 and/or node 1, and the failover target is node 2. Usually
> to make this possible, we need to set backend_weight2 = 0 or
> load_balance_mode = off.
> 
> case 1:
> 
> Failover on node 2 occurs while the session keeps on sending queries
> to node 0 and/or 1. Change VALID_BACKEND so that it waits for
> completion of failover. For this purpose a new function,
> wait_for_failover_to_finish(), is added. It waits for the completion of
> failover for up to MAX_FAILOVER_WAIT seconds (for now it's 30). It may
> be better to make the wait time configurable. Thoughts?
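
To illustrate the idea of case 1, here is a minimal sketch of a bounded
wait. It is not the code in the patch. The function name, its parameters
and the two helper functions are hypothetical. The sketch takes the
"is failover still running?" check and the one-second sleep as function
pointers so that it can be exercised without a real failover.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_FAILOVER_WAIT 30   /* seconds, the value quoted above */

/*
 * Polls in_progress() until it reports false or max_wait polls have been
 * made, calling nap() between polls. A real caller would pass a nap()
 * that sleeps for one second. Returns true if failover finished in time,
 * false on timeout.
 */
bool sketch_wait_for_failover(int max_wait,
                              bool (*in_progress)(void),
                              void (*nap)(void))
{
    assert(max_wait >= 0 && in_progress != NULL && nap != NULL);

    for (int waited = 0; waited < max_wait; waited++)
    {
        if (!in_progress())
            return true;
        nap();
    }
    /* One last check after the final nap. */
    return !in_progress();
}

/* Hypothetical test doubles: failover "runs" for sketch_polls_left polls. */
int sketch_polls_left = 0;

bool sketch_fake_in_progress(void)
{
    if (sketch_polls_left > 0)
    {
        sketch_polls_left--;
        return true;
    }
    return false;
}

void sketch_no_sleep(void)
{
    /* Intentionally empty so the sketch runs instantly. */
}
```

If the wait time became configurable as suggested above, the max_wait
argument is where such a setting would plug in.
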
> 
> case 2:
> 
> Failover on node 2 occurs while the session keeps on sending queries
> to node 0 and/or 1 and, in addition, a new session is created. This is
> much harder than case 1. There are multiple places where session
> disconnection could occur.
> 
> - accepting a new connection from a client. In wait_for_new_connections,
>   call wait_for_failover_to_finish to wait for completion of
>   failover.
> 
> - creating a new connection to a backend. After accepting a connection
>   request from a client and before creating a connection to a backend,
>   call wait_for_failover_to_finish to wait for completion of failover.
> 
> - fixing a broken socket. pool_get_cp checks whether an existing backend
>   connection is broken. If it is, it sleeps 1 second in the expectation
>   that failover will start, then calls wait_for_failover_to_finish().
> 
> - processing an application name. If an application name is included
>   in a startup message, pgpool sends a query like "SET application_name
>   TO foo" to all backend nodes including node 2, which could cause a
>   write error. To prevent the error, I modified
>   connect_using_existing_connection, which sends the SET command
>   using do_command, to wrap the call in a TRY/CATCH block so that
>   do_command does not raise an ERROR.
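
The last item relies on catching an error instead of letting it propagate.
As a rough illustration of that general technique only, the following
sketch uses plain setjmp/longjmp. It does not use the actual TRY/CATCH
macros in the source, and every name in it is hypothetical. A command that
fails on a down node raises an error, and the guarded wrapper turns that
error into a return value.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>

/* Innermost active error handler, or NULL if none is installed. */
static jmp_buf *sketch_handler = NULL;

/* Hypothetical error raiser: jumps back to the active handler. */
void sketch_raise_error(void)
{
    assert(sketch_handler != NULL);
    longjmp(*sketch_handler, 1);
}

/* Hypothetical stand-in for sending a SET command to one backend node. */
void sketch_send_set(bool node_is_up)
{
    if (!node_is_up)
        sketch_raise_error();   /* models a write error on a dead socket */
}

/*
 * Runs sketch_send_set() under a local handler. Returns true if the
 * command went through, false if an error was raised and caught.
 */
bool sketch_send_set_guarded(bool node_is_up)
{
    jmp_buf local;
    jmp_buf *saved = sketch_handler;   /* restored on both paths */
    bool ok;

    if (setjmp(local) == 0)
    {
        sketch_handler = &local;       /* "TRY" part */
        sketch_send_set(node_is_up);
        ok = true;
    }
    else
    {
        ok = false;                    /* "CATCH" part: swallow the error */
    }
    sketch_handler = saved;
    return ok;
}
```

Calling sketch_send_set_guarded(false) returns false rather than
unwinding further, which is the behaviour the wrapped do_command call is
meant to have for a node that is going away.
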
> 
> Note that even with all the fixes above, I was not able to fix some
> cases where pool_write raises an error. pool_write is used to write to
> the backend socket, and there are too many call sites to fix all of
> them. For now we need to run "pcp_detach_node 2" before shutting node 2
> down. pcp_detach_node will tell all pgpool child processes that node 2
> is going down. Even if a child process does not notice it and writes
> to the socket of backend 2, there will be no error because node 2 is
> still alive.
> 
> Tests: I have created a new test, 079.failover_session, for this
> feature. The test uses pgbench, which can generate load using either a
> continuous session (without the -C option) or repeated
> connection/disconnection (with the -C option). There are 4 cases in the test:
> 
> "=== test1: backend_weight2 = 0 and pgbench without -C option"
> "=== test2: backend_weight2 = 0 and pgbench with -C option"
> "=== test3: load_balance_mode = off and pgbench without -C option"
> "=== test4: load_balance_mode = off and pgbench with -C option"
> 
> test2 and test4 require pcp_detach_node before shutting down node 2.
> 
> Patch:
> 
> Attached is the v1 patch for this feature. Comments and suggestions
> are welcome.

Patch pushed to master branch.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp