[pgpool-hackers: 4355] Re: Proposal: mitigating session disconnection issue while failover
Tatsuo Ishii
ishii at sraoss.co.jp
Tue Jul 18 10:39:00 JST 2023
> Background:
>
> Currently Pgpool-II disconnects client sessions on failover or backend
> error. This is fine when the client needs to access that backend
> anyway. But even when the client does not use a particular backend,
> the client session is disconnected in failover. This is not good.
> Suppose we have a streaming replication cluster of 3 PostgreSQL nodes
> and the client uses the primary (node 0) and standby 1 (node 1), but
> does not use standby 2 (node 2). In this case, ideally shutting down
> node 2 should not disconnect the session. However, the session is
> disconnected if it sends a query to Pgpool-II during failover. This is
> necessary because there are a bunch of places in the source code like
> this:
>
> for (i = 0; i < NUM_BACKENDS; i++)
> {
>     if (!VALID_BACKEND(i))
>         continue;
>     :
>     :
>
> Here, NUM_BACKENDS represents the number of PostgreSQL backends (in
> the case above it's 3). VALID_BACKEND returns true if the backend is
> not in down status. If this code is executed during failover, it
> may access a backend socket which is no longer available, causing
> trouble including a segfault. So inside VALID_BACKEND, we check
> whether failover is in progress, and if so, the pgpool child process
> exits and the session disconnects.
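
To illustrate the loop pattern above, here is a minimal self-contained
sketch. The backend_status array, the node_status type and the simplified
VALID_BACKEND macro are assumptions made for illustration only; the real
macro in Pgpool-II does more, including the failover check described above.

```c
#include <stdbool.h>

#define NUM_BACKENDS 3

/* Hypothetical, simplified status values; not pgpool's real definitions. */
typedef enum { NODE_UP, NODE_DOWN } node_status;

/* Node 2 is marked down, as in the example scenario. */
static node_status backend_status[NUM_BACKENDS] = { NODE_UP, NODE_UP, NODE_DOWN };

/* Simplified stand-in: true if the backend is not in down status. */
#define VALID_BACKEND(i) (backend_status[(i)] != NODE_DOWN)

/* Record the ids of the backends the loop would touch; return their count. */
static int visit_valid_backends(int visited[NUM_BACKENDS])
{
    int i;
    int n = 0;

    for (i = 0; i < NUM_BACKENDS; i++)
    {
        if (!VALID_BACKEND(i))
            continue;           /* skip nodes that are down */
        visited[n++] = i;       /* real code would use the backend socket here */
    }
    return n;
}
```

With node 2 down, only nodes 0 and 1 are visited, so the socket of node 2
is never touched as long as its status is already known to be down.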
>
> Proposal:
>
> In this proposal I would like to mitigate the issue above in certain
> cases. This proposal does not resolve all cases; some session
> disconnection cases will remain. In all cases the precondition is that
> the client does not use the backend which is the target of failover. To
> make the proposal easier to understand, I suppose the session uses
> only node 0 and/or node 1, and the failover target is node 2. Usually
> to make this possible, we need to set backend_weight2 = 0 or
> load_balance_mode = off.
>
> case 1:
>
> Failover on node 2 occurs while the session keeps on sending queries
> to node 0 and/or 1. Change VALID_BACKEND so that it waits for
> completion of failover. For this purpose a new function
> wait_for_failover_to_finish() is added. It waits for the completion of
> failover up to MAX_FAILOVER_WAIT seconds (for now it's 30). It may be
> better to make the wait time configurable. Thoughts?
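
A rough sketch of the waiting logic in case 1, for discussion only. Only
the function name and MAX_FAILOVER_WAIT come from the proposal; the
failover_in_progress() and sleep_one_second() helpers, the
failover_seconds_left variable and the return values are hypothetical
stand-ins, not the code in the patch.

```c
#include <stdbool.h>

#define MAX_FAILOVER_WAIT 30    /* seconds, the value given in the proposal */

/* Hypothetical stand-in for the shared "failover in progress" state:
 * the number of seconds the simulated failover still needs. */
static int failover_seconds_left = 0;

static bool failover_in_progress(void)
{
    return failover_seconds_left > 0;
}

/* Hypothetical sleep hook. It only advances the simulated clock so the
 * sketch runs instantly; real code would actually sleep for one second. */
static void sleep_one_second(void)
{
    if (failover_seconds_left > 0)
        failover_seconds_left--;
}

/* Poll once per second until failover finishes or the limit is reached.
 * Returns 0 if failover finished in time, -1 on timeout (assumed values). */
static int wait_for_failover_to_finish(void)
{
    int i;

    for (i = 0; i < MAX_FAILOVER_WAIT; i++)
    {
        if (!failover_in_progress())
            return 0;
        sleep_one_second();
    }
    return failover_in_progress() ? -1 : 0;
}
```

If the wait time became configurable, MAX_FAILOVER_WAIT would simply be
replaced by the configured value in the loop bound.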
>
> case 2:
>
> Failover on node 2 occurs while not only does the session keep on
> sending queries to node 0 and/or 1, but new sessions are also created.
> This is much harder than case 1. There are multiple places where
> session disconnection could occur.
>
> - accepting new connection from client. In wait_for_new_connections,
> call wait_for_failover_to_finish to wait for completion of
> failover.
>
> - creating new connection to backend. After accepting connection
> request from client and before creating connection to backend, call
> wait_for_failover_to_finish to wait for completion of failover.
>
> - fixing broken socket. pool_get_cp checks whether an existing backend
> connection is broken. If it's broken, it sleeps 1 second in the
> expectation that failover will start, then calls
> wait_for_failover_to_finish().
>
> - processing an application name. If an application name is included
> in a startup message, pgpool sends a query like "SET application_name
> TO foo" to all backend nodes including node 2, which could cause a
> write error. To prevent the error, I modified
> connect_using_existing_connection, which is sending the SET command
> using do_command, so that do_command does not raise an ERROR by
> wrapping it in TRY/CATCH block.
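
The idea of the last item can be sketched as below. This is not the patch
itself: it emulates a TRY/CATCH block with plain setjmp/longjmp, and
error_handler, raise_error(), send_set_command(), try_send_set_command(),
set_application_name_on_all() and backend_alive are hypothetical names
introduced only for this illustration. send_set_command() stands in for
sending the SET command via do_command.

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stdlib.h>

#define NUM_BACKENDS 3

/* Innermost active handler, or NULL if none (hypothetical). */
static jmp_buf *error_handler = NULL;

/* Node 2 is gone, so writing to it fails. */
static bool backend_alive[NUM_BACKENDS] = { true, true, false };

/* Raise an ERROR: jump to the handler, or end the process if there is none. */
static void raise_error(void)
{
    if (error_handler != NULL)
        longjmp(*error_handler, 1);
    exit(1);                    /* uncaught error: the session would end */
}

/* Stand-in for sending the SET command to one backend node. */
static void send_set_command(int node)
{
    if (!backend_alive[node])
        raise_error();          /* simulated write error */
}

/* Wrap the send in a TRY/CATCH-like block; return false instead of failing. */
static bool try_send_set_command(int node)
{
    jmp_buf local;
    jmp_buf *saved = error_handler;
    bool ok;

    error_handler = &local;
    if (setjmp(local) == 0)
    {
        send_set_command(node); /* TRY part */
        ok = true;
    }
    else
        ok = false;             /* CATCH part: swallow the error */
    error_handler = saved;      /* restore the outer handler */
    return ok;
}

/* Send to every node; return how many sends succeeded. */
static int set_application_name_on_all(void)
{
    int i;
    int succeeded = 0;

    for (i = 0; i < NUM_BACKENDS; i++)
        if (try_send_set_command(i))
            succeeded++;
    return succeeded;
}
```

The point is only that a failed write to node 2 no longer propagates as
an error that ends the session; nodes 0 and 1 still receive the command.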
>
> Note that even with all the fixes above, I was not able to fix some
> cases where pool_write raises an error. pool_write is used to write to
> the backend socket and there are too many places to fix all of them.
> For now we need to run "pcp_detach_node 2" before shutting node 2
> down. pcp_detach_node will tell all pgpool child processes that node 2
> is going down. Even if a child process does not notice it and writes
> to the backend 2 socket, there will be no error because node 2 is
> still alive.
>
> Tests: I have created a new test, 079.failover_session, for this
> feature. The test uses pgbench. I can generate load using a
> continuous session (without the -C option) or repeated
> connection/disconnection (with the -C option). There are 4 cases in
> the test:
>
> "=== test1: backend_weight2 = 0 and pgbench without -C option"
> "=== test2: backend_weight2 = 0 and pgbench with -C option"
> "=== test3: load_balance_mode = off and pgbench without -C option"
> "=== test4: load_balance_mode = off and pgbench with -C option"
>
> test2 and test4 require pcp_detach_node before shutting down node 2.
>
> Patch:
>
> Attached is the v1 patch for this feature. Comments and suggestions
> are welcome.
Patch pushed to master branch.
Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp