<div dir="ltr"><div dir="ltr">On Mon, Jan 16, 2023 at 1:33 AM Tatsuo Ishii <<a href="mailto:ishii@sraoss.co.jp">ishii@sraoss.co.jp</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">> We are seeing failures in our test suite on a specific set of tests related<br>

> to taking a node out of a cluster. In short, it seems the following sequence<br>

> of events occurs:<br>

> * We start with a healthy cluster with 3 nodes (0, 1 and 2), each node<br>

> running pgpool and postgresql. Node 0 runs the primary database.<br>

> * node 1 is shut down<br>

> * pgpool on node 0 and 2 correctly mark backend 1 down<br>

> * pgpool on node 0 is reconfigured, removing node 1 from the configuration,<br>

> backend 0 remains backend 0, backend 2 is now known as backend 1<br>

> * pgpool on node 0 starts up again, and receives the cluster status from<br>

> node 2, which includes backend 1 being down.<br>

> * pgpool on node 0 now also marks backend 1 as being down, but because of<br>

> the renumbering, it actually marks the backend on node 2 as down<br>

> * pgpool on node 2 gets its new configuration, same as on node 0<br>

> * pgpool on node 2 (which now runs backend 1) gets the cluster status<br>

> from node 0, and marks backend 1 down<br>

> * the cluster ends up with pgpool and postgresql running on both remaining<br>

> nodes, but backend 1 is down. It never recovers from this state<br>

> automatically, even though auto_failback is enabled and postgresql is up<br>

> and streaming.<br>

> <br>

> For node 2 (with backend 1), pcp_node_info returns the following<br>

> information for backend 1:<br>

> Hostname               : 172.29.30.3<br>

> Port                   : 5432<br>

> Status                 : 3<br>

> Weight                 : 0.500000<br>

> Status Name            : down<br>

> Backend Status Name    : up<br>

> Role                   : standby<br>

> Backend Role           : standby<br>

> Replication Delay      : 0<br>

> Replication State      : streaming<br>

> Replication Sync State : async<br>

> Last Status Change     : 2023-01-09 22:28:41<br>

> <br>

> My first question is: Can we somehow prevent the state of backend 1 being<br>

> assigned to the wrong node during the configuration update?<br>

<br>

Have you removed pgpool_status file before restarting pgpool?  The<br>

file remembers the backend status along with node id hence you need to<br>

update the file. If the file does not exist upon pgpool startup, it<br>

will be automatically created.<br></blockquote><div><br></div><div>Yes, we remove the status file whenever we change the pgpool configuration. From what we can see in the logs, the backend is set to down after the status is synced across the cluster. Are backends identified by their index in the cluster? After node 0 gets its new configuration, its backend 1 points to node 2, while on node 2, backend 1 still points to the former node 1. It seems this causes the backends to get mixed up, so the wrong one is marked down.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

> My second question: Why does the auto_failback not reattach backend 1 when<br>

> it detects the database is up and streaming?<br>

<br>

Maybe because of this?<br>

<br>

<a href="https://www.pgpool.net/docs/44/en/html/runtime-config-failover.html#RUNTIME-CONFIG-FAILOVER-SETTINGS" rel="noreferrer" target="_blank">https://www.pgpool.net/docs/44/en/html/runtime-config-failover.html#RUNTIME-CONFIG-FAILOVER-SETTINGS</a><br>

<br>

> Note: auto_failback may not work, when replication slot is used. There<br>

> is possibility that the streaming replication is stopped, because<br>

> failover_command is executed and replication slot is deleted by the<br>

> command.<br></blockquote><div> </div><div>We do not use replication slots, or at least we do not create them manually. In this scenario we also do not perform a failover. The primary database runs on node 0 and is never taken offline. It is the standby database on node 1 that is taken offline. Backend 1 (the backend on node 2), which is marked down, is not touched either. In the database logs, I can see that the databases are running and never lost their connection.</div><div><br></div><div>Best regards,</div><div>Emond</div></div></div>