[pgpool-hackers: 4544] PGPool4.2 changing leader role and healthcheck
Igor Yurchenko
harry.urcen at gmail.com
Tue Dec 3 02:26:23 JST 2024
Hi guys
I need your hints on some weird behavior of PGPool 4.2.
1. I have two pgpool instances that watch each other and handle the pgpool VIP.
I see that when the current pgpool leader goes down, the role is switched and
the VIP is moved with a significant delay. In the logs I see the following picture:
2024-12-02 14:40:12: pid 1286: LOG: watchdog node state changed from [INITIALIZING] to [LEADER]
2024-12-02 14:40:12: pid 1286: LOG: Setting failover command timeout to 1
2024-12-02 14:40:12: pid 1286: LOG: I am announcing my self as leader/coordinator watchdog node
2024-12-02 14:40:16: pid 1286: LOG: I am the cluster leader node
2024-12-02 14:40:16: pid 1286: DETAIL: our declare coordinator message is accepted by all nodes
2024-12-02 14:40:16: pid 1286: LOG: setting the local node "10.65.188.56:9999 Linux pg-mgrdb2" as watchdog cluster leader
2024-12-02 14:40:16: pid 1286: LOG: signal_user1_to_parent_with_reason(1)
2024-12-02 14:40:16: pid 1286: LOG: I am the cluster leader node. Starting escalation process
2024-12-02 14:40:16: pid 1281: LOG: Pgpool-II parent process received SIGUSR1
2024-12-02 14:40:16: pid 1281: LOG: Pgpool-II parent process received watchdog state change signal from watchdog
2024-12-02 14:40:16: pid 1286: LOG: escalation process started with PID:4855
2024-12-02 14:40:16: pid 4855: LOG: watchdog: escalation started
2024-12-02 14:40:20: pid 4855: LOG: successfully acquired the delegate IP:"10.65.188.59"
2024-12-02 14:40:20: pid 4855: DETAIL: 'if_up_cmd' returned with success
2024-12-02 14:40:20: pid 1286: LOG: watchdog escalation process with pid: 4855 exit with SUCCESS.
There are significant delays after 14:40:12 (before the coordinator declaration is accepted) and again on acquiring the VIP after 14:40:16.
The quorum settings in pgpool.conf are:
failover_when_quorum_exists=off
failover_require_consensus=on
allow_multiple_failover_requests_from_node=off
So I have no idea why this happens.
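For completeness, the watchdog/VIP-related settings I am looking at are roughly as follows. The interface name and the numeric values are placeholders close to the documented 4.2 defaults, not necessarily my exact configuration; the delegate IP is the one from the log above:

    use_watchdog = on
    delegate_ip = '10.65.188.59'
    if_up_cmd = '/usr/bin/sudo /sbin/ip addr add $_IP_$/24 dev eth0 label eth0:0'
    if_down_cmd = '/usr/bin/sudo /sbin/ip addr del $_IP_$/24 dev eth0'
    arping_cmd = '/usr/bin/sudo /usr/sbin/arping -U $_IP_$ -w 1 -I eth0'
    wd_lifecheck_method = 'heartbeat'
    wd_interval = 10
    wd_heartbeat_keepalive = 2
    wd_heartbeat_deadtime = 30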
2. The second question is about the health check logic. Do I understand
correctly that once a backend goes into the down state, its health check is stopped?
If so, how can I detect that a failed backend has come back (after a hardware
issue, for example) and should be recovered?
Or is this impossible within pgpool, so I should use third-party tools for
tracking the backends and triggering the recovery? If so, I have sketched below what I have in mind.
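In case external tooling is indeed the expected approach, the rough idea is a periodic job around the pcp commands, something like the following (hostname, pcp port, pcp user and node id are placeholders):

    # check how pgpool currently sees backend node 0
    pcp_node_info -h 10.65.188.59 -p 9898 -U pcp_admin -n 0

    # once the backend is reachable again, re-attach it to the pool
    pcp_attach_node -h 10.65.188.59 -p 9898 -U pcp_admin -n 0

Is that the intended way, or is there a built-in mechanism I am missing?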
BR
Igor Yurchenko