[pgpool-general: 8315] Degenerate cluster
Jon SCHEWE
jon.schewe at raytheon.com
Fri Jul 8 03:34:19 JST 2022
Is it possible to have pgpool keep running with a single node in the cluster?
I have 3 nodes in my cluster and I want to upgrade PostgreSQL on them with minimal downtime.
I figured I would shut down PostgreSQL and pgpool on 2 of them and leave the 3rd one up until the others are upgraded and ready.
However, when I stopped pgpool on 2 nodes, the 3rd node (psql-dev-03) gave up the virtual IP and I now have no database connection.
Is this expected?
pgpool logs from psql-dev-03 are below.
2022-07-07 13:10:08.210: watchdog pid 338248: LOG: remote node "psql-dev-01:9898 Linux psql-dev-01" is shutting down
2022-07-07 13:10:08.211: watchdog pid 338248: LOG: watchdog cluster has lost the coordinator node
2022-07-07 13:10:08.211: watchdog pid 338248: LOG: removing the remote node "psql-dev-01:9898 Linux psql-dev-01" from watchdog cluster leader
2022-07-07 13:10:08.211: watchdog pid 338248: LOG: We have lost the cluster leader node "psql-dev-01:9898 Linux psql-dev-01"
2022-07-07 13:10:08.211: watchdog pid 338248: LOG: watchdog node state changed from [STANDBY] to [JOINING]
2022-07-07 13:10:08.211: watchdog pid 338248: LOG: watchdog node state changed from [JOINING] to [INITIALIZING]
2022-07-07 13:10:09.213: watchdog pid 338248: LOG: watchdog node state changed from [INITIALIZING] to [STANDING FOR LEADER]
2022-07-07 13:10:09.213: watchdog pid 338248: LOG: watchdog node state changed from [STANDING FOR LEADER] to [LEADER]
2022-07-07 13:10:09.213: watchdog pid 338248: LOG: I am announcing my self as leader/coordinator watchdog node
2022-07-07 13:10:09.214: watchdog pid 338248: LOG: I am the cluster leader node
2022-07-07 13:10:09.214: watchdog pid 338248: DETAIL: our declare coordinator message is accepted by all nodes
2022-07-07 13:10:09.214: watchdog pid 338248: LOG: setting the local node "psql-dev-03:9898 Linux psql-dev-03" as watchdog cluster leader
2022-07-07 13:10:09.214: watchdog pid 338248: LOG: signal_user1_to_parent_with_reason(1)
2022-07-07 13:10:09.214: watchdog pid 338248: LOG: I am the cluster leader node but we do not have enough nodes in cluster
2022-07-07 13:10:09.214: watchdog pid 338248: DETAIL: waiting for the quorum to start escalation process
2022-07-07 13:10:09.214: main pid 338225: LOG: Pgpool-II parent process received SIGUSR1
2022-07-07 13:10:09.214: main pid 338225: LOG: Pgpool-II parent process received watchdog state change signal from watchdog
2022-07-07 13:10:09.214: watchdog pid 338248: LOG: new IPC connection received
2022-07-07 13:10:10.216: watchdog pid 338248: LOG: adding watchdog node "psql-dev-02:9898 Linux psql-dev-02" to the standby list
2022-07-07 13:10:10.216: watchdog pid 338248: LOG: quorum found
2022-07-07 13:10:10.216: watchdog pid 338248: DETAIL: starting escalation process
2022-07-07 13:10:10.217: watchdog pid 338248: LOG: escalation process started with PID:937636
2022-07-07 13:10:10.217: watchdog pid 338248: LOG: signal_user1_to_parent_with_reason(3)
2022-07-07 13:10:10.217: main pid 338225: LOG: Pgpool-II parent process received SIGUSR1
2022-07-07 13:10:10.217: watchdog_utility pid 937636: LOG: watchdog: escalation started
2022-07-07 13:10:10.217: main pid 338225: LOG: Pgpool-II parent process received watchdog quorum change signal from watchdog
2022-07-07 13:10:10.218: watchdog pid 338248: LOG: new IPC connection received
2022-07-07 13:10:10.218: main pid 338225: LOG: watchdog cluster now holds the quorum
2022-07-07 13:10:10.218: main pid 338225: DETAIL: updating the state of quarantine backend nodes
2022-07-07 13:10:10.218: watchdog pid 338248: LOG: new IPC connection received
+ PGPOOLS=(psql-dev-01 psql-dev-02 psql-dev-03)
+ VIP=XXX.XXX.XXX.60
+ DEVICE=ens192
+ for pgpool in "${PGPOOLS[@]}"
+ '[' psql-dev-03 = psql-dev-01 ']'
+ ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null postgres@psql-dev-01 -i /var/lib/pgsql/.ssh/id_rsa_pgpool '
/usr/bin/sudo /sbin/ip addr del XXX.XXX.XXX.60/24 dev ens192
'
Warning: Permanently added 'psql-dev-01,XXX.XXX.XXX.61' (ECDSA) to the list of known hosts.
RTNETLINK answers: Cannot assign requested address
+ for pgpool in "${PGPOOLS[@]}"
+ '[' psql-dev-03 = psql-dev-02 ']'
+ ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null postgres@psql-dev-02 -i /var/lib/pgsql/.ssh/id_rsa_pgpool '
/usr/bin/sudo /sbin/ip addr del XXX.XXX.XXX.60/24 dev ens192
'
Warning: Permanently added 'psql-dev-02,XXX.XXX.XXX.62' (ECDSA) to the list of known hosts.
RTNETLINK answers: Cannot assign requested address
+ for pgpool in "${PGPOOLS[@]}"
+ '[' psql-dev-03 = psql-dev-03 ']'
+ ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null postgres@psql-dev-03 -i /var/lib/pgsql/.ssh/id_rsa_pgpool '
/usr/bin/sudo /sbin/ip addr del XXX.XXX.XXX.60/24 dev ens192
'
Warning: Permanently added 'psql-dev-03,XXX.XXX.XXX.63' (ECDSA) to the list of known hosts.
RTNETLINK answers: Cannot assign requested address
+ exit 0
2022-07-07 13:10:11.325: watchdog_utility pid 937636: LOG: watchdog escalation successful
2022-07-07 13:10:13.961: watchdog pid 338248: LOG: new IPC connection received
2022-07-07 13:10:15.636: watchdog_utility pid 937636: LOG: successfully acquired the delegate IP:"XXX.XXX.XXX.60"
2022-07-07 13:10:15.636: watchdog_utility pid 937636: DETAIL: 'if_up_cmd' returned with success
2022-07-07 13:10:15.638: watchdog pid 338248: LOG: watchdog escalation process with pid: 937636 exit with SUCCESS.
2022-07-07 13:10:23.989: watchdog pid 338248: LOG: new IPC connection received
2022-07-07 13:10:31.733: psql pid 937339: LOG: pool_reuse_block: blockid: 0
2022-07-07 13:10:31.733: psql pid 937339: CONTEXT: while searching system catalog, When relcache is missed
2022-07-07 13:10:34.017: watchdog pid 338248: LOG: new IPC connection received
2022-07-07 13:10:40.399: life_check pid 338253: LOG: informing the node status change to watchdog
2022-07-07 13:10:40.399: life_check pid 338253: DETAIL: node id :0 status = "NODE DEAD" message:"No heartbeat signal from node"
2022-07-07 13:10:40.399: watchdog pid 338248: LOG: new IPC connection received
2022-07-07 13:10:40.399: watchdog pid 338248: LOG: received node status change ipc message
2022-07-07 13:10:40.399: watchdog pid 338248: DETAIL: No heartbeat signal from node
2022-07-07 13:10:40.399: watchdog pid 338248: LOG: remote node "psql-dev-01:9898 Linux psql-dev-01" is shutting down
2022-07-07 13:10:44.045: watchdog pid 338248: LOG: new IPC connection received
2022-07-07 13:10:54.073: watchdog pid 338248: LOG: new IPC connection received
2022-07-07 13:11:04.102: watchdog pid 338248: LOG: new IPC connection received
2022-07-07 13:11:07.206: watchdog pid 338248: LOG: remote node "psql-dev-02:9898 Linux psql-dev-02" is shutting down
2022-07-07 13:11:07.206: watchdog pid 338248: LOG: removing watchdog node "psql-dev-02:9898 Linux psql-dev-02" from the standby list
2022-07-07 13:11:07.206: watchdog pid 338248: LOG: We have lost the quorum
2022-07-07 13:11:07.207: watchdog pid 338248: LOG: signal_user1_to_parent_with_reason(3)
2022-07-07 13:11:07.207: main pid 338225: LOG: Pgpool-II parent process received SIGUSR1
2022-07-07 13:11:07.207: main pid 338225: LOG: Pgpool-II parent process received watchdog quorum change signal from watchdog
2022-07-07 13:11:07.207: watchdog_utility pid 937919: LOG: watchdog: de-escalation started
2022-07-07 13:11:07.207: watchdog pid 338248: LOG: new IPC connection received
2022-07-07 13:11:07.408: watchdog_utility pid 937919: LOG: successfully released the delegate IP:"XXX.XXX.XXX.60"
2022-07-07 13:11:07.408: watchdog_utility pid 937919: DETAIL: 'if_down_cmd' returned with success
2022-07-07 13:11:07.411: watchdog pid 338248: LOG: watchdog de-escalation process with pid: 937919 exit with SUCCESS.
2022-07-07 13:11:14.130: watchdog pid 338248: LOG: new IPC connection received
2022-07-07 13:11:24.158: watchdog pid 338248: LOG: new IPC connection received
2022-07-07 13:11:34.186: watchdog pid 338248: LOG: new IPC connection received
2022-07-07 13:11:40.400: life_check pid 338253: LOG: informing the node status change to watchdog
2022-07-07 13:11:40.400: life_check pid 338253: DETAIL: node id :1 status = "NODE DEAD" message:"No heartbeat signal from node"
2022-07-07 13:11:40.400: watchdog pid 338248: LOG: new IPC connection received
2022-07-07 13:11:40.400: watchdog pid 338248: LOG: received node status change ipc message
2022-07-07 13:11:40.400: watchdog pid 338248: DETAIL: No heartbeat signal from node
2022-07-07 13:11:40.400: watchdog pid 338248: LOG: remote node "psql-dev-02:9898 Linux psql-dev-02" is shutting down
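
My reading of the log is that the watchdog wants a strict majority of the configured nodes before it will hold the delegate IP: escalation started once psql-dev-02 joined as a standby (2 of 3 nodes), and the "We have lost the quorum" line appeared as soon as psql-dev-03 was alone (1 of 3). Below is a minimal sketch of that arithmetic. It assumes a strict-majority rule, and has_quorum is a hypothetical helper of mine, not a Pgpool-II command or function.

```shell
#!/usr/bin/env bash
# Hypothetical helper, NOT part of Pgpool-II.
# Assumption: quorum means strictly more than half of the configured
# watchdog nodes are alive.
has_quorum() {
    local total=$1 alive=$2
    # Strict majority: alive * 2 must exceed total.
    if (( alive * 2 > total )); then
        echo yes
    else
        echo no
    fi
}

# Replay the node counts seen in the log for a 3-node cluster.
for alive in 3 2 1; do
    echo "total=3 alive=${alive} quorum=$(has_quorum 3 "${alive}")"
done
```

If that assumption is right, a single surviving node out of three can never satisfy the check, which would match the de-escalation I am seeing.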
Jon