[pgpool-general: 7682] Re: Failover question
Wolf Schwurack
wolf at uen.org
Thu Sep 2 02:34:32 JST 2021
I found the issue: /bin/ip did not have the setuid bit set on node3. After setting the setuid bit on /bin/ip, failover to node3 works.
root at pgtest-03:~# ll /bin/ip
-rwxr-xr-x 1 root root 611960 Feb 13 2020 /bin/ip*
root at pgtest-03:~# chmod u=rwxs,g=rx,o=rx /bin/ip
root at pgtest-03:~# ll /bin/ip
-rwsr-xr-x 1 root root 611960 Feb 13 2020 /bin/ip*
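To compare nodes quickly, a small check along these lines may help. It is only a sketch: the function name has_setuid is made up for illustration, and it assumes ip lives at /bin/ip as on the hosts in this thread. The shell's -u test is true when the file exists and has its set-user-ID bit set, which is the "s" in the second listing above.

```shell
#!/bin/sh
# has_setuid is a hypothetical helper, not part of Pgpool-II.
# Prints "yes" if the given file has the set-user-ID bit, "no" otherwise.
has_setuid() {
    if [ -u "$1" ]; then
        echo yes
    else
        echo no
    fi
}

# Run on each node and compare the results.
# /bin/ip is the path used on the hosts in this thread; adjust if yours differs.
echo "/bin/ip setuid: $(has_setuid /bin/ip)"
```

A node that prints "no" here while the others print "yes" would be a likely candidate for the failure described in this thread.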
On 9/1/21, 10:54 AM, "pgpool-general on behalf of Wolf Schwurack" <pgpool-general-bounces at pgpool.net on behalf of wolf at uen.org> wrote:
Hello
This shows pcp_watchdog_info after node1 is added back in
postgres at pgtest-02:~$ pcp_watchdog_info -h localhost -U wolf
Password:
3 YES pgtest-02:9999 Linux pgtest-02 pgtest-02
pgtest-02:9999 Linux pgtest-02 pgtest-02 9999 9000 4 LEADER
pgtest-01:9999 Linux pgtest-01 pgtest-01 9999 9000 7 STANDBY
pgtest-03:9999 Linux pgtest-03 pgtest-03 9999 9000 7 STANDBY
Here's the pgpool.log from node3 after shutdown of pgpool on node2
2021-09-01 10:43:38: pid 417478: LOG: adding watchdog node "pgtest-01:9999 Linux pgtest-01" to the standby list
2021-09-01 10:43:38: pid 417478: LOG: quorum found
2021-09-01 10:43:38: pid 417478: DETAIL: starting escalation process
2021-09-01 10:43:38: pid 417478: LOG: escalation process started with PID:554759
2021-09-01 10:43:38: pid 417478: LOG: signal_user1_to_parent_with_reason(3)
2021-09-01 10:43:38: pid 417474: LOG: Pgpool-II parent process received SIGUSR1
2021-09-01 10:43:38: pid 417474: LOG: Pgpool-II parent process received watchdog quorum change signal from watchdog
2021-09-01 10:43:38: pid 417478: LOG: new IPC connection received
2021-09-01 10:43:38: pid 417474: LOG: watchdog cluster now holds the quorum
2021-09-01 10:43:38: pid 417474: DETAIL: updating the state of quarantine backend nodes
2021-09-01 10:43:38: pid 417478: LOG: new IPC connection received
2021-09-01 10:43:38: pid 554759: LOG: watchdog: escalation started
RTNETLINK answers: Operation not permitted
2021-09-01 10:43:38: pid 554759: LOG: failed to acquire the delegate IP address
2021-09-01 10:43:38: pid 554759: DETAIL: 'if_up_cmd' failed
2021-09-01 10:43:38: pid 554759: WARNING: watchdog escalation failed to acquire delegate IP
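When comparing nodes, it can help to pull just the escalation failures out of the log. A minimal sketch, assuming the log is a plain-text file: the function name vip_failures and the path in the usage line are placeholders, and the patterns are simply the messages shown in the log above.

```shell
#!/bin/sh
# vip_failures is a hypothetical helper; the patterns are copied from the log above.
# Prints every line suggesting the delegate IP could not be brought up.
vip_failures() {
    grep -E "RTNETLINK answers|failed to acquire (the )?delegate IP|'if_up_cmd' failed" "$1"
}
```

Usage would be something like "vip_failures /path/to/pgpool.log" on each node. No output means none of these messages were logged.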
Here's pcp_watchdog_info on node3 after shutdown of pgpool on node2
postgres at pgtest-03:~$ pcp_watchdog_info -h localhost -U wolf
Password:
3 YES pgtest-03:9999 Linux pgtest-03 pgtest-03
pgtest-03:9999 Linux pgtest-03 pgtest-03 9999 9000 4 LEADER
pgtest-01:9999 Linux pgtest-01 pgtest-01 9999 9000 7 STANDBY
pgtest-02:9999 Linux pgtest-02 pgtest-02 9999 9000 10 SHUTDOWN
Here's pcp_watchdog_info on node3 after start of pgpool on node2
postgres at pgtest-03:~$ pcp_watchdog_info -h localhost -U wolf
Password:
3 YES pgtest-03:9999 Linux pgtest-03 pgtest-03
pgtest-03:9999 Linux pgtest-03 pgtest-03 9999 9000 4 LEADER
pgtest-01:9999 Linux pgtest-01 pgtest-01 9999 9000 7 STANDBY
pgtest-02:9999 Linux pgtest-02 pgtest-02 9999 9000 7 STANDBY
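To see at a glance which node the watchdog considers leader, the output above can be filtered. This is a sketch under assumptions: wd_leader is a made-up name, and it relies on the layout shown above, where the host name is the fourth whitespace-separated column and the state name is the last one. Node names with a different number of words would shift the columns.

```shell
#!/bin/sh
# wd_leader is a hypothetical helper. It reads pcp_watchdog_info output on stdin
# and prints the host name (4th column) of each node whose last column is LEADER.
wd_leader() {
    awk '$NF == "LEADER" { print $4 }'
}
```

For example: "pcp_watchdog_info -h localhost -U wolf | wd_leader". The host it prints is the one expected to hold the watchdog IP.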
Still no watchdog IP enabled. This line in the node3 log seems to be the issue; maybe a permissions problem?
RTNETLINK answers: Operation not permitted
Wolf
On 8/29/21, 9:18 PM, "Bo Peng" <pengbo at sraoss.co.jp> wrote:
Hello,
> Sorry, but you missed the part where node 1 was added back as a standby after the failover to node 2. At the point when I turn off pgpool on node 2, node 1 and node 3 are the standby nodes, so node 3 should take over the watchdog.
I have tested Pgpool-II 4.2.4, but I could not reproduce this issue.
Could you share the following information?
- result of "pcp_watchdog_info" after adding back node1 as a standby
- pgpool logs of node 1 and node 3 after turning off pgpool on node2.
> Wolf
>
> On 8/27/21, 10:07 AM, "Bo Peng" <pengbo at sraoss.co.jp> wrote:
>
> Hello,
>
> > My question is why watchdog doesn’t come up on node 3. Pgpool.conf is set the same on all 3 nodes.
>
> If you shut down pgpool on node1 and node2, only one pgpool node is alive,
> so the quorum does not exist.
>
> If you want to enable watchdog even if the quorum does not exist,
> you need to enable the parameter "enable_consensus_with_half_votes".
>
> See more detail about "enable_consensus_with_half_votes":
> https://www.pgpool.net/docs/latest/en/html/runtime-watchdog-config.html#GUC-ENABLE-CONSENSUS-WITH-HALF-VOTES
>
> > I have a 3-node setup for pgpool/postgresql using watchdog. When testing pgpool failover, I turn off pgpool on node 1, which fails watchdog over to node 2. Then I turn pgpool on node 1 back on, which sets node 1 as a standby node. Next I turn off pgpool on node 2; watchdog tries to fail over to node 3, but the watchdog IP never comes up on node 3 or on any of the nodes. So I turn off pgpool on node 3 and watchdog fails over to node 1.
> > My question is why watchdog doesn’t come up on node 3. Pgpool.conf is set the same on all 3 nodes.
> >
> > Here’s my output of show pool_nodes
> >
> >  node_id | hostname  | port | status | lb_weight | role    | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
> > ---------+-----------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
> >  0       | pgtest-01 | 5432 | up     | 0.500000  | primary | 2003       | true              | 0                 |                   |                        | 2021-08-24 14:06:20
> >  1       | pgtest-02 | 5432 | up     | 0.500000  | standby | 667        | false             | 0                 | streaming         | async                  | 2021-08-24 14:06:20
> >  2       | pgtest-03 | 5432 | up     | 0.000000  | standby | 0          | false             | 0                 | streaming         | async                  | 2021-08-24 14:06:20
> >
> > Not sure if this is an issue, but lb_weight shows node 1 (pgtest-01) and node 2 (pgtest-02) as 0.500000 and node 3 (pgtest-03) as 0.000000.
> >
> > In pgpool.conf I have backend_weight for each node set to 0.3
> >
> > Hosts = Ubuntu 20.04
> > Pgpool = 4.2.4
> > PostgreSQL = 12.8
> >
> >
> > -- Wolf
> >
> >
>
>
> --
> Bo Peng <pengbo at sraoss.co.jp>
> SRA OSS, Inc. Japan
> http://www.sraoss.co.jp/
>
--
Bo Peng <pengbo at sraoss.co.jp>
SRA OSS, Inc. Japan
http://www.sraoss.co.jp/