[pgpool-hackers: 3373] Re: Behavior of resigned master watchdog node
Muhammad Usama
m.usama at gmail.com
Thu Aug 8 15:21:24 JST 2019
Hi Ishii-San
Thanks for providing the log files along with the steps to reproduce the
issue. I have pushed the fix for it:
https://git.postgresql.org/gitweb/?p=pgpool2.git;a=commitdiff;h=2d7702c6961e8949e97992c23a42c8616af36d84
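For anyone who wants to double-check the fix on the same harness, here is a
minimal verification sketch. It reuses the commands and port layout from
Ishii-san's steps quoted below (a condensed script of the full reproduction
also follows at the end of this message):

# After backend 0 has been quarantined on the master watchdog node
# (step 2 below), the master role should have moved off pgpool0:
pcp_watchdog_info -p 50001 -v -w | grep "Master Node Name"

# ...and "show pool_nodes" should still report exactly one primary
# backend. Before the fix, no row said "primary" any more.
psql -p 50000 -c "show pool_nodes" test | grep primary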
Thanks
Best Regards
Muhammad Usama
On Fri, Aug 2, 2019 at 4:50 PM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> > Hi Ishii-San
> >
> > On Fri, Aug 2, 2019 at 10:25 AM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> >
> >> Hi Usama,
> >>
> >> In your commit:
> >>
> >>
> https://git.postgresql.org/gitweb/?p=pgpool2.git;a=commit;h=33df0d33df1ce701f07fecaeef5b87a2707c08f2
> >>
> >> "the master watchdog node should resign from master responsibilities
> >> if the primary backend node gets into quarantine state on that."
> >>
> >> While testing this feature, I noticed that the master watchdog node
> >> resigns from master responsibilities as expected in this case. However,
> >> none of the standby nodes gets promoted to a new primary. As a result,
> >> there's no primary node any more after the master watchdog resigns.
> >>
> >> This is not very good because users cannot
> >
> >
> > This is not at all the expected behavior. Can you please share the logs
> > or steps to reproduce the issue?
> > I will look into this with priority.
>
> Sure. Pgpool logs attached.
>
> Here are the steps to reproduce the issue.
>
> - Pgpool-II master branch head with patch [pgpool-hackers: 3361].
>
> 1) Create a cluster with 3 watchdog nodes and 3 PostgreSQL nodes: watchdog_setup -wn 3 -n 3
>
> $ psql -p 50000 -c "show pool_nodes" test
>
>  node_id | hostname | port  | status | lb_weight | role    | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
> ---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
>  0       | /tmp     | 51000 | up     | 0.333333  | primary | 0          | true              | 0                 |                   |                        | 2019-08-02 20:38:25
>  1       | /tmp     | 51001 | up     | 0.333333  | standby | 0          | false             | 0                 | streaming         | async                  | 2019-08-02 20:38:25
>  2       | /tmp     | 51002 | up     | 0.333333  | standby | 0          | false             | 0                 | streaming         | async                  | 2019-08-02 20:38:25
> (3 rows)
>
> $ pcp_watchdog_info -p 50001 -v -w
>
> Watchdog Cluster Information
> Total Nodes : 3
> Remote Nodes : 2
> Quorum state : QUORUM EXIST
> Alive Remote Nodes : 2
> VIP up on local node : YES
> Master Node Name : localhost:50000 Linux tishii-CFSV7-1
> Master Host Name : localhost
>
> Watchdog Node Information
> Node Name : localhost:50000 Linux tishii-CFSV7-1
> Host Name : localhost
> Delegate IP : Not_Set
> Pgpool port : 50000
> Watchdog port : 50002
> Node priority : 3
> Status : 4
> Status Name : MASTER
>
> Node Name : localhost:50004 Linux tishii-CFSV7-1
> Host Name : localhost
> Delegate IP : Not_Set
> Pgpool port : 50004
> Watchdog port : 50006
> Node priority : 2
> Status : 7
> Status Name : STANDBY
>
> Node Name : localhost:50008 Linux tishii-CFSV7-1
> Host Name : localhost
> Delegate IP : Not_Set
> Pgpool port : 50008
> Watchdog port : 50010
> Node priority : 1
> Status : 7
> Status Name : STANDBY
>
> 2) echo "0 down" > pgpool0/log
>
> 3) Check the result.
>
> $ psql -p 50000 -c "show pool_nodes" test
>
>  node_id | hostname | port  | status | lb_weight | role    | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
> ---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
>  0       | /tmp     | 51000 | up     | 0.333333  | standby | 0          | true              | 0                 |                   |                        | 2019-08-02 20:39:25
>  1       | /tmp     | 51001 | up     | 0.333333  | standby | 0          | false             | 0                 |                   |                        | 2019-08-02 20:39:25
>  2       | /tmp     | 51002 | up     | 0.333333  | standby | 0          | false             | 0                 |                   |                        | 2019-08-02 20:39:25
> (3 rows)
>
> $ pcp_watchdog_info -p 50001 -v -w
> Watchdog Cluster Information
> Total Nodes : 3
> Remote Nodes : 2
> Quorum state : QUORUM EXIST
> Alive Remote Nodes : 2
> VIP up on local node : NO
> Master Node Name : localhost:50004 Linux tishii-CFSV7-1
> Master Host Name : localhost
>
> Watchdog Node Information
> Node Name : localhost:50000 Linux tishii-CFSV7-1
> Host Name : localhost
> Delegate IP : Not_Set
> Pgpool port : 50000
> Watchdog port : 50002
> Node priority : 3
> Status : 7
> Status Name : STANDBY
>
> Node Name : localhost:50004 Linux tishii-CFSV7-1
> Host Name : localhost
> Delegate IP : Not_Set
> Pgpool port : 50004
> Watchdog port : 50006
> Node priority : 2
> Status : 4
> Status Name : MASTER
>
> Node Name : localhost:50008 Linux tishii-CFSV7-1
> Host Name : localhost
> Delegate IP : Not_Set
> Pgpool port : 50008
> Watchdog port : 50010
> Node priority : 1
> Status : 7
> Status Name : STANDBY
>
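Condensed, the reproduction above amounts to the following script. This is a
sketch: it assumes the pgpool0..pgpool2 directory layout, the port numbering,
and the generated startall script that watchdog_setup -wn 3 -n 3 produces,
and the sleep lengths are arbitrary settling times.

#!/bin/sh
# Reproduce [pgpool-hackers: 3373] on a throwaway harness.
watchdog_setup -wn 3 -n 3
./startall
sleep 30

# Make backend 0 look down to pgpool0 (the watchdog master) only, so
# the primary backend is quarantined there rather than failed over.
echo "0 down" > pgpool0/log
sleep 30

# Broken behavior: the watchdog master role moves, but no standby is
# promoted, so the grep below finds no "primary" row.
pcp_watchdog_info -p 50001 -v -w | grep "Master Node Name"
psql -p 50000 -c "show pool_nodes" test | grep primary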