[pgpool-hackers: 3376] Re: Behavior of resigned master watchdog node
Muhammad Usama
m.usama at gmail.com
Thu Aug 8 17:57:49 JST 2019
Hi Ishii-San
:-) Thanks for the confirmation.
Best Regards
Muhammad Usama
On Thu, Aug 8, 2019 at 1:28 PM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> Oops. I should have connected to the master Pgpool-II:
>
> t-ishii$ psql -p 50004 -c "show pool_nodes" test
>  node_id | hostname | port  | status | lb_weight | role    | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
> ---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
>  0       | /tmp     | 51000 | up     | 0.333333  | primary | 0          | true              | 0                 |                   |                        | 2019-08-08 17:26:27
>  1       | /tmp     | 51001 | up     | 0.333333  | standby | 0          | false             | 0                 | streaming         | async                  | 2019-08-08 17:26:27
>  2       | /tmp     | 51002 | up     | 0.333333  | standby | 0          | false             | 0                 | streaming         | async                  | 2019-08-08 17:26:27
> (3 rows)
>
> Now node 0 is the primary and it works as I expected.
> Sorry for the confusion.
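
For reference, one way to find out which Pgpool-II instance currently holds the watchdog master role before running "show pool_nodes" (a minimal sketch using the PCP and Pgpool ports of the watchdog_setup cluster in this thread; adjust the ports for other setups):

    # ask any node's PCP port which watchdog node is currently the master
    pcp_watchdog_info -p 50001 -v -w | grep 'Master Node Name'

    # then run "show pool_nodes" against that node's Pgpool port
    # (port 50004 was the master Pgpool-II in the run above)
    psql -p 50004 -c "show pool_nodes" test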
>
> > Hi Usama,
> >
> > Thanks for the commit. Unfortunately the fix did not work for me.
> >
> > psql -p 50000 -c "show pool_nodes" test
> >  node_id | hostname | port  | status     | lb_weight | role    | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
> > ---------+----------+-------+------------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
> >  0       | /tmp     | 51000 | quarantine | 0.333333  | standby | 0          | false             | 0                 |                   |                        | 2019-08-08 17:09:17
> >  1       | /tmp     | 51001 | up         | 0.333333  | standby | 0          | false             | 0                 |                   |                        | 2019-08-08 17:09:00
> >  2       | /tmp     | 51002 | up         | 0.333333  | standby | 0          | true              | 0                 |                   |                        | 2019-08-08 17:09:00
> > (3 rows)
> >
> > There's no primary PostgreSQL node after node 0 was quarantined.
> > Pgpool log attached.
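
As an aside, a quick scripted check for the missing primary (a sketch that assumes psql's unaligned output, where the role is the sixth '|'-separated field of "show pool_nodes"):

    # count how many backend nodes report the "primary" role; 0 means no primary
    psql -p 50000 -tA -c "show pool_nodes" test | awk -F'|' '$6 == "primary" {n++} END {print n+0}'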
> >
> > From: Muhammad Usama <m.usama at gmail.com>
> > Subject: Re: Behavior of resigned master watchdog node
> > Date: Thu, 8 Aug 2019 11:21:24 +0500
> > Message-ID: <CAEJvTzUpYZHc=T8FvOB12N=gC_Qj961dZEpRZTggsLHz_gJqSQ at mail.gmail.com>
> >
> >> Hi Ishii-San
> >>
> >> Thanks for providing the log files along with the steps to reproduce the
> >> issue. I have pushed the fix for it:
> >>
> >> https://git.postgresql.org/gitweb/?p=pgpool2.git;a=commitdiff;h=2d7702c6961e8949e97992c23a42c8616af36d84
> >>
> >> Thanks
> >> Best Regards
> >> Muhammad Usama
> >>
> >>
> >> On Fri, Aug 2, 2019 at 4:50 PM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> >>
> >>> > Hi Ishii-San
> >>> >
> >>> > On Fri, Aug 2, 2019 at 10:25 AM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> >>> >
> >>> >> Hi Usama,
> >>> >>
> >>> >> In your commit:
> >>> >>
> >>> >> https://git.postgresql.org/gitweb/?p=pgpool2.git;a=commit;h=33df0d33df1ce701f07fecaeef5b87a2707c08f2
> >>> >>
> >>> >> "the master watchdog node should resign from master responsibilities
> >>> >> if the primary backend node gets into quarantine state on that."
> >>> >>
> >>> >> While testing this feature, I noticed that the master watchdog node
> >>> >> resigns from master responsibilities as expected in this case. However,
> >>> >> none of the standby nodes gets promoted to the new primary. As a result,
> >>> >> there is no primary node any more after the master watchdog resigns.
> >>> >>
> >>> >> This is not very good because users cannot
> >>> >
> >>> >
> >>> > This is not the expected behavior at all. Can you please share the logs
> >>> > or the steps to reproduce the issue?
> >>> > I will look into this as a priority.
> >>>
> >>> Sure. Pgpool.logs attached.
> >>>
> >>> Here are the steps to reproduce the issue (condensed into a single script at the end of this post).
> >>>
> >>> - Pgpool-II master branch head with patch [pgpool-hackers: 3361].
> >>>
> >>> 1) create a 3-watchdog/3-PostgreSQL cluster: watchdog_setup -wn 3 -n 3
> >>>
> >>> $ psql -p 50000 -c "show pool_nodes" test
> >>>
> >>>  node_id | hostname | port  | status | lb_weight | role    | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
> >>> ---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
> >>>  0       | /tmp     | 51000 | up     | 0.333333  | primary | 0          | true              | 0                 |                   |                        | 2019-08-02 20:38:25
> >>>  1       | /tmp     | 51001 | up     | 0.333333  | standby | 0          | false             | 0                 | streaming         | async                  | 2019-08-02 20:38:25
> >>>  2       | /tmp     | 51002 | up     | 0.333333  | standby | 0          | false             | 0                 | streaming         | async                  | 2019-08-02 20:38:25
> >>> (3 rows)
> >>>
> >>> $ pcp_watchdog_info -p 50001 -v -w
> >>>
> >>> Watchdog Cluster Information
> >>> Total Nodes : 3
> >>> Remote Nodes : 2
> >>> Quorum state : QUORUM EXIST
> >>> Alive Remote Nodes : 2
> >>> VIP up on local node : YES
> >>> Master Node Name : localhost:50000 Linux tishii-CFSV7-1
> >>> Master Host Name : localhost
> >>>
> >>> Watchdog Node Information
> >>> Node Name : localhost:50000 Linux tishii-CFSV7-1
> >>> Host Name : localhost
> >>> Delegate IP : Not_Set
> >>> Pgpool port : 50000
> >>> Watchdog port : 50002
> >>> Node priority : 3
> >>> Status : 4
> >>> Status Name : MASTER
> >>>
> >>> Node Name : localhost:50004 Linux tishii-CFSV7-1
> >>> Host Name : localhost
> >>> Delegate IP : Not_Set
> >>> Pgpool port : 50004
> >>> Watchdog port : 50006
> >>> Node priority : 2
> >>> Status : 7
> >>> Status Name : STANDBY
> >>>
> >>> Node Name : localhost:50008 Linux tishii-CFSV7-1
> >>> Host Name : localhost
> >>> Delegate IP : Not_Set
> >>> Pgpool port : 50008
> >>> Watchdog port : 50010
> >>> Node priority : 1
> >>> Status : 7
> >>> Status Name : STANDBY
> >>>
> >>> 2) echo "0 down" > pgpoo0/log
> >>>
> >>> 3) check the result.
> >>>
> >>> $ psql -p 50000 -c "show pool_nodes" test
> >>>
> >>>  node_id | hostname | port  | status | lb_weight | role    | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
> >>> ---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
> >>>  0       | /tmp     | 51000 | up     | 0.333333  | standby | 0          | true              | 0                 |                   |                        | 2019-08-02 20:39:25
> >>>  1       | /tmp     | 51001 | up     | 0.333333  | standby | 0          | false             | 0                 |                   |                        | 2019-08-02 20:39:25
> >>>  2       | /tmp     | 51002 | up     | 0.333333  | standby | 0          | false             | 0                 |                   |                        | 2019-08-02 20:39:25
> >>> (3 rows)
> >>>
> >>> $ pcp_watchdog_info -p 50001 -v -w
> >>> Watchdog Cluster Information
> >>> Total Nodes : 3
> >>> Remote Nodes : 2
> >>> Quorum state : QUORUM EXIST
> >>> Alive Remote Nodes : 2
> >>> VIP up on local node : NO
> >>> Master Node Name : localhost:50004 Linux tishii-CFSV7-1
> >>> Master Host Name : localhost
> >>>
> >>> Watchdog Node Information
> >>> Node Name : localhost:50000 Linux tishii-CFSV7-1
> >>> Host Name : localhost
> >>> Delegate IP : Not_Set
> >>> Pgpool port : 50000
> >>> Watchdog port : 50002
> >>> Node priority : 3
> >>> Status : 7
> >>> Status Name : STANDBY
> >>>
> >>> Node Name : localhost:50004 Linux tishii-CFSV7-1
> >>> Host Name : localhost
> >>> Delegate IP : Not_Set
> >>> Pgpool port : 50004
> >>> Watchdog port : 50006
> >>> Node priority : 2
> >>> Status : 4
> >>> Status Name : MASTER
> >>>
> >>> Node Name : localhost:50008 Linux tishii-CFSV7-1
> >>> Host Name : localhost
> >>> Delegate IP : Not_Set
> >>> Pgpool port : 50008
> >>> Watchdog port : 50010
> >>> Node priority : 1
> >>> Status : 7
> >>> Status Name : STANDBY
> >>>
>
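
For anyone repeating the test above, the quoted steps condense to roughly the following (a sketch that assumes the layout generated by watchdog_setup, including the pgpool0 working directory and a startall helper script; the "0 down" log trick is specific to these generated test clusters):

    # 1) create and start a 3-watchdog / 3-PostgreSQL test cluster
    watchdog_setup -wn 3 -n 3
    ./startall

    # 2) check the initial state: node 0 should be the primary backend and
    #    the local watchdog node should be MASTER
    psql -p 50000 -c "show pool_nodes" test
    pcp_watchdog_info -p 50001 -v -w

    # 3) put backend node 0 into the down (quarantine) state on the
    #    master watchdog node
    echo "0 down" > pgpool0/log

    # 4) verify: the old watchdog master should have resigned, and the new
    #    master (port 50004 in the runs above) should still report a primary
    pcp_watchdog_info -p 50001 -v -w | grep 'Master Node Name'
    psql -p 50004 -c "show pool_nodes" test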