[pgpool-hackers: 3374] Re: Behavior of resigned master watchdog node

Tatsuo Ishii ishii at sraoss.co.jp
Thu Aug 8 17:16:07 JST 2019


Hi Usama,

Thanks for the commit. Unfortunately the fix did not work for me.

psql -p 50000 -c "show pool_nodes" test
 node_id | hostname | port  |   status   | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change  
---------+----------+-------+------------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0       | /tmp     | 51000 | quarantine | 0.333333  | standby | 0          | false             | 0                 |                   |                        | 2019-08-08 17:09:17
 1       | /tmp     | 51001 | up         | 0.333333  | standby | 0          | false             | 0                 |                   |                        | 2019-08-08 17:09:00
 2       | /tmp     | 51002 | up         | 0.333333  | standby | 0          | true              | 0                 |                   |                        | 2019-08-08 17:09:00
(3 rows)

There is no primary PostgreSQL node left after the node was quarantined.
Pgpool log attached.
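
As a side note, the missing-primary condition above can be checked mechanically from the "show pool_nodes" output. A minimal sketch (the column layout is assumed to match the transcript above; in practice the text would come from something like `psql -p 50000 -tA -c "show pool_nodes" test`):

```python
# Sketch: detect whether any backend reports role "primary" in
# "show pool_nodes" output. SAMPLE is the quarantined cluster state
# from this report, abbreviated to the first seven columns.

SAMPLE = """\
 0       | /tmp     | 51000 | quarantine | 0.333333  | standby | 0
 1       | /tmp     | 51001 | up         | 0.333333  | standby | 0
 2       | /tmp     | 51002 | up         | 0.333333  | standby | 0
"""

def roles(pool_nodes_text):
    """Extract the role column (6th field) from each data row."""
    out = []
    for line in pool_nodes_text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) >= 6:
            out.append(fields[5])
    return out

def has_primary(pool_nodes_text):
    """True if at least one backend is in the primary role."""
    return "primary" in roles(pool_nodes_text)

print("primary present:", has_primary(SAMPLE))  # → primary present: False
```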

From: Muhammad Usama <m.usama at gmail.com>
Subject: Re: Behavior of resigned master watchdog node
Date: Thu, 8 Aug 2019 11:21:24 +0500
Message-ID: <CAEJvTzUpYZHc=T8FvOB12N=gC_Qj961dZEpRZTggsLHz_gJqSQ at mail.gmail.com>

> Hi Ishii-San
> 
> Thanks for providing the log files along with the steps to reproduce the
> issue. I have pushed the fix for it
> 
> https://git.postgresql.org/gitweb/?p=pgpool2.git;a=commitdiff;h=2d7702c6961e8949e97992c23a42c8616af36d84
> 
> Thanks
> Best Regards
> Muhammad Usama
> 
> 
> On Fri, Aug 2, 2019 at 4:50 PM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> 
>> > Hi Ishii-San
>> >
>> > On Fri, Aug 2, 2019 at 10:25 AM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
>> >
>> >> Hi Usama,
>> >>
>> >> In your commit:
>> >>
>> >> https://git.postgresql.org/gitweb/?p=pgpool2.git;a=commit;h=33df0d33df1ce701f07fecaeef5b87a2707c08f2
>> >>
>> >> "the master watchdog node should resign from master responsibilities
>> >> if the primary backend node gets into quarantine state on that."
>> >>
>> >> While testing this feature, I noticed that the master watchdog node
>> >> resigns from master responsibilities as expected in this case. However,
>> >> none of the standby nodes gets promoted to the new primary. As a result,
>> >> there is no primary node any more after the master watchdog resigns.
>> >>
>> >> This is not very good because users cannot
>> >
>> >
>> > This is not at all the expected behavior. Can you please share the logs
>> > or the steps to reproduce the issue?
>> > I will look into this with priority.
>>
>> Sure. Pgpool.logs attached.
>>
>> Here are steps to reproduce the issue.
>>
>> - Pgpool-II master branch head with patch [pgpool-hackers: 3361].
>>
>> 1) create 3 watchdog/3 PostgreSQL cluster: watchdog_setup -wn 3 -n 3
>>
>> $ psql -p 50000 -c "show pool_nodes" test
>>
>>  node_id | hostname | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change  
>> ---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
>>  0       | /tmp     | 51000 | up     | 0.333333  | primary | 0          | true              | 0                 |                   |                        | 2019-08-02 20:38:25
>>  1       | /tmp     | 51001 | up     | 0.333333  | standby | 0          | false             | 0                 | streaming         | async                  | 2019-08-02 20:38:25
>>  2       | /tmp     | 51002 | up     | 0.333333  | standby | 0          | false             | 0                 | streaming         | async                  | 2019-08-02 20:38:25
>> (3 rows)
>>
>> $ pcp_watchdog_info -p 50001 -v -w
>>
>> Watchdog Cluster Information
>> Total Nodes          : 3
>> Remote Nodes         : 2
>> Quorum state         : QUORUM EXIST
>> Alive Remote Nodes   : 2
>> VIP up on local node : YES
>> Master Node Name     : localhost:50000 Linux tishii-CFSV7-1
>> Master Host Name     : localhost
>>
>> Watchdog Node Information
>> Node Name      : localhost:50000 Linux tishii-CFSV7-1
>> Host Name      : localhost
>> Delegate IP    : Not_Set
>> Pgpool port    : 50000
>> Watchdog port  : 50002
>> Node priority  : 3
>> Status         : 4
>> Status Name    : MASTER
>>
>> Node Name      : localhost:50004 Linux tishii-CFSV7-1
>> Host Name      : localhost
>> Delegate IP    : Not_Set
>> Pgpool port    : 50004
>> Watchdog port  : 50006
>> Node priority  : 2
>> Status         : 7
>> Status Name    : STANDBY
>>
>> Node Name      : localhost:50008 Linux tishii-CFSV7-1
>> Host Name      : localhost
>> Delegate IP    : Not_Set
>> Pgpool port    : 50008
>> Watchdog port  : 50010
>> Node priority  : 1
>> Status         : 7
>> Status Name    : STANDBY
>>
>> 2) echo "0      down" > pgpool0/log
>>
>> 3) check the result.
>>
>> $ psql -p 50000 -c "show pool_nodes" test
>>
>>  node_id | hostname | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change  
>> ---------+----------+-------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
>>  0       | /tmp     | 51000 | up     | 0.333333  | standby | 0          | true              | 0                 |                   |                        | 2019-08-02 20:39:25
>>  1       | /tmp     | 51001 | up     | 0.333333  | standby | 0          | false             | 0                 |                   |                        | 2019-08-02 20:39:25
>>  2       | /tmp     | 51002 | up     | 0.333333  | standby | 0          | false             | 0                 |                   |                        | 2019-08-02 20:39:25
>> (3 rows)
>>
>> $ pcp_watchdog_info -p 50001 -v -w
>> Watchdog Cluster Information
>> Total Nodes          : 3
>> Remote Nodes         : 2
>> Quorum state         : QUORUM EXIST
>> Alive Remote Nodes   : 2
>> VIP up on local node : NO
>> Master Node Name     : localhost:50004 Linux tishii-CFSV7-1
>> Master Host Name     : localhost
>>
>> Watchdog Node Information
>> Node Name      : localhost:50000 Linux tishii-CFSV7-1
>> Host Name      : localhost
>> Delegate IP    : Not_Set
>> Pgpool port    : 50000
>> Watchdog port  : 50002
>> Node priority  : 3
>> Status         : 7
>> Status Name    : STANDBY
>>
>> Node Name      : localhost:50004 Linux tishii-CFSV7-1
>> Host Name      : localhost
>> Delegate IP    : Not_Set
>> Pgpool port    : 50004
>> Watchdog port  : 50006
>> Node priority  : 2
>> Status         : 4
>> Status Name    : MASTER
>>
>> Node Name      : localhost:50008 Linux tishii-CFSV7-1
>> Host Name      : localhost
>> Delegate IP    : Not_Set
>> Pgpool port    : 50008
>> Watchdog port  : 50010
>> Node priority  : 1
>> Status         : 7
>> Status Name    : STANDBY
>>
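
For reference, the `pcp_watchdog_info -v` transcripts above can also be checked programmatically to see which watchdog node currently holds the MASTER role. A minimal sketch (the "Node Name" / "Status Name" field labels are taken from the output shown above; SAMPLE is the post-failover state from step 3, abbreviated to the relevant fields):

```python
# Sketch: find which watchdog node(s) report "MASTER" in
# `pcp_watchdog_info -v` style output.

SAMPLE = """\
Node Name      : localhost:50000 Linux tishii-CFSV7-1
Status Name    : STANDBY

Node Name      : localhost:50004 Linux tishii-CFSV7-1
Status Name    : MASTER

Node Name      : localhost:50008 Linux tishii-CFSV7-1
Status Name    : STANDBY
"""

def master_nodes(text):
    """Return the node names whose Status Name is MASTER."""
    masters, current = [], None
    for line in text.splitlines():
        if ":" not in line:
            continue
        # Split on the first colon only; node names contain colons too.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Node Name":
            current = value
        elif key == "Status Name" and value == "MASTER":
            masters.append(current)
    return masters

print(master_nodes(SAMPLE))  # the node on pgpool port 50004
```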
-------------- next part --------------
A non-text attachment was scrubbed...
Name: log-1.tar.gz
Type: application/octet-stream
Size: 11091 bytes
Desc: not available
URL: <http://www.sraoss.jp/pipermail/pgpool-hackers/attachments/20190808/97b9bc36/attachment-0001.obj>
