[pgpool-hackers: 3776] Re: Proposal: Simplify WATCHDOG host configurations.
Bo Peng
pengbo at sraoss.co.jp
Tue Aug 11 11:03:58 JST 2020
Hi Usama,
On Thu, 6 Aug 2020 12:36:39 +0500
Muhammad Usama <m.usama at gmail.com> wrote:
> On Thu, Aug 6, 2020 at 11:24 AM Bo Peng <pengbo at sraoss.co.jp> wrote:
>
> > Hi Usama,
> >
> > On Wed, 29 Jul 2020 15:14:16 +0500
> > Muhammad Usama <m.usama at gmail.com> wrote:
> >
> > > On Wed, Jul 29, 2020 at 1:42 PM Bo Peng <pengbo at sraoss.co.jp> wrote:
> > >
> > > > Hi Ishii-san,
> > > >
> > > > Thank you for your comments.
> > > > I have fixed warnings and removed Makefile.in.
> > > > Modified patch is attached.
> > > >
> > > > Usama,
> > > >
> > > > Do you have any comments?
> > > >
> > >
> > > First of all thanks for the good work. I think this patch will
> > > drastically improve the watchdog setup experience.
> > > Overall the patch looks good while I have a couple of comments that can
> > > also be taken care
> > > of as a separate patch after committing this one.
> > >
> > > -- You have rightly removed wd_port key from config JSON (in
> > > get_pool_config_from_json() and get_pool_config_json() functions)
> > > So I think this will cause incompatibility with the older versions. So we
> > > may need to bump the
> > > WD_MESSAGE_DATA_VERSION_MINOR version.
> >
> > To prevent incompatibility I think it's better to leave
> > "wd_port" in POOL_CONFIG struct (pool_config).
> >
> > And also I will leave "wd_remote_nodes" in POOL_CONFIG struct,
> > because it is easy to compare the number of remote pgpool nodes.
> >
>
> I think we can live with minor incompatibility if it is absolutely necessary
> since this feature will be part of the next major release of Pgpool-II.
> So if leaving the wd_port and wd_remote_nodes in POOL_CONFIG struct
> compromises the design in any way then, in my opinion, you should not worry
> about the incompatibility and stick with the current design and just
> increment the WD_MESSAGE_DATA_VERSION_MINOR
Thank you for your comments.
I have committed the patch including the following changes:
- increase WD_MESSAGE_DATA_VERSION_MINOR to "2"
- remove WatchdogNode->private_id and use pgpool_node_id as a cluster-wide unique id for each node
commit:
https://git.postgresql.org/gitweb/?p=pgpool2.git;a=commit;h=a840ecb2c0aa15448f7fc63a7b0bb46949a9c24f
As we talked at a previous meeting, are you going to do further work
on "heartbeat_device" paramater?
> Thanks
> Best regards
> Muhammad Usama
>
>
>
> > > -- currently, watchdog uses the WatchdogNode->private_id to identify each
> > > watchdog
> > > node locally while private_id of local node is always set to 0.
> > > And now with this patch, the node number for each watchdog node can be
> > > consistent across the cluster so we can make this private_id go away and
> > > use pgpool_node_id
> > > as a cluster-wide unique id for each node.
> > > For that, I think we can also include pgpool_node_id in the
> > > WD_ADD_NODE_MESSAGE
> > > message and upon receiving WD_ADD_NODE_MESSAGE watchdog must also
> > > verify that multiple pgpool-II nodes should not be using the same
> > > pgpool_node_id.
> >
> > Yes. I agree to use pgpool_node_id.
> > I will add the implementation.
>
>
> > > Thanks
> > > Best regards
> > > Muhammad Usama
> > >
> > >
> > >
> > >
> > > > On Tue, 28 Jul 2020 14:13:14 +0900 (JST)
> > > > Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> > > >
> > > > > Hi Peng,
> > > > >
> > > > > Thank you for the patch (that must be a hard work). I have applied
> > the
> > > > > patch to the master branch head and here are some comments:
> > > > >
> > > > > 1. The patch should not include Makefile.in. In the development stage
> > > > > each developer uses their own environment, thus generated files by
> > > > > autoconf may vary. This prevented me from applying the patch and I
> > had
> > > > > to remove Makefile.in patch part.
> > > > >
> > > > > 2. I saw some warnings from gcc.
> > > > >
> > > > >
> > > >
> > -------------------------------------------------------------------------------------------------------------
> > > > > config/pool_config_variables.c: In function ‘SetHBDestIfFunc’:
> > > > > config/pool_config_variables.c:4945:5: warning: this ‘if’ clause does
> > > > not guard... [-Wmisleading-indentation]
> > > > > if (g_pool_config.wd_lifecheck_method != LIFECHECK_BY_HB)
> > > > > ^~
> > > > > config/pool_config_variables.c:4955:2: note: ...this statement, but
> > the
> > > > latter is misleadingly indented as if it were guarded by the ‘if’
> > > > > for (i = 0; i < WD_MAX_IF_NUM; i++)
> > > > > ^~~
> > > > > pool_config_variables.c: In function ‘SetHBDestIfFunc’:
> > > > > pool_config_variables.c:4945:5: warning: this ‘if’ clause does not
> > > > guard... [-Wmisleading-indentation]
> > > > > if (g_pool_config.wd_lifecheck_method != LIFECHECK_BY_HB)
> > > > > ^~
> > > > > pool_config_variables.c:4955:2: note: ...this statement, but the
> > latter
> > > > is misleadingly indented as if it were guarded by the ‘if’
> > > > > for (i = 0; i < WD_MAX_IF_NUM; i++)
> > > > > ^~~
> > > > > pool_config_variables.c: In function ‘SetHBDestIfFunc’:
> > > > > pool_config_variables.c:4945:5: warning: this ‘if’ clause does not
> > > > guard... [-Wmisleading-indentation]
> > > > > if (g_pool_config.wd_lifecheck_method != LIFECHECK_BY_HB)
> > > > > ^~
> > > > > pool_config_variables.c:4955:2: note: ...this statement, but the
> > latter
> > > > is misleadingly indented as if it were guarded by the ‘if’
> > > > > for (i = 0; i < WD_MAX_IF_NUM; i++)
> > > > > ^~~
> > > > > pool_config_variables.c: In function ‘SetHBDestIfFunc’:
> > > > > pool_config_variables.c:4945:5: warning: this ‘if’ clause does not
> > > > guard... [-Wmisleading-indentation]
> > > > > if (g_pool_config.wd_lifecheck_method != LIFECHECK_BY_HB)
> > > > > ^~
> > > > > pool_config_variables.c:4955:2: note: ...this statement, but the
> > latter
> > > > is misleadingly indented as if it were guarded by the ‘if’
> > > > > for (i = 0; i < WD_MAX_IF_NUM; i++)
> > > > > ^~~
> > > > >
> > > >
> > -------------------------------------------------------------------------------------------------------------
> > > > >
> > > > > 3. I confirmed that standard regression tests were all ok on my
> > > > > Ubuntu18 box.
> > > > >
> > > > > Other than the warnings above and Makefile.in issue, this patch looks
> > > > > in good shape for me.
> > > > >
> > > > > Best regards,
> > > > > --
> > > > > Tatsuo Ishii
> > > > > SRA OSS, Inc. Japan
> > > > > English: http://www.sraoss.co.jp/index_en.php
> > > > > Japanese:http://www.sraoss.co.jp
> > > > >
> > > > > > hello,
> > > > > >
> > > > > > I have completed this patch.
> > > > > > (The configuration example docs need to be updated.)
> > > > > >
> > > > > > This patch can simplify WATCHDOG configurations
> > > > > > by using a common configuration file for each pgpool node,
> > > > > > and users just copy the configuration file to each other node
> > without
> > > > editing.
> > > > > >
> > > > > > =======
> > > > > > Changes
> > > > > > =======
> > > > > >
> > > > > > The current watchdog and heartbeat configuration parameters (for 3
> > > > pgpool nodes)
> > > > > >
> > > > > > ----
> > > > > > wd_hostname
> > > > > > wd_port
> > > > > > wd_heartbeat_port
> > > > > > heartbeat_destination0
> > > > > > heartbeat_destination_port0
> > > > > > heartbeat_destination1
> > > > > > heartbeat_destination_port1
> > > > > > other_pgpool_hostname0
> > > > > > other_pgpool_port0
> > > > > > other_pgpool_hostname1
> > > > > > other_pgpool_port1
> > > > > > ----
> > > > > >
> > > > > > are changed to
> > > > > >
> > > > > > ----
> > > > > > hostname0 = 'server1'
> > > > > > wd_port0 = 9000
> > > > > > pgpool_port0 = 9999
> > > > > >
> > > > > > hostname1 = 'server2'
> > > > > > wd_port1 = 9000
> > > > > > pgpool_port1 = 9999
> > > > > >
> > > > > > hostname2 = 'server3'
> > > > > > wd_port2 = 9000
> > > > > > pgpool_port2 = 9999
> > > > > >
> > > > > > heartbeat_hostname0 = 'server1'
> > > > > > heartbeat_port0 = 9694
> > > > > > heartbeat_device0 = ''
> > > > > >
> > > > > > heartbeat_hostname1 = 'server2'
> > > > > > heartbeat_port1 = 9694
> > > > > > heartbeat_device1 = ''
> > > > > >
> > > > > > heartbeat_hostname2 = 'server3'
> > > > > > heartbeat_port2 = 9694
> > > > > > heartbeat_device2 = ''
> > > > > > ----
> > > > > >
> > > > > > You can specify multiple configurations in
> > > > > > "heartbeat_hostname" and "heartbeat_device" by separating
> > > > > > them using semicolon (;).
> > > > > >
> > > > > > And user needs to specify local pgpool node id in a pgpool_node_id
> > > > file
> > > > > > which is created in the direcroty of pgpool.conf.
> > > > > >
> > > > > > For example:
> > > > > >
> > > > > > (pgpool node 0)
> > > > > > $ cat <directory path of pgpool.conf>/pgpool_node_id
> > > > > > 0
> > > > > >
> > > > > > (pgpool node 1)
> > > > > > $ cat <directory path of pgpool.conf>/pgpool_node_id
> > > > > > 1
> > > > > >
> > > > > > (pgpool node 2)
> > > > > > $ cat <directory path of pgpool.conf>/pgpool_node_id
> > > > > > 2
> > > > > >
> > > > > >
> > > > > > Any feedback about this patch?
> > > > > >
> > > > > >
> > > > > > On Thu, 25 Jun 2020 11:17:13 +0900 (JST)
> > > > > > Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> > > > > >
> > > > > >> > Ishii-san, Usama,
> > > > > >> >
> > > > > >> > Thank you for your advice.
> > > > > >> > I have some questions about heartbeat related parameters.
> > > > > >> >
> > > > > >> > I would like to change the watchdog related parameters as
> > follow:
> > > > > >> >
> > > > > >> > ----------------------
> > > > > >> > wd_hostname0 =
> > > > > >> > pgpool_port0 =
> > > > > >> > wd_port0 =
> > > > > >> >
> > > > > >> > wd_hostname1 =
> > > > > >> > pgpool_port1 =
> > > > > >> > wd_port1 =
> > > > > >> > .
> > > > > >> > .
> > > > > >> > .
> > > > > >> >
> > > > > >> > heartbeat_hostname0 =
> > > > > >> > heartbeat_port0 =
> > > > > >> > heartbeat_device0 =
> > > > > >> >
> > > > > >> > heartbeat_hostname1 =
> > > > > >> > heartbeat_port1 =
> > > > > >> > heartbeat_device1 =
> > > > > >> > ----------------------
> > > > > >> >
> > > > > >> > However, during the implementation, I noticed more than one
> > network
> > > > interface
> > > > > >> > can be specified to send/receive heartbeat signal in Pgpool-II.
> > > > > >> >
> > > > > >> > Because only one "wd_heartbeat_port" can be specified,
> > > > > >> > "heartbeat_destination_portX" should be specified with the same
> > > > port number?
> > > > > >> >
> > > > > >> >
> > > > > >> > For example: 3 pgpool node, heartbeat with 2 NIC
> > > > > >> >
> > > > > >> > Should we configure heartbeat paramaters as below?
> > > > > >> >
> > > > > >> > ---------------
> > > > > >> > wd_heartbeat_port = 9694
> > > > > >> >
> > > > > >> > heartbeat_destination0 = "192.168.37.102"
> > > > > >> > heartbeat_destination1 = '192.168.54.102'
> > > > > >> > heartbeat_destination_port0 = 9694
> > > > > >> > heartbeat_destination_port1 = 9694
> > > > > >> > heartbeat_device0 = 'eth0'
> > > > > >> > heartbeat_device1 = 'eth1'
> > > > > >> >
> > > > > >> > heartbeat_destination2 = '192.168.37.103'
> > > > > >> > heartbeat_destination3 = '192.168.54.103'
> > > > > >> > heartbeat_destination_port2 = 9694
> > > > > >> > heartbeat_destination_port3 = 9694
> > > > > >> > heartbeat_device2 = 'eth0'
> > > > > >> > heartbeat_device3 = 'eth1'
> > > > > >> > ---------------
> > > > > >> >
> > > > > >> >
> > > > > >> > If above is correct, I would like to design the heartbeat
> > > > paramaters as below:
> > > > > >> > If it is the own node settings, "heartbeat_hostname" and
> > > > "heartbeat_device" will be ignored.
> > > > > >> >
> > > > > >> > ---------------
> > > > > >> > heartbeat_hostname0 = '192.168.37.101:192.168.54.101'
> > > > > >> > heartbeat_port0 = 9694
> > > > > >> > heartbeat_device0 = 'eth0:eth1'
> > > > > >> >
> > > > > >> > heartbeat_hostname1 = '192.168.37.102:192.168.54.102'
> > > > > >> > heartbeat_port1 = 9694
> > > > > >> > heartbeat_device1 = 'eth0:eth1'
> > > > > >> >
> > > > > >> > heartbeat_hostname2 = '192.168.37.103:192.168.54.103'
> > > > > >> > heartbeat_port2 = 9694
> > > > > >> > heartbeat_device2 = 'eth0:eth1'
> > > > > >> > ---------------
> > > > > >> >
> > > > > >> > What do you think?
> > > > > >>
> > > > > >> Above looks good to me.
> > > > > >>
> > > > > >> BTW, if the IP addresses are on the same subnet, it is not enough
> > to
> > > > > >> specify heartbeat_device. You also need to tweak Linux kernel
> > > > > >> parameters
> > > > > >> (/proc/sys/net/ipv4/conf/***/{arp_announce,arp_ignore). Maybe we
> > > > > >> should note this in the manual.
> > > > > >>
> > > > > >> I have learned this from following blog (in Japanese).
> > > > > >>
> > > > > >> http://www.nminoru.jp/~nminoru/diary/2014/02.html#20140203p1
> > > > > >>
> > > > > >> Best regards,
> > > > > >> --
> > > > > >> Tatsuo Ishii
> > > > > >> SRA OSS, Inc. Japan
> > > > > >> English: http://www.sraoss.co.jp/index_en.php
> > > > > >> Japanese:http://www.sraoss.co.jp
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Bo Peng <pengbo at sraoss.co.jp>
> > > > > > SRA OSS, Inc. Japan
> > > >
> > > >
> > > > --
> > > > Bo Peng <pengbo at sraoss.co.jp>
> > > > SRA OSS, Inc. Japan
> > > >
> >
> >
> > --
> > Bo Peng <pengbo at sraoss.co.jp>
> > SRA OSS, Inc. Japan
> >
--
Bo Peng <pengbo at sraoss.co.jp>
SRA OSS, Inc. Japan
More information about the pgpool-hackers
mailing list