[pgpool-hackers: 2533] Re: New Feature with patch: Quorum and Consensus for backend failover
Tatsuo Ishii
ishii at sraoss.co.jp
Wed Sep 13 07:51:03 JST 2017
> Hi Ishii-San
>
> On Mon, Sep 11, 2017 at 11:01 AM, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
>
>> Usama,
>>
>> I have modified watchdog regression script to test out your quorum
>> aware failover patch. Here are differences from the existing script.
>>
>> - Install streaming replication primary and standby DB node (before
>> raw mode + only 1 node). This is necessary to test an ordinary
>> failover scenario. Current script creates only one DB node. So if we
>> get the node down, the cluster goes into "all db node down"
>> status. I already reported possible problem with the status up
>> thread and waiting for the answer from Usama.
>>
>> - Add one more pgpool-II node "standby2". For this purpose, new
>> configuration file "standby2.conf" added.
>>
>> - Add new test scenario: "fake" failover. By using the infrastructure
>> I have created to simulate the communication path between standby2
>> and DB node 1. The test checks if such a error raised a failover
>> request from standby2, but it is safely ignored.
>>
>> - Add new test scenario: "real" failover. Shutting down DB node 1,
>> which should raise a failover request.
>>
>> - Modify test.sh to agree with the changes above.
>>
>
>
> Thanks for testing and test script patch. I think the changes and test
> scenario is spot on but I think we should add a new test case
> with these modifications and keep the 004_watchdog test case in the same
> shape as-well, This will help us to identify watchdog issues more
> swiftly.
Agreed. Once your quorum patch committed, I will add the test as
011.watchdog_quorum_failover. (if you have a better idea for the test
name, please let me know).
>> Since the modification requires your quorum patch committed, I haven't
>> push the change yet.
>>
>
> I was worried about the scenario mentioned above and now identified it as
> an existing issue, So will commit the patch tomorrow.
Ok.
> Thanks
> Best regardsd
> Muhammad Usama
>
>
>> Best regards,
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese:http://www.sraoss.co.jp
>>
>> > I have tested the patch a little bit using 004 watchdog regression
>> > test. After the test ends, I manually started master and standby
>> > Pgpool-II.
>> >
>> > 1) Stop master PostgreSQL. Since there's only one PostgreSQL is
>> > configured, I expected:
>> >
>> > psql: ERROR: pgpool is not accepting any new connections
>> > DETAIL: all backend nodes are down, pgpool requires at least one valid
>> node
>> > HINT: repair the backend nodes and restart pgpool
>> >
>> > but master Pgpool-II replies:
>> >
>> > psql: FATAL: failed to create a backend connection
>> > DETAIL: executing failover on backend
>> >
>> > Is this normal?
>> >
>> > 2) I shutdown the master node to see if the standby escalates.
>> >
>> > After shutting down the master, I see this using pcp_watchdog_info:
>> >
>> > pcp_watchdog_info -p 11105
>> > localhost:11100 Linux tishii-CF-SX3HE4BP localhost 11100 21104 4 MASTER
>> > localhost:11000 Linux tishii-CF-SX3HE4BP localhost 11000 21004 10
>> SHUTDOWN
>> >
>> > Seems ok but I want to confirm.
>> >
>> > master and standby pgpool logs attached.
>> >
>> > Best regards,
>> > --
>> > Tatsuo Ishii
>> > SRA OSS, Inc. Japan
>> > English: http://www.sraoss.co.jp/index_en.php
>> > Japanese:http://www.sraoss.co.jp
>> >
>> >> On Fri, Aug 25, 2017 at 5:05 PM, Tatsuo Ishii <ishii at sraoss.co.jp>
>> wrote:
>> >>
>> >>> > On Fri, Aug 25, 2017 at 12:53 PM, Tatsuo Ishii <ishii at sraoss.co.jp>
>> >>> wrote:
>> >>> >
>> >>> >> Usama,
>> >>> >>
>> >>> >> With the new patch, the regression tests all passed.
>> >>> >>
>> >>> >
>> >>> > Glad to hear that :-)
>> >>> > Did you had a chance to look at the node quarantine state I added.
>> What
>> >>> are
>> >>> > your thoughts on that ?
>> >>>
>> >>> I'm going to look into the patch this weekend.
>> >>>
>> >>
>> >> Many thanks
>> >>
>> >> Best Regards
>> >> Muhammad Usama
>> >>
>> >>>
>> >>> Best regards,
>> >>> --
>> >>> Tatsuo Ishii
>> >>> SRA OSS, Inc. Japan
>> >>> English: http://www.sraoss.co.jp/index_en.php
>> >>> Japanese:http://www.sraoss.co.jp
>> >>>
>> >>> >> > Hi Ishii-San
>> >>> >> >
>> >>> >> > Please fine the updated patch, It fixes the regression issue you
>> were
>> >>> >> > facing and also another bug which I encountered during my testing.
>> >>> >> >
>> >>> >> > -- Adding Yugo to the thread,
>> >>> >> > Hi Yugo,
>> >>> >> >
>> >>> >> > Since you are an expert of watchdog feature, So I thought you
>> might
>> >>> have
>> >>> >> > something to say especially regarding the discussion points
>> mentioned
>> >>> in
>> >>> >> > the initial mail.
>> >>> >> >
>> >>> >> >
>> >>> >> > Thanks
>> >>> >> > Best Regards
>> >>> >> > Muhammad Usama
>> >>> >> >
>> >>> >> >
>> >>> >> > On Thu, Aug 24, 2017 at 11:25 AM, Muhammad Usama <
>> m.usama at gmail.com>
>> >>> >> wrote:
>> >>> >> >
>> >>> >> >>
>> >>> >> >>
>> >>> >> >> On Thu, Aug 24, 2017 at 4:34 AM, Tatsuo Ishii <
>> ishii at sraoss.co.jp>
>> >>> >> wrote:
>> >>> >> >>
>> >>> >> >>> After applying the patch, many of regression tests fail. It
>> seems
>> >>> >> >>> pgpool.conf.sample has bogus comment which causes the
>> pgpool.conf
>> >>> >> >>> parser to complain parse error.
>> >>> >> >>>
>> >>> >> >>> 2017-08-24 08:22:36: pid 6017: FATAL: syntex error in
>> configuration
>> >>> >> file
>> >>> >> >>> "/home/t-ishii/work/pgpool-II/current/pgpool2/src/test/regre
>> >>> >> >>> ssion/tests/004.watchdog/standby/etc/pgpool.conf"
>> >>> >> >>> 2017-08-24 08:22:36: pid 6017: DETAIL: parse error at line 568
>> '*'
>> >>> >> token
>> >>> >> >>> = 8
>> >>> >> >>>
>> >>> >> >>
>> >>> >> >> Really sorry, Somehow I overlooked the sample config file
>> changes I
>> >>> made
>> >>> >> >> at the last minute.
>> >>> >> >> Will send you the updated version.
>> >>> >> >>
>> >>> >> >> Thanks
>> >>> >> >> Best Regards
>> >>> >> >> Muhammad Usama
>> >>> >> >>
>> >>> >> >>>
>> >>> >> >>> Best regards,
>> >>> >> >>> --
>> >>> >> >>> Tatsuo Ishii
>> >>> >> >>> SRA OSS, Inc. Japan
>> >>> >> >>> English: http://www.sraoss.co.jp/index_en.php
>> >>> >> >>> Japanese:http://www.sraoss.co.jp
>> >>> >> >>>
>> >>> >> >>> > Usama,
>> >>> >> >>> >
>> >>> >> >>> > Thanks for the patch. I am going to review it.
>> >>> >> >>> >
>> >>> >> >>> > In the mean time when I apply your patch, I got some trailing
>> >>> >> >>> > whitespace errors. Can you please fix them?
>> >>> >> >>> >
>> >>> >> >>> > /home/t-ishii/quorum_aware_failover.diff:470: trailing
>> >>> whitespace.
>> >>> >> >>> >
>> >>> >> >>> > /home/t-ishii/quorum_aware_failover.diff:485: trailing
>> >>> whitespace.
>> >>> >> >>> >
>> >>> >> >>> > /home/t-ishii/quorum_aware_failover.diff:564: trailing
>> >>> whitespace.
>> >>> >> >>> >
>> >>> >> >>> > /home/t-ishii/quorum_aware_failover.diff:1428: trailing
>> >>> whitespace.
>> >>> >> >>> >
>> >>> >> >>> > /home/t-ishii/quorum_aware_failover.diff:1450: trailing
>> >>> whitespace.
>> >>> >> >>> >
>> >>> >> >>> > warning: squelched 3 whitespace errors
>> >>> >> >>> > warning: 8 lines add whitespace errors.
>> >>> >> >>> >
>> >>> >> >>> > Best regards,
>> >>> >> >>> > --
>> >>> >> >>> > Tatsuo Ishii
>> >>> >> >>> > SRA OSS, Inc. Japan
>> >>> >> >>> > English: http://www.sraoss.co.jp/index_en.php
>> >>> >> >>> > Japanese:http://www.sraoss.co.jp
>> >>> >> >>> >
>> >>> >> >>> >> Hi
>> >>> >> >>> >>
>> >>> >> >>> >> I was working on the new feature to make the backend node
>> >>> failover
>> >>> >> >>> quorum
>> >>> >> >>> >> aware and on the half way through the implementation I also
>> added
>> >>> >> the
>> >>> >> >>> >> majority consensus feature for the same.
>> >>> >> >>> >>
>> >>> >> >>> >> So please find the first version of the patch for review that
>> >>> makes
>> >>> >> the
>> >>> >> >>> >> backend node failover consider the watchdog cluster quorum
>> status
>> >>> >> and
>> >>> >> >>> seek
>> >>> >> >>> >> the majority consensus before performing failover.
>> >>> >> >>> >>
>> >>> >> >>> >> *Changes in the Failover mechanism with watchdog.*
>> >>> >> >>> >> For this new feature I have modified the Pgpool-II's existing
>> >>> >> failover
>> >>> >> >>> >> mechanism with watchdog.
>> >>> >> >>> >> Previously as you know when the Pgpool-II require to perform
>> a
>> >>> node
>> >>> >> >>> >> operation (failover, failback, promote-node) with the
>> watchdog.
>> >>> The
>> >>> >> >>> >> watchdog used to propagated the failover request to all the
>> >>> >> Pgpool-II
>> >>> >> >>> nodes
>> >>> >> >>> >> in the watchdog cluster and as soon as the request was
>> received
>> >>> by
>> >>> >> the
>> >>> >> >>> >> node, it used to initiate the local failover and that
>> failover
>> >>> was
>> >>> >> >>> >> synchronised on all nodes using the distributed locks.
>> >>> >> >>> >>
>> >>> >> >>> >> *Now Only the Master node performs the failover.*
>> >>> >> >>> >> The attached patch changes the mechanism of synchronised
>> >>> failover,
>> >>> >> and
>> >>> >> >>> now
>> >>> >> >>> >> only the Pgpool-II of master watchdog node performs the
>> failover,
>> >>> >> and
>> >>> >> >>> all
>> >>> >> >>> >> other standby nodes sync the backend statuses after the
>> master
>> >>> >> >>> Pgpool-II is
>> >>> >> >>> >> finished with the failover.
>> >>> >> >>> >>
>> >>> >> >>> >> *Overview of new failover mechanism.*
>> >>> >> >>> >> -- If the failover request is received to the standby
>> watchdog
>> >>> >> >>> node(from
>> >>> >> >>> >> local Pgpool-II), That request is forwarded to the master
>> >>> watchdog
>> >>> >> and
>> >>> >> >>> the
>> >>> >> >>> >> Pgpool-II main process is returned with the
>> >>> >> FAILOVER_RES_WILL_BE_DONE
>> >>> >> >>> >> return code. And upon receiving the FAILOVER_RES_WILL_BE_DONE
>> >>> from
>> >>> >> the
>> >>> >> >>> >> watchdog for the failover request the requesting Pgpool-II
>> moves
>> >>> >> >>> forward
>> >>> >> >>> >> without doing anything further for the particular failover
>> >>> command.
>> >>> >> >>> >>
>> >>> >> >>> >> -- Now when the failover request from standby node is
>> received by
>> >>> >> the
>> >>> >> >>> >> master watchdog, after performing the validation, applying
>> the
>> >>> >> >>> consensus
>> >>> >> >>> >> rules the failover request is triggered on the local
>> Pgpool-II .
>> >>> >> >>> >>
>> >>> >> >>> >> -- When the failover request is received to the master
>> watchdog
>> >>> node
>> >>> >> >>> from
>> >>> >> >>> >> the local Pgpool-II (On the IPC channel) the watchdog process
>> >>> inform
>> >>> >> >>> the
>> >>> >> >>> >> Pgpool-II requesting process to proceed with failover
>> (provided
>> >>> all
>> >>> >> >>> >> failover rules are satisfied).
>> >>> >> >>> >>
>> >>> >> >>> >> -- After the failover is finished on the master Pgpool-II,
>> the
>> >>> >> failover
>> >>> >> >>> >> function calls the *wd_failover_end*() which sends the
>> backend
>> >>> sync
>> >>> >> >>> >> required message to all standby watchdogs.
>> >>> >> >>> >>
>> >>> >> >>> >> -- Upon receiving the sync required message from master
>> watchdog
>> >>> >> node
>> >>> >> >>> all
>> >>> >> >>> >> Pgpool-II sync the new statuses of each backend node from the
>> >>> master
>> >>> >> >>> >> watchdog.
>> >>> >> >>> >>
>> >>> >> >>> >> *No More Failover locks*
>> >>> >> >>> >> Since with this new failover mechanism we do not require any
>> >>> >> >>> >> synchronisation and guards against the execution of
>> >>> >> failover_commands
>> >>> >> >>> by
>> >>> >> >>> >> multiple Pgpool-II nodes, So the patch removes all the
>> >>> distributed
>> >>> >> >>> locks
>> >>> >> >>> >> from failover function, This makes the failover simpler and
>> >>> faster.
>> >>> >> >>> >>
>> >>> >> >>> >> *New kind of Failover operation NODE_QUARANTINE_REQUEST*
>> >>> >> >>> >> The patch adds the new kind of backend node operation
>> >>> >> NODE_QUARANTINE
>> >>> >> >>> which
>> >>> >> >>> >> is effectively same as the NODE_DOWN, but with
>> node_quarantine
>> >>> the
>> >>> >> >>> >> failover_command is not triggered.
>> >>> >> >>> >> The NODE_DOWN_REQUEST is automatically converted to the
>> >>> >> >>> >> NODE_QUARANTINE_REQUEST when the failover is requested on the
>> >>> >> backend
>> >>> >> >>> node
>> >>> >> >>> >> but watchdog cluster does not holds the quorum.
>> >>> >> >>> >> This means in the absence of quorum the failed backend nodes
>> are
>> >>> >> >>> >> quarantined and when the quorum becomes available again the
>> >>> >> Pgpool-II
>> >>> >> >>> >> performs the failback operation on all quarantine nodes.
>> >>> >> >>> >> And again when the failback is performed on the quarantine
>> >>> backend
>> >>> >> >>> node the
>> >>> >> >>> >> failover function does not trigger the failback_command.
>> >>> >> >>> >>
>> >>> >> >>> >> *Controlling the Failover behaviour.*
>> >>> >> >>> >> The patch adds three new configuration parameters to
>> configure
>> >>> the
>> >>> >> >>> failover
>> >>> >> >>> >> behaviour from user side.
>> >>> >> >>> >>
>> >>> >> >>> >> *failover_when_quorum_exists*
>> >>> >> >>> >> When enabled the failover command will only be executed when
>> the
>> >>> >> >>> watchdog
>> >>> >> >>> >> cluster holds the quorum. And when the quorum is absent and
>> >>> >> >>> >> failover_when_quorum_exists is enabled the failed backend
>> nodes
>> >>> will
>> >>> >> >>> get
>> >>> >> >>> >> quarantine until the quorum becomes available again.
>> >>> >> >>> >> disabling it will enable the old behaviour of failover
>> commands.
>> >>> >> >>> >>
>> >>> >> >>> >>
>> >>> >> >>> >> *failover_require_consensus*This new configuration parameter
>> >>> can be
>> >>> >> >>> used to
>> >>> >> >>> >> make sure we get the majority vote before performing the
>> >>> failover on
>> >>> >> >>> the
>> >>> >> >>> >> node. When *failover_require_consensus* is enabled then the
>> >>> >> failover is
>> >>> >> >>> >> only performed after receiving the failover request from the
>> >>> >> majority
>> >>> >> >>> or
>> >>> >> >>> >> Pgpool-II nodes.
>> >>> >> >>> >> For example in three nodes cluster the failover will not be
>> >>> >> performed
>> >>> >> >>> until
>> >>> >> >>> >> at least two nodes ask for performing the failover on the
>> >>> particular
>> >>> >> >>> >> backend node.
>> >>> >> >>> >>
>> >>> >> >>> >> It is also worthwhile to mention here that
>> >>> >> *failover_require_consensus*
>> >>> >> >>> >> only works when failover_when_quorum_exists is enables.
>> >>> >> >>> >>
>> >>> >> >>> >>
>> >>> >> >>> >> *enable_multiple_failover_requests_from_node*
>> >>> >> >>> >> This parameter works in connection with
>> >>> *failover_require_consensus*
>> >>> >> >>> >> config. When enabled a single Pgpool-II node can vote for
>> >>> failover
>> >>> >> >>> multiple
>> >>> >> >>> >> times.
>> >>> >> >>> >> For example in the three nodes cluster if one Pgpool-II node
>> >>> sends
>> >>> >> the
>> >>> >> >>> >> failover request of particular node twice that would be
>> counted
>> >>> as
>> >>> >> two
>> >>> >> >>> >> votes in favour of failover and the failover will be
>> performed
>> >>> even
>> >>> >> if
>> >>> >> >>> we
>> >>> >> >>> >> do not get a vote from other two nodes.
>> >>> >> >>> >>
>> >>> >> >>> >> And when *enable_multiple_failover_requests_from_node* is
>> >>> disabled,
>> >>> >> >>> Only
>> >>> >> >>> >> the first vote from each Pgpool-II will be accepted and all
>> other
>> >>> >> >>> >> subsequent votes will be marked duplicate and rejected.
>> >>> >> >>> >> So in that case we will require a majority votes from
>> distinct
>> >>> >> nodes to
>> >>> >> >>> >> execute the failover.
>> >>> >> >>> >> Again this *enable_multiple_failover_requests_from_node*
>> only
>> >>> >> becomes
>> >>> >> >>> >> effective when both *failover_when_quorum_exists* and
>> >>> >> >>> >> *failover_require_consensus* are enabled.
>> >>> >> >>> >>
>> >>> >> >>> >>
>> >>> >> >>> >> *Controlling the failover: The Coding perspective.*
>> >>> >> >>> >> Although the failover functions are made quorum and consensus
>> >>> aware
>> >>> >> but
>> >>> >> >>> >> there is still a way to bypass the quorum conditions, and
>> >>> >> requirement
>> >>> >> >>> of
>> >>> >> >>> >> consensus.
>> >>> >> >>> >>
>> >>> >> >>> >> For this the patch uses the existing request_details flags in
>> >>> >> >>> >> POOL_REQUEST_NODE to control the behaviour of failover.
>> >>> >> >>> >>
>> >>> >> >>> >> Here are the newly added flags values.
>> >>> >> >>> >>
>> >>> >> >>> >> *REQ_DETAIL_WATCHDOG*:
>> >>> >> >>> >> Setting this flag while issuing the failover command will not
>> >>> send
>> >>> >> the
>> >>> >> >>> >> failover request to the watchdog. But this flag may not be
>> >>> useful in
>> >>> >> >>> any
>> >>> >> >>> >> other place than where it is already used.
>> >>> >> >>> >> Mostly this flag can be used to avoid the failover command
>> from
>> >>> >> going
>> >>> >> >>> to
>> >>> >> >>> >> watchdog that is already originated from watchdog. Otherwise
>> we
>> >>> can
>> >>> >> >>> end up
>> >>> >> >>> >> in infinite loop.
>> >>> >> >>> >>
>> >>> >> >>> >> *REQ_DETAIL_CONFIRMED*:
>> >>> >> >>> >> Setting this flag will bypass the
>> *failover_require_consensus*
>> >>> >> >>> >> configuration and immediately perform the failover if quorum
>> is
>> >>> >> >>> present.
>> >>> >> >>> >> This flag can be used to issue the failover request
>> originated
>> >>> from
>> >>> >> PCP
>> >>> >> >>> >> command.
>> >>> >> >>> >>
>> >>> >> >>> >> *REQ_DETAIL_UPDATE*:
>> >>> >> >>> >> This flag is used for the command where we are failing back
>> the
>> >>> >> >>> quarantine
>> >>> >> >>> >> nodes. Setting this flag will not trigger the
>> failback_command.
>> >>> >> >>> >>
>> >>> >> >>> >> *Some conditional flags used:*
>> >>> >> >>> >> I was not sure about the configuration of each type of
>> failover
>> >>> >> >>> operation.
>> >>> >> >>> >> As we have three main failover operations NODE_UP_REQUEST,
>> >>> >> >>> >> NODE_DOWN_REQUEST, and PROMOTE_NODE_REQUEST
>> >>> >> >>> >> So I was thinking do we need to give the configuration
>> option to
>> >>> the
>> >>> >> >>> users,
>> >>> >> >>> >> if they want to enable/disable quorum checking and consensus
>> for
>> >>> >> >>> individual
>> >>> >> >>> >> failover operation type.
>> >>> >> >>> >> For example: is it a practical configuration where a user
>> would
>> >>> >> want to
>> >>> >> >>> >> ensure quorum while preforming NODE_DOWN operation while
>> does not
>> >>> >> want
>> >>> >> >>> it
>> >>> >> >>> >> for NODE_UP.
>> >>> >> >>> >> So in this patch I use three compile time defines to enable
>> >>> disable
>> >>> >> the
>> >>> >> >>> >> individual failover operation, while we can decide on the
>> best
>> >>> >> >>> solution.
>> >>> >> >>> >>
>> >>> >> >>> >> NODE_UP_REQUIRE_CONSENSUS: defining it will enable quorum
>> >>> checking
>> >>> >> >>> feature
>> >>> >> >>> >> for NODE_UP_REQUESTs
>> >>> >> >>> >>
>> >>> >> >>> >> NODE_DOWN_REQUIRE_CONSENSUS: defining it will enable quorum
>> >>> checking
>> >>> >> >>> >> feature for NODE_DOWN_REQUESTs
>> >>> >> >>> >>
>> >>> >> >>> >> NODE_PROMOTE_REQUIRE_CONSENSUS: defining it will enable
>> quorum
>> >>> >> >>> checking
>> >>> >> >>> >> feature for PROMOTE_NODE_REQUESTs
>> >>> >> >>> >>
>> >>> >> >>> >> *Some Point for Discussion:*
>> >>> >> >>> >>
>> >>> >> >>> >> *Do we really need to check ReqInfo->switching flag before
>> >>> enqueuing
>> >>> >> >>> >> failover request.*
>> >>> >> >>> >> While working on the patch I was wondering why do we disallow
>> >>> >> >>> enqueuing the
>> >>> >> >>> >> failover command when the failover is already in progress?
>> For
>> >>> >> example
>> >>> >> >>> in
>> >>> >> >>> >> *pcp_process_command*() function if we see the
>> >>> *Req_info->switching*
>> >>> >> >>> flag
>> >>> >> >>> >> set we bailout with the error instead of enqueuing the
>> command.
>> >>> Is
>> >>> >> is
>> >>> >> >>> >> really necessary?
>> >>> >> >>> >>
>> >>> >> >>> >> *Do we need more granule control over each failover
>> operation:*
>> >>> >> >>> >> As described in section "Some conditional flags used" I want
>> the
>> >>> >> >>> opinion on
>> >>> >> >>> >> do we need configuration parameters in pgpool.conf to enable
>> >>> disable
>> >>> >> >>> quorum
>> >>> >> >>> >> and consensus checking on individual failover types.
>> >>> >> >>> >>
>> >>> >> >>> >> *Which failover should be mark as Confirmed:*
>> >>> >> >>> >> As defined in the above section of REQ_DETAIL_CONFIRMED, We
>> can
>> >>> mark
>> >>> >> >>> the
>> >>> >> >>> >> failover request to not need consensus, currently the
>> requests
>> >>> from
>> >>> >> >>> the PCP
>> >>> >> >>> >> commands are fired with this flag. But I was wondering there
>> may
>> >>> be
>> >>> >> >>> more
>> >>> >> >>> >> places where we many need to use the flag.
>> >>> >> >>> >> For example I currently use the same confirmed flag when
>> >>> failover is
>> >>> >> >>> >> triggered because of *replication_stop_on_mismatch*.
>> >>> >> >>> >>
>> >>> >> >>> >> I think we should think this flag for each place of failover,
>> >>> like
>> >>> >> >>> when the
>> >>> >> >>> >> failover is triggered
>> >>> >> >>> >> because of health_check failure.
>> >>> >> >>> >> because of replication mismatch
>> >>> >> >>> >> because of backend_error
>> >>> >> >>> >> e.t.c
>> >>> >> >>> >>
>> >>> >> >>> >> *Node Quarantine behaviour.*
>> >>> >> >>> >> What do you think about the node quarantine used by this
>> patch.
>> >>> Can
>> >>> >> you
>> >>> >> >>> >> think of some problem which can be caused by this?
>> >>> >> >>> >>
>> >>> >> >>> >> *What should be the default values for each newly added
>> config
>> >>> >> >>> parameters.*
>> >>> >> >>> >>
>> >>> >> >>> >>
>> >>> >> >>> >>
>> >>> >> >>> >> *TODOs*
>> >>> >> >>> >>
>> >>> >> >>> >> -- Updating the documentation is still todo. Will do that
>> once
>> >>> every
>> >>> >> >>> aspect
>> >>> >> >>> >> of the feature will be finalised.
>> >>> >> >>> >> -- Some code warnings and cleanups are still not done.
>> >>> >> >>> >> -- I am still little short on testing
>> >>> >> >>> >> -- Regression test cases for the feature
>> >>> >> >>> >>
>> >>> >> >>> >>
>> >>> >> >>> >> Thoughts and suggestions are most welcome.
>> >>> >> >>> >>
>> >>> >> >>> >> Thanks
>> >>> >> >>> >> Best regards
>> >>> >> >>> >> Muhammad Usama
>> >>> >> >>> > _______________________________________________
>> >>> >> >>> > pgpool-hackers mailing list
>> >>> >> >>> > pgpool-hackers at pgpool.net
>> >>> >> >>> > http://www.pgpool.net/mailman/listinfo/pgpool-hackers
>> >>> >> >>>
>> >>> >> >>
>> >>> >> >>
>> >>> >>
>> >>>
>>
More information about the pgpool-hackers
mailing list