[pgpool-hackers: 3974] Re: [pgpool-general: 7543] VIP with one node

Tatsuo Ishii ishii at sraoss.co.jp
Thu Jul 15 14:42:39 JST 2021


Hi Usama,

I am trying to understand your proposal. Please correct me if I am
wrong. It seems the proposal effectively gives up the concept of
quorum. For example, we start with a 3-node cluster A, B and C. Due
to a network problem, C is separated from A and B, while A and B can
still communicate. After wd_lost_node_to_remove_timeout passes, A and
B become a 2-node cluster with quorum, and C becomes a 1-node cluster
with quorum. So a split-brain occurs.
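
To make the concern concrete, here is the arithmetic both partitions
would do once the timeout removes the unreachable nodes (a simplified
model I wrote for illustration, not the actual watchdog code):

#include <stdio.h>

/* Strict-majority check against the dynamically shrunk cluster. */
static int has_quorum(int cluster_size, int alive)
{
    return 2 * alive > cluster_size;
}

int main(void)
{
    /* Partition {A, B}: C was removed after
     * wd_lost_node_to_remove_timeout, so the cluster size is 2. */
    printf("A/B partition: %d\n", has_quorum(2, 2));  /* 1 = quorum */

    /* Partition {C}: A and B were removed, so the cluster size is 1. */
    printf("C partition:   %d\n", has_quorum(1, 1));  /* 1 = quorum */

    /* Both partitions hold quorum at the same time: split-brain. */
    return 0;
}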

Am I missing something?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

> Hi,
> 
> I have been thinking about this issue, and I believe the concerns are
> genuine and we need to figure out a way around them.
> 
> IMHO one possible solution is to change how the watchdog does the
> quorum calculation and which nodes make up the watchdog cluster.
> 
> The current implementation calculates the quorum based on the number
> of configured watchdog nodes and the number of alive nodes. If we make
> the watchdog cluster adjust itself dynamically to the current
> situation, we can provide a better user experience.
> 
> As of now the watchdog cluster definition recognises a node as either
> alive or absent, and the number of alive nodes needs to be greater
> than half of the total number of configured nodes for the quorum to
> hold.
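>
> For reference, the current rule is roughly the following (a simplified
> sketch for illustration, not the actual watchdog source):
>
> /* The denominator is always the number of *configured* nodes, so
>  * nodes that were never started still count against the quorum. */
> static int static_quorum(int configured_nodes, int alive_nodes)
> {
>     return 2 * alive_nodes > configured_nodes;
> }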
> 
> So my suggestion is that, instead of using a binary status, we
> consider that a watchdog node can be in one of three states, 'Alive',
> 'Dead' or 'Lost', and that all dead nodes are considered not part of
> the current cluster.
> 
> Consider the example where we have 5 configured watchdog nodes. With
> the current implementation the quorum requires 3 alive nodes.
>
> Now suppose we have started only 3 nodes. That is good enough for the
> cluster to hold the quorum, and one of the nodes will eventually
> acquire the VIP, so no problem there. But as soon as we shut down one
> of the nodes, or it becomes 'Lost', the cluster loses the quorum and
> releases the VIP, making the service unavailable.
> 
> Now consider the same scenario with the above-mentioned new definition
> of the watchdog cluster. When we initially start 3 nodes out of 5, the
> cluster marks the remaining two nodes as dead (after a configurable
> time) and removes them from the cluster until one of those nodes is
> started and connects to the cluster. So after that configured time,
> even though we have 5 configured watchdog nodes, the cluster
> dynamically adjusts itself and considers itself to consist of only 3
> nodes (instead of 5), which requires only 2 nodes to be alive for the
> quorum.
> 
> By this new definition, if one of the nodes gets lost, the cluster
> still holds the quorum, since it considers itself to consist of 3
> nodes. That lost node will in turn be marked as dead after a
> configured amount of time, eventually shrinking the cluster size
> further to 2 nodes. Similarly, when some previously dead node joins
> the cluster, the cluster expands itself again to accommodate that
> node.
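>
> Roughly, the membership-aware check could look like this (a minimal
> sketch of the idea, not an actual patch; all names are illustrative):
>
> typedef enum { WD_NODE_ALIVE, WD_NODE_LOST, WD_NODE_DEAD } wd_node_state;
>
> /* Dead nodes are excluded from the cluster size; lost nodes still
>  * count against the quorum until they time out and become dead. */
> static int dynamic_quorum(const wd_node_state *nodes, int configured)
> {
>     int members = 0;
>     int alive = 0;
>
>     for (int i = 0; i < configured; i++)
>     {
>         if (nodes[i] == WD_NODE_DEAD)
>             continue;             /* not part of the current cluster */
>         members++;
>         if (nodes[i] == WD_NODE_ALIVE)
>             alive++;
>     }
>     return 2 * alive > members;   /* strict majority of current members */
> }
>
> With 5 configured nodes, 2 of them dead and 1 lost, this still reports
> a quorum (2 alive out of 3 current members), which matches the
> behaviour described above.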
> 
> On top of that, if some watchdog node is properly shut down, it is
> immediately marked as dead and removed from the cluster.
> 
> Of course, this is not bullet-proof and comes with the risk of a
> split-brain in a few network-partitioning scenarios, but I think it
> would work in 99% of cases.
> 
> This new implementation would require two new (proposed) configuration
> parameters:
> 1- wd_lost_node_to_remove_timeout (seconds)
> 2- wd_initial_node_showup_time (seconds)
> 
> We can also implement a new PCP command to force a lost node to be
> marked as dead.
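>
> For illustration, the lost-to-dead transition might be driven by
> something like this (again only a sketch; the timeout parameter name
> is the proposed one above, everything else is made up):
>
> #include <time.h>
>
> typedef struct
> {
>     wd_node_state state;      /* Alive / Lost / Dead, as sketched above */
>     time_t        lost_since; /* when the node stopped responding */
> } wd_node;
>
> /* Mark a LOST node as DEAD once it has been unreachable for
>  * wd_lost_node_to_remove_timeout seconds, or immediately when an
>  * operator forces it via the proposed PCP command. */
> static void maybe_remove_lost_node(wd_node *node,
>                                    int wd_lost_node_to_remove_timeout,
>                                    int forced_by_pcp)
> {
>     if (node->state != WD_NODE_LOST)
>         return;
>     if (forced_by_pcp ||
>         time(NULL) - node->lost_since >= wd_lost_node_to_remove_timeout)
>         node->state = WD_NODE_DEAD;   /* shrink the cluster */
> }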
> 
> Thoughts and suggestions?
> 
> Thanks
> Best regards
> Muhammad Usama
> 
> On Tue, May 11, 2021 at 7:18 AM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> 
>> Hi Pgpool-II developers,
>>
>> Recently we got the complaint below from a user.
>>
>> Currently Pgpool-II releases the VIP if the quorum is lost. This is
>> reasonable and safe, since it prevents split-brain problems.
>>
>> However, I feel it would be nice if there were a way to hold the VIP
>> even if the quorum is lost, for emergencies.
>>
>> Suppose we have a 3-node pgpool cluster, with each node in a
>> different city. Two of the cities are knocked out by an earthquake,
>> and the user wants to keep their business running on the remaining
>> node. Of course we could disable watchdog and restart pgpool so that
>> applications can connect to pgpool directly. However, in this case
>> applications need to change the IP address they connect to.
>>
>> Also, as the user pointed out, with a 2-node configuration the VIP
>> can still be used by enabling enable_consensus_with_half_vote even if
>> only 1 node remains. It seems as if a 2-node config were better than
>> a 3-node config in this regard. Of course this is not true, since a
>> 3-node config is much more resistant to split-brain problems.
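>>
>> To spell out the arithmetic (a simplified reading of the parameter,
>> written for illustration only):
>>
>> /* With enable_consensus_with_half_vote, exactly half of the
>>  * configured votes is enough; otherwise a strict majority is
>>  * required. */
>> static int quorum_exists(int configured, int alive, int half_vote)
>> {
>>     return half_vote ? (2 * alive >= configured)  /* 2 nodes: 1 alive is enough  */
>>                      : (2 * alive > configured);  /* 3 nodes: 2 alive are needed */
>> }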
>>
>> I think there are multiple ways to deal with the problem:
>>
>> 1) invent a new config parameter so that pgpool keeps the VIP even if
>> the quorum is lost.
>>
>> 2) add a new pcp command which re-attaches the VIP after it is lost
>> due to loss of the quorum.
>>
>> #1 could easily create duplicate VIPs. #2 looks better, but when other
>> nodes come up, duplicate VIPs could still be created.
>>
>> Thoughts?
>>
>> Best regards,
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese:http://www.sraoss.co.jp
>>
>> > Dear all,
>> >
>> > I have a fairly common 3-node cluster, with each node running a
>> > PgPool and a PostgreSQL instance.
>> >
>> > I have set up priorities so that:
>> >   - when all 3 nodes are up, the 1st node will have the VIP,
>> >   - when the 1st node is down, the 2nd node will have the VIP, and
>> >   - when both the 1st and the 2nd nodes are down, the 3rd node will
>> > get the VIP.
>> >
>> > My problem is that when only 1 node is up, the VIP is not brought
>> > up, because there is no quorum. How can I get PgPool to bring up the
>> > VIP on the only remaining node, which could and should still serve
>> > requests?
>> >
>> > Regards,
>> >
>> > tamas
>> >
>> > --
>> > Rébeli-Szabó Tamás
>> >
>> > _______________________________________________
>> > pgpool-general mailing list
>> > pgpool-general at pgpool.net
>> > http://www.pgpool.net/mailman/listinfo/pgpool-general
>> _______________________________________________
>> pgpool-hackers mailing list
>> pgpool-hackers at pgpool.net
>> http://www.pgpool.net/mailman/listinfo/pgpool-hackers
>>

