Difference between revisions of "watchdog feature enhancement"
Revision as of 15:30, 15 June 2015
What's driving this enhancement
- Watchdog is a very important feature of pgpool-II, as it is used to eliminate the single point of failure and provide HA. However, there are a few feature requests and bugs in the existing watchdog that require more than a simple code fix and call for a complete revisit of its core architecture. This enhancement therefore aims to bring stability and robustness to the existing pgpool-II watchdog, along with some cool new features.
- -- [pgpool-general: 3724] delegate ip lost
- -- [pgpool-II 0000135]: Delegate IP does not get up on Standby upon Active gets disconnected (same in pgpool-general: 3736)
- -- Split-brain scenario due to network partitioning
- -- [pgpool-general: 3595] Watchdog issue.
- -- [pgpool-general: 3443] watchdog on cloud
- -- [pgpool-general: 3126] watchdog voting
- -- [pgpool-general: 2985] Re: Connections stuck in CLOSE_WAIT, again
- -- [pgpool-general: 2949] Re: pgpool 3.3.3 watchdog problem
- -- [pgpool-general: 2797] pcp_watchdog_info parameters
- -- [pgpool-general: 2768] timeout Watchdog
- -- [pgpool-general: 2427] watchdog quorum
- -- [pgpool-general: 2418] Re: watchdog: different statuses on different pgpool nodes.
- -- [pgpool-general: 3772] Race condition for VIP assignment
- -- Lots of questions on whether suid or root privileges are required ([pgpool-general: 3323] Re: Watchdog - ifconfig up failed)
- -- Users want an ACTIVE-ACTIVE pgpool-II configuration, plus miscellaneous comments on the difficulty of configuring the watchdog
- Summary
- An analysis of the above pgpool-II community threads related to watchdog comes down to four main areas where the current pgpool-II watchdog requires enhancement.
- 1-- Virtual IP (VIP) assignment and handling the loss of the VIP
- 2-- Split-brain scenario, recovery from it and watchdog quorum
- 3-- Users run into misconfigured watchdog situations very often.
- 4-- Users want watchdog on cloud and active-active watchdog configurations
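Item 2 can be illustrated with the usual majority rule: a partition acts only if it can see more than half of the configured nodes. The following is a minimal sketch of that general idea, not pgpool-II's actual implementation; the function name `has_quorum` and its parameters are hypothetical.

```python
def has_quorum(reachable_nodes: int, total_nodes: int) -> bool:
    """Return True if this partition holds a strict majority of the cluster.

    reachable_nodes counts the local node plus every peer it can still reach.
    Hypothetical helper for illustration; not a pgpool-II API.
    """
    if total_nodes <= 0 or not 0 <= reachable_nodes <= total_nodes:
        raise ValueError("invalid node counts")
    # Strict majority: an even split (e.g. 1 of 2, 2 of 4) does not qualify.
    return reachable_nodes > total_nodes // 2
```

With a two-node cluster split one-and-one, neither side holds a majority; with three nodes, at most one side of a partition can.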
- Still open Issues
- -- [pgpool-general: 3724] delegate ip lost
- -- [pgpool-general: 3772] Race condition for VIP assignment
- -- [pgpool-general: 3228] Split brain or using 3 nodes ?
- -- [pgpool-general: 3728] Re: pgpool-general Digest, Vol 43, Issue 17
What is required by the watchdog?
The main purpose of the watchdog in pgpool-II is to provide high availability. For this purpose, the watchdog is required to ensure the following:
-- Ensure only healthy nodes are part of the cluster
-- Ensure only authorized nodes can become members of the cluster
-- Ensure only one pgpool-II node is the designated master node at any time
-- Provide an automatic recovery mechanism, where possible, when a problem occurs
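As a rough illustration of how the first three requirements fit together, the sketch below filters a node list down to healthy, authorized members and then picks exactly one master deterministically. All names (`Node`, `auth_key`, `priority`, `eligible_members`, `elect_master`) are hypothetical and chosen for this example; it is not how pgpool-II implements its watchdog.

```python
import hmac
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    name: str
    healthy: bool   # result of a (hypothetical) health check
    auth_key: str   # shared secret presented by the node (assumption)
    priority: int   # higher value wins the election (assumption)


def eligible_members(nodes: List[Node], cluster_key: str) -> List[Node]:
    """Keep only nodes that are healthy and present the expected key."""
    expected = cluster_key.encode()
    return [
        n for n in nodes
        # compare_digest avoids leaking key contents through timing
        if n.healthy and hmac.compare_digest(n.auth_key.encode(), expected)
    ]


def elect_master(nodes: List[Node], cluster_key: str) -> Optional[Node]:
    """Pick a single master: highest priority, ties broken by name."""
    members = eligible_members(nodes, cluster_key)
    if not members:
        return None
    return min(members, key=lambda n: (-n.priority, n.name))
```

If every node applies the same rule to the same member list, they all arrive at the same master. Under that assumption, automatic recovery could be as simple as re-running the election whenever membership changes.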