<div dir="ltr"><div>Hi Ishii San</div><div><br></div><div>4.1 and 4.0 were causing a few conflicts with cherry-picking, and it was late. I pushed the same for both</div><div><br></div><div>Thanks</div><div>Best regards</div><div>Muhammad Usama</div><div><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Dec 20, 2022 at 5:16 AM Tatsuo Ishii <<a href="mailto:ishii@sraoss.co.jp">ishii@sraoss.co.jp</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Usama,<br>

<br>

Thank you for pushing your fix but it seems the fix was only pushed<br>

down to 4.2 stable.  Since the quorum failover was introduced in 3.7,<br>

I think we need to push the fix to 4.1 and 4.0 at least (3.7 will be<br>

obsoleted soon, and we may not need to push to 3.7).<br>

<br>

Best reagards,<br>

--<br>

Tatsuo Ishii<br>

SRA OSS LLC<br>

English: <a href="http://www.sraoss.co.jp/index_en/" rel="noreferrer" target="_blank">http://www.sraoss.co.jp/index_en/</a><br>

Japanese:<a href="http://www.sraoss.co.jp" rel="noreferrer" target="_blank">http://www.sraoss.co.jp</a><br>

<br>

<br>

> Thank you! Looks perfect (and my test passed).<br>

> <br>

> Best reagards,<br>

> --<br>

> Tatsuo Ishii<br>

> SRA OSS LLC<br>

> English: <a href="http://www.sraoss.co.jp/index_en/" rel="noreferrer" target="_blank">http://www.sraoss.co.jp/index_en/</a><br>

> Japanese:<a href="http://www.sraoss.co.jp" rel="noreferrer" target="_blank">http://www.sraoss.co.jp</a><br>

> <br>

> <br>

>> Hi<br>

>> Sorry for the inconvenience. I guess I attached the wrong patch file<br>

>> previously.<br>

>> Please find the latest one that also changes the DEBUG to LOG, as suggested.<br>

>> <br>

>> Thanks<br>

>> best Regards<br>

>> Muhammad Usama<br>

>> <br>

>> <br>

>> On Mon, Dec 19, 2022 at 9:07 AM Tatsuo Ishii <<a href="mailto:ishii@sraoss.co.jp" target="_blank">ishii@sraoss.co.jp</a>> wrote:<br>

>> <br>

>>> > Hi Ishii San<br>

>>> ><br>

>>> > Thanks for the feedback. Please find the updated patch that takes care of<br>

>>> > transferring the per node health check params through WD_POOL_CONFIG_DATA<br>

>>> > and use those in calculations.<br>

>>> > Could you verify if the attached fixes the issue?<br>

>>><br>

>>> Sure.<br>

>>><br>

>>> The patch adds trailing spaces.<br>

>>><br>

>>> /home/t-ishii/fix_failover_command_timout_v2.diff:68: trailing whitespace.<br>

>>>                         int pn_failover_command_timeout =<br>

>>> pool_config->health_check_params[i].health_check_period +<br>

>>> /home/t-ishii/fix_failover_command_timout_v2.diff:69: trailing whitespace.<br>

>>><br>

>>> (pool_config->health_check_params[i].health_check_retry_delay *<br>

>>> /home/t-ishii/fix_failover_command_timout_v2.diff:118: trailing whitespace.<br>

>>>                         health_check_params_count =<br>

>>> config->backend_desc->num_backends;<br>

>>> warning: 3 lines add whitespace errors.<br>

>>><br>

>>> There is a compile error:<br>

>>> watchdog.c:8203:10: error: 。DEBUG「 undeclared (first use in this<br>

>>> function); did you mean 。DEBUG1「?<br>

>>>  8203 |  ereport(DEBUG,(errmsg("Setting failover command timeout to<br>

>>> %d",failover_command_timeout)));<br>

>>>       |          ^~~~~<br>

>>><br>

>>> We'd better to use "LOG", instead DEBUG* here? Because:<br>

>>> - the log message is not frequent<br>

>>> - the timeout value is an important information<br>

>>><br>

>>> What do you think?<br>

>>><br>

>>> > As for standard_packet_processor() is concerned, it Is intended to handle<br>

>>> > messages<br>

>>> > that do not require to be handled differently for each watchdog state.<br>

>>><br>

>>> Ok. Thank you for the explanation.<br>

>>><br>

>>> Best reagards,<br>

>>> --<br>

>>> Tatsuo Ishii<br>

>>> SRA OSS LLC<br>

>>> English: <a href="http://www.sraoss.co.jp/index_en/" rel="noreferrer" target="_blank">http://www.sraoss.co.jp/index_en/</a><br>

>>> Japanese:<a href="http://www.sraoss.co.jp" rel="noreferrer" target="_blank">http://www.sraoss.co.jp</a><br>

>>><br>

>>> > Thanks<br>

>>> > Best Regards<br>

>>> > Muhammad Usama<br>

>>> ><br>

>>> ><br>

>>> > On Sat, Dec 17, 2022 at 5:01 PM Tatsuo Ishii <<a href="mailto:ishii@sraoss.co.jp" target="_blank">ishii@sraoss.co.jp</a>> wrote:<br>

>>> ><br>

>>> >> >>> On Tue, Nov 29, 2022 at 3:27 AM Tatsuo Ishii <<a href="mailto:ishii@sraoss.co.jp" target="_blank">ishii@sraoss.co.jp</a>><br>

>>> >> wrote:<br>

>>> >> >>><br>

>>> >> >>>> >> Hi Ishii-San<br>

>>> >> >>>> >><br>

>>> >> >>>> >> Sorry for the delayed response.<br>

>>> >> >>>> ><br>

>>> >> >>>> > No problem.<br>

>>> >> >>>> ><br>

>>> >> >>>> >> With the attached fix I guess the failover objects will linger<br>

>>> on<br>

>>> >> >>>> forever<br>

>>> >> >>>> >> in case of a false alarm by a health check or small glitch.<br>

>>> >> >>>> ><br>

>>> >> >>>> > That's not good.<br>

>>> >> >>>> ><br>

>>> >> >>>> >> One way to get around the issue could be to compute<br>

>>> >> >>>> >> FAILOVER_COMMAND_FINISH_TIMEOUT based on the maximum value<br>

>>> >> >>>> >> of health_check_peroid across the cluster.<br>

>>> >> >>>> >> something like: failover_command_finish_timouut =<br>

>>> >> >>>> max(health_check_period)<br>

>>> >> >>>> >> * 2 = 60<br>

>>> >> >>>><br>

>>> >> >>>> After thinking more, I think we need to take account<br>

>>> >> >>>> health_check_max_retries and health_check_retry_delay as<br>

>>> >> >>>> well. i.e. instead of max(health_check_period), something like:<br>

>>> >> >>>> max(health_check_period + (health_check_retry_delay *<br>

>>> >> >>>> health_check_max_retries)).<br>

>>> >> >>>><br>

>>> >> >>>> What do you think?<br>

>>> >> >>>><br>

>>> >> >>><br>

>>> >> >>> Thanks for the valuable suggestions.<br>

>>> >> >>> Can you try out the attached patch to see if it solves the issue?<br>

>>> >> >><br>

>>> >> >> Unfortunately the patch did not pass my test case.<br>

>>> >> >><br>

>>> >> >> - 3 watchdog nodes and 2 PostgreSQL servers, streaming replication<br>

>>> >> >>   cluster (created by watchdog_setup). pgpool0 is the watchdog<br>

>>> leader.<br>

>>> >> >><br>

>>> >> >> - health_check_period = 300, health_check_max_retries = 0<br>

>>> >> >><br>

>>> >> >> - pgpool1 starts 120 seconds after pgpool0 starts<br>

>>> >> >><br>

>>> >> >> - pgpool2 does not start<br>

>>> >> >><br>

>>> >> >> - after watchdog cluster becomes ready, shutdown PostgreSQL node 1<br>

>>> >> (standby).<br>

>>> >> >><br>

>>> >> >> - wait for 600 seconds to expect a failover.<br>

>>> >> >><br>

>>> >> >> Unfortunately failover did not happen.<br>

>>> >> >><br>

>>> >> >> Attached is the test script and pgpool0 log.<br>

>>> >> >><br>

>>> >> >> To run the test:<br>

>>> >> >><br>

>>> >> >> - unpack test.tar.gz<br>

>>> >> >><br>

>>> >> >> - run prepare.sh<br>

>>> >> >>   $ sh prepare.sh<br>

>>> >> >>   This should create "testdir" directory with 3 watchdog node +<br>

>>> >> PostgreSQL 2 node cluster.<br>

>>> >> >><br>

>>> >> >> - cd testdir and run the test<br>

>>> >> >>   $ sh ../<a href="http://start.sg" rel="noreferrer" target="_blank">start.sg</a> -o 120<br>

>>> >> >>   This will start the test, "-o" specifies how long wait before<br>

>>> >> strating pgpool1.<br>

>>> >> ><br>

>>> >> > After the test failure, I examined the pgpool log on the pgpool leader<br>

>>> >> > node (node 0). It seems timeout was not updated as expected.<br>

>>> >> ><br>

>>> >> > 2022-12-17 08:07:11.419: watchdog pid 707483: LOG:  failover request<br>

>>> >> from 1 nodes with ID:42 is expired<br>

>>> >> > 2022-12-17 08:07:11.419: watchdog pid 707483: DETAIL:  marking the<br>

>>> >> failover object for removal. timeout: 15<br>

>>> >> ><br>

>>> >> > After looking into the code, I found update_failover_timeout() only<br>

>>> >> > examines "health_check_period".  I think you need to examine<br>

>>> >> > "health_check_period0" etc. as well and find the larget one for the<br>

>>> >> > timeout caliculation.<br>

>>> >> ><br>

>>> >> > By the way,<br>

>>> >> ><br>

>>> >> >> failover_command_timout<br>

>>> >> >> g_cluster.failover_command_timout<br>

>>> >> ><br>

>>> >> > I think "timout" should be "timeout".<br>

>>> >><br>

>>> >> I was trying to create a proof of concept patch for this:<br>

>>> >><br>

>>> >> > After looking into the code, I found update_failover_timeout() only<br>

>>> >> > examines "health_check_period".  I think you need to examine<br>

>>> >> > "health_check_period0" etc. as well and find the larget one for the<br>

>>> >> > timeout caliculation.<br>

>>> >><br>

>>> >> and noticed that update_failover_timeout() is called by<br>

>>> >> standard_packet_processor() when WD_POOL_CONFIG_DATA packet is<br>

>>> >> received like: update_failover_timeout(wdNode, standby_config); But<br>

>>> >> standby_config was created by get_pool_config_from_json() which does<br>

>>> >> not seem to create health_check_params in pool_config data. Also I<br>

>>> >> wonder why standard_packet_processor() needs to be called when<br>

>>> >> WD_POOL_CONFIG_DATA is recieved. Can't we simply omit the call to<br>

>>> >> update_failover_timeout() in this case?<br>

>>> >><br>

>>> >> Best reagards,<br>

>>> >> --<br>

>>> >> Tatsuo Ishii<br>

>>> >> SRA OSS LLC<br>

>>> >> English: <a href="http://www.sraoss.co.jp/index_en/" rel="noreferrer" target="_blank">http://www.sraoss.co.jp/index_en/</a><br>

>>> >> Japanese:<a href="http://www.sraoss.co.jp" rel="noreferrer" target="_blank">http://www.sraoss.co.jp</a><br>

>>> >><br>

>>><br>

> _______________________________________________<br>

> pgpool-hackers mailing list<br>

> <a href="mailto:pgpool-hackers@pgpool.net" target="_blank">pgpool-hackers@pgpool.net</a><br>

> <a href="http://www.pgpool.net/mailman/listinfo/pgpool-hackers" rel="noreferrer" target="_blank">http://www.pgpool.net/mailman/listinfo/pgpool-hackers</a><br>

</blockquote></div></div>