<div dir="ltr"><div>Hi Bo,</div>Thank you for your response!<div><br></div><div>> Could you try to disable wd_monitoring_interfaces_list and see if this issue is improved?<br></div><div>No. I actually added this setting to see whether it would improve the situation :)</div><div>Without this setting, the behavior is exactly the same.</div><div><br></div><div>What I notice is that to make the 'stuck/frozen' commands (watchdog + psql show pool_nodes) respond again,</div><div>I need to reboot the entire VM.</div><div>Is there a way to make the commands respond without a reboot?</div><div>That way the nodes would at least be in a 'down' state from the watchdog's perspective (instead of SHUTDOWN).</div><div><br></div><div>My end goal is to achieve auto-remediation through pgpool 4.1, as your post describes here:</div><div><a href="https://b-peng.blogspot.com/2022/02/auto-failback.html#:~:text=To%20use%20this%20automatic%20failback,9.1%20or%20later%20is%20required">https://b-peng.blogspot.com/2022/02/auto-failback.html#:~:text=To%20use%20this%20automatic%20failback,9.1%20or%20later%20is%20required</a>.</div><div><br></div><div>I am also considering a cron job or similar to re-attach the node to the pool.</div><div><br clear="all"><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><b>Thanks</b><br></div><i>Gopi</i><br></div></div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, May 1, 2023 at 1:45 PM Bo Peng <<a href="mailto:pengbo@sraoss.co.jp">pengbo@sraoss.co.jp</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hello,<br>

<br>

This is by design: Pgpool-II immediately detects the failure and shuts down when a network failure occurs.<br>

This is a solution to avoid split brain.<br>

<br>

And wd_interval and wd_heartbeat_deadtime don't affect this behavior.<br>

<br>

Could you try to disable wd_monitoring_interfaces_list and see if this issue is improved?<br>

<br>

  wd_monitoring_interfaces_list = ''<br>

<br>

On Mon, 1 May 2023 10:36:15 +0530<br>

Gopikrishnan <<a href="mailto:nksgopikrishnan@gmail.com" target="_blank">nksgopikrishnan@gmail.com</a>> wrote:<br>

<br>

> Hi all,<br>

> <br>

> Any pointers would be helpful.<br>

> Currently, the pgpool is highly unstable due to momentary network<br>

> disruptions.<br>

> <br>

> While it looks like pgpool was already built with that in mind,<br>

> It is not working correctly for me due to a bug or misconfiguration.<br>

> <br>

> *Thanks*<br>

> *Gopi*<br>

> <br>

> <br>

> On Wed, Apr 19, 2023 at 2:08 PM Gopikrishnan <<a href="mailto:nksgopikrishnan@gmail.com" target="_blank">nksgopikrishnan@gmail.com</a>><br>

> wrote:<br>

> <br>

> > Thank you for your response.<br>

> ><br>

> > Answers to your questions:<br>

> > 1. I am using pgpool 4.0.4<br>

> > 2. DEBUG was specifically enabled to debug this issue. (PCP commands<br>

> > frozen)<br>

> > 3. Yes, all the mentioned properties are enabled. (All the pgpool<br>

> > configurations are below for the reference)<br>

> ><br>

> > >If the network goes down, watchdog will detect the network failure and<br>

> > shutdown itself.<br>

> > >To avoid such problems, it is recommended to shutdown pgpool before<br>

> > restarting network.<br>

> ><br>

> > 1. The network was not down; It got restarted. i.e. it came back up in no<br>

> > time (Within seconds)<br>

> > In my current understanding, the watchdog settings: wd_interval and wd_life_point<br>

> > should have covered/tolerated this network downtime?<br>

> > 2. Most of the time in my prod environment, the restart or a glitch in<br>

> > network is not in the application control, to pre-emptively stop pgPool.<br>

> ><br>

> > My follow-up questions:<br>

> > -------<br>

> > 1. Am I hitting a bug in pgPool?<br>

> > 2. Is this scenario (Network glitch) handled better in a newer PgPool<br>

> > version? (So that I can upgrade, if possible with minimal changes to the<br>

> > confs.)<br>

> > -------------------------<br>

> > allow_clear_text_frontend_auth = off<br>

> > allow_multiple_failover_requests_from_node = off<br>

> > allow_sql_comments = off<br>

> > app_name_redirect_preference_list = ''<br>

> > arping_cmd = 'arping -U $_IP_$ -w 1'<br>

> > arping_path = '/sbin'<br>

> > authentication_timeout = 60<br>

> > backend_data_directory0 = '/db/data'<br>

> > backend_data_directory1 = '/db/data'<br>

> > backend_data_directory2 = '/db/data'<br>

> > backend_flag0 = 'ALLOW_TO_FAILOVER'<br>

> > backend_flag1 = 'ALLOW_TO_FAILOVER'<br>

> > backend_flag2 = 'ALLOW_TO_FAILOVER'<br>

> > backend_hostname0 = '10.108.104.31'<br>

> > backend_hostname1 = '10.108.104.32'<br>

> > backend_hostname2 = '10.108.104.33'<br>

> > backend_port0 = 5432<br>

> > backend_port1 = 5432<br>

> > backend_port2 = 5432<br>

> > backend_weight0 = 1<br>

> > backend_weight1 = 1<br>

> > backend_weight2 = 1<br>

> > black_function_list = 'currval,lastval,nextval,setval'<br>

> > black_memqcache_table_list = ''<br>

> > black_query_pattern_list = ''<br>

> > check_temp_table = on<br>

> > check_unlogged_table = on<br>

> > child_life_time = 300<br>

> > child_max_connections = 0<br>

> > clear_memqcache_on_escalation = on<br>

> > client_idle_limit = 0<br>

> > client_idle_limit_in_recovery = 0<br>

> > connect_timeout = 10000<br>

> > connection_cache = on<br>

> > connection_life_time = 0<br>

> > database_redirect_preference_list = ''<br>

> > delay_threshold = 10000000<br>

> > delegate_IP = ''<br>

> > detach_false_primary = off<br>

> > disable_load_balance_on_write = 'transaction'<br>

> > enable_pool_hba = off<br>

> > failback_command = ''<br>

> > failover_command = '/usr/local/etc/failover.sh %d %h %p %D %m %H %M %P %r<br>

> > %R'<br>

> > failover_if_affected_tuples_mismatch = off<br>

> > failover_on_backend_error = on<br>

> > failover_require_consensus = on<br>

> > failover_when_quorum_exists = on<br>

> > follow_master_command = '/usr/local/etc/follow_master.sh %d %h %p %D %m %M<br>

> > %H %P %r %R'<br>

> > health_check_database = ''<br>

> > health_check_max_retries = 3<br>

> > health_check_password = 'e2f2da4a027a41bf8517406dd9ca970e'<br>

> > health_check_period = 5<br>

> > health_check_retry_delay = 1<br>

> > health_check_timeout = 30<br>

> > health_check_user = 'pgpool'<br>

> > heartbeat_destination0 = '10.108.104.32'<br>

> > heartbeat_destination1 = '10.108.104.33'<br>

> > heartbeat_destination_port0 = 9694<br>

> > heartbeat_destination_port1 = 9694<br>

> > heartbeat_device0 = ''<br>

> > heartbeat_device1 = ''<br>

> > if_down_cmd = ''<br>

> > if_up_cmd = ''<br>

> > ifconfig_path = '/sbin'<br>

> > ignore_leading_white_space = on<br>

> > insert_lock = off<br>

> > listen_addresses = '*'<br>

> > listen_backlog_multiplier = 2<br>

> > load_balance_mode = on<br>

> > lobj_lock_table = ''<br>

> > log_client_messages = off<br>

> > log_connections = off<br>

> > log_destination = 'syslog'<br>

> > log_hostname = off<br>

> > log_line_prefix = '%t: pid %p: '<br>

> > log_per_node_statement = off<br>

> > log_standby_delay = 'if_over_threshold'<br>

> > log_statement = off<br>

> > logdir = '/tmp'<br>

> > master_slave_mode = on<br>

> > master_slave_sub_mode = 'stream'<br>

> > max_pool = 4<br>

> > memory_cache_enabled = off<br>

> > memqcache_auto_cache_invalidation = on<br>

> > memqcache_cache_block_size = 1048576<br>

> > memqcache_expire = 0<br>

> > memqcache_max_num_cache = 1000000<br>

> > memqcache_maxcache = 409600<br>

> > memqcache_memcached_host = 'localhost'<br>

> > memqcache_memcached_port = 11211<br>

> > memqcache_method = 'shmem'<br>

> > memqcache_oiddir = '/var/log/pgpool/oiddir'<br>

> > memqcache_total_size = 67108864<br>

> > num_init_children = 32<br>

> > other_pgpool_hostname0 = '10.108.104.32'<br>

> > other_pgpool_hostname1 = '10.108.104.33'<br>

> > other_pgpool_port0 = 9999<br>

> > other_pgpool_port1 = 9999<br>

> > other_wd_port0 = 9000<br>

> > other_wd_port1 = 9000<br>

> > pcp_listen_addresses = '*'<br>

> > pcp_port = 9898<br>

> > pcp_socket_dir = '/tmp'<br>

> > pid_file_name = '/var/run/pgpool/pgpool.pid'<br>

> > ping_path = '/bin'<br>

> > pool_passwd = 'pool_passwd'<br>

> > port = 9999<br>

> > recovery_1st_stage_command = 'recovery_1st_stage'<br>

> > recovery_2nd_stage_command = ''<br>

> > recovery_password = 'ZPH3Xnuh8ISKMZjSqLvIBQe_WTOzXbPF'<br>

> > recovery_timeout = 90<br>

> > recovery_user = 'postgres'<br>

> > relcache_expire = 0<br>

> > relcache_size = 256<br>

> > replicate_select = off<br>

> > replication_mode = off<br>

> > replication_stop_on_mismatch = off<br>

> > reset_query_list = 'ABORT; DISCARD ALL'<br>

> > search_primary_node_timeout = 300<br>

> > serialize_accept = off<br>

> > socket_dir = '/var/run/pgpool/socket'<br>

> > sr_check_database = 'postgres'<br>

> > sr_check_password = 'e2f2da4a027a41bf8517406dd9ca970e'<br>

> > sr_check_period = 10<br>

> > sr_check_user = 'pgpool'<br>

> > ssl = off<br>

> > ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL'<br>

> > ssl_prefer_server_ciphers = off<br>

> > syslog_facility = 'LOCAL1'<br>

> > syslog_ident = 'pgpool'<br>

> > trusted_servers = ''<br>

> > use_watchdog = on<br>

> > wd_authkey = ''<br>

> > wd_de_escalation_command = '/usr/local/etc/desc.sh'<br>

> > wd_escalation_command = '/usr/local/etc/esc.sh'<br>

> > wd_heartbeat_deadtime = 30<br>

> > wd_heartbeat_keepalive = 2<br>

> > wd_heartbeat_port = 9694<br>

> > wd_hostname = '10.108.104.31'<br>

> > wd_interval = 10<br>

> > wd_ipc_socket_dir = '/tmp'<br>

> > wd_life_point = 3<br>

> > wd_lifecheck_dbname = 'template1'<br>

> > wd_lifecheck_method = 'heartbeat'<br>

> > wd_lifecheck_password = ''<br>

> > wd_lifecheck_query = 'SELECT 1'<br>

> > wd_lifecheck_user = 'nobody'<br>

> > wd_monitoring_interfaces_list = 'any'<br>

> > wd_port = 9000<br>

> > wd_priority = 1<br>

> > white_function_list = ''<br>

> > white_memqcache_table_list = ''<br>

> ><br>

> ><br>

> ><br>

> ><br>

> ><br>

> > *Thanks*<br>

> > *Gopi*<br>

> ><br>

> ><br>

> > On Wed, Apr 19, 2023 at 1:13 PM Bo Peng <<a href="mailto:pengbo@sraoss.co.jp" target="_blank">pengbo@sraoss.co.jp</a>> wrote:<br>

> ><br>

> >> Hello,<br>

> >><br>

> >> > When I do:<br>

> >> > systemctl restart systemd-networkd<br>

> >> ><br>

> >> > After that, I am not able to execute any PCP commands like:<br>

> >> > pcp_watchdog_info<br>

> >> > It is frozen.<br>

> >><br>

> >> If the network goes down, watchdog will detect the network failure and<br>

> >> shutdown itself.<br>

> >> To avoid such problems, it is recommended to shutdown pgpool before<br>

> >> restarting network.<br>

> >><br>

> >> BTW, which version of Pgpool-II are you using?<br>

> >><br>

> >> > I tried restarting pgpool and postgres to no avail.<br>

> >> > However, rebooting the system gets it back to a workable state. (PCP<br>

> >> > commands are running again and I can attach the nodes back to the pool)<br>

> >> ><br>

> >> > The pgPool logs shows that the pg-pool was shutdown due to the network<br>

> >> > event:<br>

> >> > -------------------------------<br>

> >> ><br>

> >> > 2023-04-17T15:27:25.190949+00:00 vmvrlcm-104-32 g[3042]: [268-1]<br>

> >> 2023-04-17<br>

> >> > 15:27:25: pid 3042: DEBUG:  network event received<br>

> >> > 2023-04-17T15:27:25.191041+00:00 vmvrlcm-104-32 g[3042]: [268-2]<br>

> >> 2023-04-17<br>

> >> > 15:27:25: pid 3042: DETAIL:  deleted = YES Link change event = NO<br>

> >> > 2023-04-17T15:27:25.191186+00:00 vmvrlcm-104-32 g[3042]: [269-1]<br>

> >> 2023-04-17<br>

> >> > 15:27:25: pid 3042: DEBUG:  STATE MACHINE INVOKED WITH EVENT = NETWORK<br>

> >> IP<br>

> >> > IS REMOVED Current State = STANDBY<br>

> >> > 2023-04-17T15:27:25.191243+00:00 vmvrlcm-104-32 g[3042]: [270-1]<br>

> >> 2023-04-17<br>

> >> > 15:27:25: pid 3042: DEBUG:  network interface lo having flags 65609<br>

> >> > 2023-04-17T15:27:25.191296+00:00 vmvrlcm-104-32 g[3042]: [271-1]<br>

> >> 2023-04-17<br>

> >> > 15:27:25: pid 3042: DEBUG:  network interface eth0 having flags 69699<br>

> >> > 2023-04-17T15:27:25.191352+00:00 vmvrlcm-104-32 g[3042]: [272-1]<br>

> >> 2023-04-17<br>

> >> > 15:27:25: pid 3042: DEBUG:  network interface "eth0" link is active<br>

> >> > 2023-04-17T15:27:25.191401+00:00 vmvrlcm-104-32 g[3042]: [273-1]<br>

> >> 2023-04-17<br>

> >> > 15:27:25: pid 3042: DEBUG:  network interface "eth0" link is up<br>

> >> > 2023-04-17T15:27:25.191449+00:00 vmvrlcm-104-32 g[3042]: [274-1]<br>

> >> 2023-04-17<br>

> >> > 15:27:25: pid 3042: DEBUG:  network interface lo having flags 65609<br>

> >> > 2023-04-17T15:27:25.191497+00:00 vmvrlcm-104-32 g[3042]: [275-1]<br>

> >> 2023-04-17<br>

> >> > 15:27:25: pid 3042: DEBUG:  network interface "eth0" is up and we can<br>

> >> > continue<br>

> >> > 2023-04-17T15:27:25.191551+00:00 vmvrlcm-104-32 g[3042]: [276-1]<br>

> >> 2023-04-17<br>

> >> > 15:27:25: pid 3042: WARNING:  network IP is removed and system has no<br>

> >> IP is<br>

> >> > assigned<br>

> >> > 2023-04-17T15:27:25.191614+00:00 vmvrlcm-104-32 g[3042]: [276-2]<br>

> >> 2023-04-17<br>

> >> > 15:27:25: pid 3042: DETAIL:  changing the state to in network trouble<br>

> >> > 2023-04-17T15:27:25.191667+00:00 vmvrlcm-104-32 g[3042]: [277-1]<br>

> >> 2023-04-17<br>

> >> > 15:27:25: pid 3042: LOG:  watchdog node state changed from [STANDBY] to<br>

> >> [IN<br>

> >> > NETWORK TROUBLE]<br>

> >> > 2023-04-17T15:27:25.191713+00:00 vmvrlcm-104-32 g[3042]: [278-1]<br>

> >> 2023-04-17<br>

> >> > 15:27:25: pid 3042: DEBUG:  STATE MACHINE INVOKED WITH EVENT = STATE<br>

> >> > CHANGED Current State = IN NETWORK TROUBLE<br>

> >> > 2023-04-17T15:27:25.191759+00:00 vmvrlcm-104-32 g[3042]: [279-1]<br>

> >> 2023-04-17<br>

> >> > 15:27:25: pid 3042: FATAL:  system has lost the network<br>

> >> > 2023-04-17T15:27:25.191807+00:00 vmvrlcm-104-32 g[3042]: [280-1]<br>

> >> 2023-04-17<br>

> >> > 15:27:25: pid 3042: LOG:  Watchdog is shutting down<br>

> >> > 2023-04-17T15:27:25.191849+00:00 vmvrlcm-104-32 g[3042]: [281-1]<br>

> >> 2023-04-17<br>

> >> > 15:27:25: pid 3042: DEBUG:  sending packet, watchdog node:[<br>

> >> > <a href="http://vmvrlcm-104-31.eng.vmware.com:9999" rel="noreferrer" target="_blank">vmvrlcm-104-31.eng.vmware.com:9999</a> Linux <a href="http://vmvrlcm-104-31.eng.vmware.com" rel="noreferrer" target="_blank">vmvrlcm-104-31.eng.vmware.com</a>]<br>

> >> > command id:[10] type:[INFORM I AM GOING DOWN] state:[IN NETWORK TROUBLE]<br>

> >> > 2023-04-17T15:27:25.191894+00:00 vmvrlcm-104-32 g[3042]: [282-1]<br>

> >> 2023-04-17<br>

> >> > 15:27:25: pid 3042: DEBUG:  sending watchdog packet to socket:8,<br>

> >> type:[X],<br>

> >> > command ID:10, data Length:0<br>

> >> > 2023-04-17T15:27:25.191952+00:00 vmvrlcm-104-32 g[3042]: [283-1]<br>

> >> 2023-04-17<br>

> >> > 15:27:25: pid 3042: DEBUG:  sending packet, watchdog node:[<br>

> >> > <a href="http://vmvrlcm-104-33.eng.vmware.com:9999" rel="noreferrer" target="_blank">vmvrlcm-104-33.eng.vmware.com:9999</a> Linux <a href="http://vmvrlcm-104-33.eng.vmware.com" rel="noreferrer" target="_blank">vmvrlcm-104-33.eng.vmware.com</a>]<br>

> >> > command id:[10] type:[INFORM I AM GOING DOWN] state:[IN NETWORK TROUBLE]<br>

> >> > 2023-04-17T15:27:25.192001+00:00 vmvrlcm-104-32 g[3042]: [284-1]<br>

> >> 2023-04-17<br>

> >> > 15:27:25: pid 3042: DEBUG:  sending watchdog packet to socket:9,<br>

> >> type:[X],<br>

> >> > command ID:10, data Length:0<br>

> >> > 2023-04-17T15:27:25.192671+00:00 vmvrlcm-104-32 pgpool[3040]: [24-1]<br>

> >> > 2023-04-17 15:27:25: pid 3040: DEBUG:  reaper handler<br>

> >> > 2023-04-17T15:27:25.192753+00:00 vmvrlcm-104-32 pgpool[3040]: [25-1]<br>

> >> > 2023-04-17 15:27:25: pid 3040: DEBUG:  watchdog child process with pid:<br>

> >> > 3042 exit with FATAL ERROR. pgpool-II will be shutdown<br>

> >> > 2023-04-17T15:27:25.192803+00:00 vmvrlcm-104-32 pgpool[3040]: [26-1]<br>

> >> > 2023-04-17 15:27:25: pid 3040: LOG:  watchdog child process with pid:<br>

> >> 3042<br>

> >> > exits with status 768<br>

> >> > 2023-04-17T15:27:25.192864+00:00 vmvrlcm-104-32 pgpool[3040]: [27-1]<br>

> >> > 2023-04-17 15:27:25: pid 3040: FATAL:  watchdog child process exit with<br>

> >> > fatal error. exiting pgpool-II<br>

> >> > 2023-04-17T15:27:25.197530+00:00 vmvrlcm-104-32 ck[3157]: [23-1]<br>

> >> 2023-04-17<br>

> >> > 15:27:25: pid 3157: DEBUG:  lifecheck child receives shutdown request<br>

> >> > signal 2, forwarding to all children<br>

> >> > 2023-04-17T15:27:25.197611+00:00 vmvrlcm-104-32 ck[3157]: [24-1]<br>

> >> 2023-04-17<br>

> >> > 15:27:25: pid 3157: DEBUG:  lifecheck child receives fast shutdown<br>

> >> request<br>

> >> > 2023-04-17T15:27:25.197658+00:00 vmvrlcm-104-32 at sender[3159]: [148-1]<br>

> >> > 2023-04-17 15:27:25: pid 3159: DEBUG:  watchdog heartbeat sender child<br>

> >> > receives shutdown request signal 2<br>

> >> > 2023-04-17T15:27:25.197994+00:00 vmvrlcm-104-32 at sender[3163]: [148-1]<br>

> >> > 2023-04-17 15:27:25: pid 3163: DEBUG:  watchdog heartbeat sender child<br>

> >> > receives shutdown request signal 2<br>

> >> > 2023-04-17T15:27:25.199168+00:00 vmvrlcm-104-32 at receiver[3161]:<br>

> >> [18-1]<br>

> >> > 2023-04-17 15:27:25: pid 3161: DEBUG:  watchdog heartbeat receiver child<br>

> >> > receives shutdown request signal 2<br>

> >> > 2023-04-17T15:27:25.199567+00:00 vmvrlcm-104-32 at receiver[3158]:<br>

> >> [18-1]<br>

> >> > 2023-04-17 15:27:25: pid 3158: DEBUG:  watchdog heartbeat receiver child<br>

> >> > receives shutdown request signal 2<br>

> >> > 2023-04-17T15:27:25.448554+00:00 vmvrlcm-104-32 check process(2)[3197]:<br>

> >> > [386-1] 2023-04-17 15:27:25: pid 3197: DEBUG:  health check: clearing<br>

> >> alarm<br>

> >> > 2023-04-17T15:27:25.448689+00:00 vmvrlcm-104-32 check process(2)[3197]:<br>

> >> > [387-1] 2023-04-17 15:27:25: pid 3197: DEBUG:  SSL is requested but SSL<br>

> >> > support is not available<br>

> >> > 2023-04-17T15:27:25.450621+00:00 vmvrlcm-104-32 check process(2)[3197]:<br>

> >> > [388-1] 2023-04-17 15:27:25: pid 3197: DEBUG:  authenticate kind = 5<br>

> >> > 2023-04-17T15:27:25.451892+00:00 vmvrlcm-104-32 check process(2)[3197]:<br>

> >> > [389-1] 2023-04-17 15:27:25: pid 3197: DEBUG:  authenticate backend: key<br>

> >> > data received<br>

> >> > 2023-04-17T15:27:25.451987+00:00 vmvrlcm-104-32 check process(2)[3197]:<br>

> >> > [390-1] 2023-04-17 15:27:25: pid 3197: DEBUG:  authenticate backend:<br>

> >> > transaction state: I<br>

> >> > 2023-04-17T15:27:25.452043+00:00 vmvrlcm-104-32 check process(2)[3197]:<br>

> >> > [391-1] 2023-04-17 15:27:25: pid 3197: DEBUG:  health check: clearing<br>

> >> alarm<br>

> >> > 2023-04-17T15:27:25.452096+00:00 vmvrlcm-104-32 check process(2)[3197]:<br>

> >> > [392-1] 2023-04-17 15:27:25: pid 3197: DEBUG:  health check: clearing<br>

> >> alarm<br>

> >> > 2023-04-17T15:27:25.455020+00:00 vmvrlcm-104-32 check process(0)[3196]:<br>

> >> > [386-1] 2023-04-17 15:27:25: pid 3196: DEBUG:  health check: clearing<br>

> >> alarm<br>

> >> > 2023-04-17T15:27:25.455096+00:00 vmvrlcm-104-32 check process(0)[3196]:<br>

> >> > [387-1] 2023-04-17 15:27:25: pid 3196: DEBUG:  SSL is requested but SSL<br>

> >> > support is not available<br>

> >> > 2023-04-17T15:27:25.457196+00:00 vmvrlcm-104-32 check process(0)[3196]:<br>

> >> > [388-1] 2023-04-17 15:27:25: pid 3196: DEBUG:  authenticate kind = 5<br>

> >> > 2023-04-17T15:27:25.458437+00:00 vmvrlcm-104-32 check process(0)[3196]:<br>

> >> > [389-1] 2023-04-17 15:27:25: pid 3196: DEBUG:  authenticate backend: key<br>

> >> > data received<br>

> >> > 2023-04-17T15:27:25.458556+00:00 vmvrlcm-104-32 check process(0)[3196]:<br>

> >> > [390-1] 2023-04-17 15:27:25: pid 3196: DEBUG:  authenticate backend:<br>

> >> > transaction state: I<br>

> >> > 2023-04-17T15:27:25.458674+00:00 vmvrlcm-104-32 check process(0)[3196]:<br>

> >> > [391-1] 2023-04-17 15:27:25: pid 3196: DEBUG:  health check: clearing<br>

> >> alarm<br>

> >> > 2023-04-17T15:27:25.458742+00:00 vmvrlcm-104-32 check process(0)[3196]:<br>

> >> > [392-1] 2023-04-17 15:27:25: pid 3196: DEBUG:  health check: clearing<br>

> >> alarm<br>

> >> > 2023-04-17T15:27:30.452427+00:00 vmvrlcm-104-32 check process(2)[3197]:<br>

> >> > [393-1] 2023-04-17 15:27:30: pid 3197: DEBUG:  health check: clearing<br>

> >> alarm<br>

> >> > 2023-04-17T15:27:30.454041+00:00 vmvrlcm-104-32 check process(2)[3197]:<br>

> >> > [394-1] 2023-04-17 15:27:30: pid 3197: DEBUG:  SSL is requested but SSL<br>

> >> > support is not available<br>

> >> ><br>

> >> > ------------------<br>

> >> ><br>

> >> > After this, it is in kind of a loop of the 'clearing alarm' + 'SSL<br>

> >> support<br>

> >> > is not available'<br>

> >> ><br>

> >> > The relevant (In my current understanding) watchdog settings are:<br>

> >> ><br>

> >> ----------------------------------------------------------------------------------------<br>

> >> > wd_hostname = '10.108.104.31'<br>

> >> > wd_lifecheck_method = 'heartbeat'<br>

> >> > wd_interval = 10<br>

> >> > wd_heartbeat_keepalive = 2<br>

> >> > wd_heartbeat_deadtime = 30<br>

> >> > heartbeat_destination0 = '10.108.104.32'<br>

> >> > heartbeat_device0 = ''<br>

> >> > heartbeat_destination1 = '10.108.104.33'<br>

> >> > heartbeat_device1 = ''<br>

> >> > wd_monitoring_interfaces_list = 'any'<br>

> >><br>

> >> Above logs are DEBUG messages and I don't think they caused this issue.<br>

> >> Do these DEBUG messages only appear when you restart the network?<br>

> >><br>

> >> If you are using watchdog, you also need to configure the following<br>

> >> parameters:<br>

> >><br>

> >> heartbeat_destination_port0<br>

> >> heartbeat_destination_port1<br>

> >> other_pgpool_hostname0<br>

> >> other_pgpool_port0<br>

> >> other_pgpool_hostname1<br>

> >> other_pgpool_port1<br>

> >><br>

> >> Regards,<br>

> >><br>

> >> --<br>

> >> Bo Peng <<a href="mailto:pengbo@sraoss.co.jp" target="_blank">pengbo@sraoss.co.jp</a>><br>

> >> SRA OSS LLC<br>

> >> <a href="https://www.sraoss.co.jp/" rel="noreferrer" target="_blank">https://www.sraoss.co.jp/</a><br>

> >><br>

> ><br>

<br>

<br>

-- <br>

Bo Peng <<a href="mailto:pengbo@sraoss.co.jp" target="_blank">pengbo@sraoss.co.jp</a>><br>

SRA OSS LLC<br>

<a href="https://www.sraoss.co.jp/" rel="noreferrer" target="_blank">https://www.sraoss.co.jp/</a><br>

</blockquote></div>
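<div><br></div>
<div>For reference, the cron-based re-attach idea mentioned above could look roughly like the sketch below. It is only an illustration under several assumptions: Pgpool-II itself is running and PCP is responsive (it does not help while the commands are frozen), the PCP user name <code>pgpool</code> and a <code>~/.pcppass</code> entry for the <code>-w</code> option are hypothetical, and the output of <code>pcp_node_info</code> is assumed to be one whitespace-separated line with the numeric status in the third field (3 meaning down). The function names are made up for this example. It does not check replication state before attaching, so a standby that is not actually streaming would be re-attached as well.</div>

```python
import subprocess

# Numeric backend status assumed to be the third field of pcp_node_info output;
# 3 is taken to mean "node is down/detached" from Pgpool-II's point of view.
STATUS_DOWN = 3


def parse_status(node_info_line):
    """Return the numeric status from one line of pcp_node_info output.

    Assumes the layout 'hostname port status weight ...'.
    """
    fields = node_info_line.split()
    return int(fields[2])


def nodes_to_attach(node_info_lines):
    """Given {node_id: pcp_node_info line}, return the ids reported as down."""
    return [
        node_id
        for node_id, line in sorted(node_info_lines.items())
        if parse_status(line) == STATUS_DOWN
    ]


def pcp_command(tool, node_id, pcp_host="localhost", pcp_port=9898, pcp_user="pgpool"):
    """Build the argument list for a PCP tool; user and host are assumptions."""
    return [
        tool,
        "-h", pcp_host,
        "-p", str(pcp_port),
        "-U", pcp_user,
        "-w",  # never prompt for a password; relies on ~/.pcppass
        "-n", str(node_id),
    ]


def reattach_down_nodes(node_ids=(0, 1, 2), timeout=10):
    """Query each backend and re-attach those reported as down.

    The timeout keeps a cron run from hanging if PCP stops responding.
    """
    info = {}
    for node_id in node_ids:
        result = subprocess.run(
            pcp_command("pcp_node_info", node_id),
            capture_output=True, text=True, timeout=timeout, check=True,
        )
        info[node_id] = result.stdout.strip()
    for node_id in nodes_to_attach(info):
        subprocess.run(pcp_command("pcp_attach_node", node_id), timeout=timeout, check=True)
```

<div>A cron entry could then call <code>reattach_down_nodes()</code> every minute or so. The per-command timeout is there because of the frozen-PCP symptom described in this thread: a hung call raises an error instead of piling up stuck cron jobs.</div>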