<div dir="auto"><div>Yeah, you are right. <div dir="auto">One more question: the pg_enc documentation mentions that the encryption key is read by default from /root/.pgpoolkey (with permissions 0600). But pgpool is started by pgpool.service as the user postgres. </div><div dir="auto"><br></div><div dir="auto">TBH, I don't understand how this case is handled.</div><div dir="auto"><br></div>BR</div><div dir="auto">Igor Yurchenko<br><br><div class="gmail_quote" dir="auto"><div dir="ltr" class="gmail_attr">On Tue, 16 Jul 2024, 3:27 Tatsuo Ishii, <<a href="mailto:ishii@postgresql.org">ishii@postgresql.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">> Hi Tatsuo<br>
> <br>
> Yeah, it was the reason. Great thanks!!!<br>
> Removing the /var/log/pgpool_status solves the problem.<br>
<br>
Glad to hear that!<br>
<br>
> The only remark is that, as I understand it (at least on pgpool 4.2),<br>
> the pcp port/socket becomes available only after successful backend<br>
> initialization. So, in my case, without an initialized primary backend,<br>
> no pcp commands worked.<br>
<br>
More precisely, pcp commands become available once the search for the<br>
primary node has finished. In your case, until that operation completed<br>
or search_primary_node_timeout passed, pcp commands were not available<br>
because there was no primary node at that moment. Since the default<br>
value of search_primary_node_timeout is 5 minutes, you had to wait<br>
5 minutes before pcp commands became available. You can make<br>
search_primary_node_timeout shorter if you like.<br>
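For example, a pgpool.conf sketch (the 30-second value is purely illustrative):<br>

```
# pgpool.conf -- give up the primary search sooner, so pcp commands
# become available faster when no primary can be found
# (value is in seconds; the default is 300, i.e. 5 minutes)
search_primary_node_timeout = 30
```

<br>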
<br>
Best regards,<br>
--<br>
Tatsuo Ishii<br>
SRA OSS LLC<br>
English: <a href="http://www.sraoss.co.jp/index_en/" rel="noreferrer noreferrer" target="_blank">http://www.sraoss.co.jp/index_en/</a><br>
Japanese:<a href="http://www.sraoss.co.jp" rel="noreferrer noreferrer" target="_blank">http://www.sraoss.co.jp</a><br>
<br>
> BR<br>
> Igor Yurchenko<br>
> <br>
> <br>
> On Mon, 15 Jul 2024 at 06:51, Tatsuo Ishii <<a href="mailto:ishii@postgresql.org" target="_blank" rel="noreferrer">ishii@postgresql.org</a>> wrote:<br>
> <br>
>> Hi,<br>
>><br>
>> My guess is, pgpool_status (should be placed under logdir) file is set<br>
>> to:<br>
>><br>
>> down<br>
>> up<br>
>><br>
>> i.e. pgpool thinks that the PostgreSQL primary node is detached from<br>
>> pgpool. If you are sure that the PostgreSQL primary is healthy, you<br>
>> can attach the primary node by using pcp_attach_node. An alternative<br>
>> method is: stop pgpool, remove the pgpool_status file, and restart<br>
>> pgpool. pgpool will then automatically create the pgpool_status file<br>
>> and mark the primary up if it is up and running.<br>
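>> The two options above can be sketched as shell commands. This is only a sketch: it assumes node id 0 is the primary, the pcp port 9898 and logdir='/var/log' from this setup's pgpool.conf, and a systemd unit named pgpool.service.<br>

```shell
# Option 1: re-attach the healthy primary without restarting pgpool
pcp_attach_node -h localhost -p 9898 -U postgres -n 0

# Option 2: let pgpool rebuild its status file from scratch
systemctl stop pgpool.service
rm /var/log/pgpool_status   # pgpool_status lives under logdir ('/var/log' here)
systemctl start pgpool.service
```

<br>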
>><br>
>> Best regards,<br>
>> --<br>
>> Tatsuo Ishii<br>
>> SRA OSS LLC<br>
>> English: <a href="http://www.sraoss.co.jp/index_en/" rel="noreferrer noreferrer" target="_blank">http://www.sraoss.co.jp/index_en/</a><br>
>> Japanese:<a href="http://www.sraoss.co.jp" rel="noreferrer noreferrer" target="_blank">http://www.sraoss.co.jp</a><br>
>><br>
>><br>
>> > Hi guys<br>
>> > I have given up and it looks like I need your help.<br>
>> ><br>
>> > A setup:<br>
>> > I have a 2-host PostgreSQL 15 streaming replication cluster (IPs:<br>
>> > 10.65.188.55 and 10.65.188.56 respectively), with two Pgpool-II 4.2.15<br>
>> > instances installed on the same hosts.<br>
>> > The pgpool instances manage/delegate a VIP, 10.65.188.59.<br>
>> ><br>
>> > Replication between the pg15 instances works well, and the instances'<br>
>> > recovery status is detected reliably:<br>
>> ><br>
>> > [root@pg-mgrdb1 pgpool-II]# PGPASSWORD=fabrix /usr/pgsql-15/bin/psql -U<br>
>> > fabrix -h 10.65.188.55 -p 5432 template1 -c 'select pg_is_in_recovery()'<br>
>> > pg_is_in_recovery<br>
>> > -------------------<br>
>> > f<br>
>> > (1 row)<br>
>> ><br>
>> > [root@pg-mgrdb1 pgpool-II]# PGPASSWORD=fabrix /usr/pgsql-15/bin/psql -U<br>
>> > fabrix -h 10.65.188.56 -p 5432 template1 -c 'select pg_is_in_recovery()'<br>
>> > pg_is_in_recovery<br>
>> > -------------------<br>
>> > t<br>
>> > (1 row)<br>
>> ><br>
>> > [root@pg-mgrdb1 pgpool-II]# PGPASSWORD=fabrix /usr/pgsql-15/bin/psql -U<br>
>> > fabrix -h 10.65.188.55 -p 5432 template1 -c 'select * from<br>
>> > pg_stat_replication'<br>
>> > -[ RECORD 1 ]----+------------------------------<br>
>> > pid              | 13875<br>
>> > usesysid         | 16384<br>
>> > usename          | fabrix<br>
>> > application_name | walreceiver<br>
>> > client_addr      | 10.65.188.56<br>
>> > client_hostname  |<br>
>> > client_port      | 47636<br>
>> > backend_start    | 2024-07-14 14:30:54.344682+03<br>
>> > backend_xmin     | 743<br>
>> > state            | streaming<br>
>> > sent_lsn         | 0/B0001C0<br>
>> > write_lsn        | 0/B0001C0<br>
>> > flush_lsn        | 0/B0001C0<br>
>> > replay_lsn       | 0/B0001C0<br>
>> > write_lag        |<br>
>> > flush_lag        |<br>
>> > replay_lag       |<br>
>> > sync_priority    | 0<br>
>> > sync_state       | async<br>
>> > reply_time       | 2024-07-14 14:48:41.550424+03<br>
>> ><br>
>> > [root@pg-mgrdb1 pgpool-II]#<br>
>> ><br>
>> > The databases of the pg15 cluster are reachable/available via VIP:<br>
>> ><br>
>> > [root@pg-mgrdb1 pgpool-II]#<br>
>> > [root@pg-mgrdb1 pgpool-II]# PGPASSWORD=fabrix /usr/pgsql-15/bin/psql -U<br>
>> > fabrix -h 10.65.188.59 -p 9999 manager -c '\dt'<br>
>> > List of relations<br>
>> > Schema | Name | Type | Owner<br>
>> > --------+-----------------------+-------+----------<br>
>> > public | upgrade_version_table | table | postgres<br>
>> > (1 row)<br>
>> ><br>
>> > [root@pg-mgrdb1 pgpool-II]# PGPASSWORD=fabrix /usr/pgsql-15/bin/psql -U<br>
>> > fabrix -h 10.65.188.59 -p 9999 manager -c 'select * from<br>
>> > upgrade_version_table'<br>
>> > app_name | cur_version | upgrade_to<br>
>> > ----------------+-------------+------------<br>
>> > brODtv Manager | 10.2.0.300 |<br>
>> > (1 row)<br>
>> ><br>
>> > [root@pg-mgrdb1 pgpool-II]#<br>
>> ><br>
>> > But, the pgpool cannot detect the cluster primary node, and provides<br>
>> > incorrect pool_nodes info:<br>
>> ><br>
>> > [root@pg-mgrdb1 pgpool-II]# PGPASSWORD=fabrix /usr/pgsql-15/bin/psql -U<br>
>> > fabrix -h 10.65.188.59 -p 9999 manager -c 'show pool_nodes'<br>
>> > -[ RECORD 1 ]----------+--------------------<br>
>> > node_id                | 0<br>
>> > hostname               | 10.65.188.55<br>
>> > port                   | 5432<br>
>> > status                 | down<br>
>> > lb_weight              | 0.500000<br>
>> > role                   | standby<br>
>> > select_cnt             | 0<br>
>> > load_balance_node      | false<br>
>> > replication_delay      | 0<br>
>> > replication_state      |<br>
>> > replication_sync_state |<br>
>> > last_status_change     | 2024-07-14 15:04:01<br>
>> > -[ RECORD 2 ]----------+--------------------<br>
>> > node_id                | 1<br>
>> > hostname               | 10.65.188.56<br>
>> > port                   | 5432<br>
>> > status                 | up<br>
>> > lb_weight              | 0.500000<br>
>> > role                   | standby<br>
>> > select_cnt             | 3<br>
>> > load_balance_node      | true<br>
>> > replication_delay      | 0<br>
>> > replication_state      |<br>
>> > replication_sync_state |<br>
>> > last_status_change     | 2024-07-14 15:04:01<br>
>> ><br>
>> > [root@pg-mgrdb1 pgpool-II]#<br>
>> ><br>
>> > Right now, I'm testing it with a "single pgpool up" configuration<br>
>> > (pgpool on the second host is down). The pgpool log shows that pgpool<br>
>> > doesn't even try to touch the first (current primary) backend at<br>
>> > <a href="http://10.65.188.55:5432" rel="noreferrer noreferrer" target="_blank">10.65.188.55:5432</a>.<br>
>> > It successfully connects to the second 'standby' node and detects its<br>
>> > status, and this second node handles all SELECT queries successfully<br>
>> > thanks to the configured load balancing. But what's wrong with the<br>
>> > first (primary) node?<br>
>> ><br>
>> > In the pgpool log I don't see any attempt to connect to the first node.<br>
>> > The part of the log where "find_primary_node" is first mentioned looks<br>
>> > like this:<br>
>> ><br>
>> > 2024-07-14 15:23:38: pid 18723: LOG: 2 watchdog nodes are configured for lifecheck<br>
>> > 2024-07-14 15:23:38: pid 18723: LOCATION: wd_lifecheck.c:495<br>
>> > 2024-07-14 15:23:38: pid 18723: LOG: watchdog nodes ID:0 Name:"<a href="http://10.65.188.55:9999" rel="noreferrer noreferrer" target="_blank">10.65.188.55:9999</a> Linux pg-mgrdb1"<br>
>> > 2024-07-14 15:23:38: pid 18723: DETAIL: Host:"10.65.188.55" WD Port:9000 pgpool-II port:9999<br>
>> > 2024-07-14 15:23:38: pid 18723: LOCATION: wd_lifecheck.c:503<br>
>> > 2024-07-14 15:23:38: pid 18723: LOG: watchdog nodes ID:1 Name:"Not_Set"<br>
>> > 2024-07-14 15:23:38: pid 18723: DETAIL: Host:"10.65.188.56" WD Port:9000 pgpool-II port:9999<br>
>> > 2024-07-14 15:23:38: pid 18723: LOCATION: wd_lifecheck.c:503<br>
>> > 2024-07-14 15:23:38: pid 18706: LOG: find_primary_node_repeatedly: waiting for finding a primary node<br>
>> > 2024-07-14 15:23:38: pid 18706: LOCATION: pgpool_main.c:3404<br>
>> > 2024-07-14 15:23:38: pid 18706: LOG: find_primary_node: standby node is 1<br>
>> > 2024-07-14 15:23:38: pid 18706: LOCATION: pgpool_main.c:3329<br>
>> > 2024-07-14 15:23:39: pid 18729: LOG: set SO_REUSEPORT option to the socket<br>
>> > 2024-07-14 15:23:39: pid 18729: LOCATION: wd_heartbeat.c:691<br>
>> > 2024-07-14 15:23:39: pid 18729: LOG: creating watchdog heartbeat receive socket.<br>
>> > 2024-07-14 15:23:39: pid 18729: DETAIL: set SO_REUSEPORT<br>
>> > 2024-07-14 15:23:39: pid 18729: LOCATION: wd_heartbeat.c:231<br>
>> > 2024-07-14 15:23:39: pid 18731: LOG: set SO_REUSEPORT option to the socket<br>
>> > 2024-07-14 15:23:39: pid 18731: LOCATION: wd_heartbeat.c:691<br>
>> > 2024-07-14 15:23:39: pid 18731: LOG: creating socket for sending heartbeat<br>
>> > 2024-07-14 15:23:39: pid 18731: DETAIL: set SO_REUSEPORT<br>
>> > 2024-07-14 15:23:39: pid 18731: LOCATION: wd_heartbeat.c:148<br>
>> > 2024-07-14 15:23:39: pid 18706: LOG: find_primary_node: standby node is 1<br>
>> > 2024-07-14 15:23:39: pid 18706: LOCATION: pgpool_main.c:3329<br>
>> > 2024-07-14 15:23:40: pid 18706: LOG: find_primary_node: standby node is 1<br>
>> > 2024-07-14 15:23:40: pid 18706: LOCATION: pgpool_main.c:3329<br>
>> > 2024-07-14 15:23:41: pid 18706: LOG: find_primary_node: standby node is 1<br>
>> > 2024-07-14 15:23:41: pid 18706: LOCATION: pgpool_main.c:3329<br>
>> ><br>
>> > So I'm out of ideas.<br>
>> ><br>
>> > The pgpool.conf, stripped of comments and sorted, looks like this:<br>
>> ><br>
>> *********************************************************************************************<br>
>> > allow_clear_text_frontend_auth=off<br>
>> > allow_multiple_failover_requests_from_node=off<br>
>> > allow_sql_comments=off<br>
>> > app_name_redirect_preference_list=''<br>
>> > arping_cmd='/usr/bin/sudo /usr/sbin/arping -U $_IP_$ -w 1 -I eth0'<br>
>> > arping_path='/usr/sbin'<br>
>> > authentication_timeout=1min<br>
>> > auto_failback_interval=1min<br>
>> > auto_failback=off<br>
>> > backend_application_name0=mgrdb1<br>
>> > backend_application_name1=mgrdb2<br>
>> > backend_clustering_mode='streaming_replication'<br>
>> > backend_data_directory0='/home/postgres/databases/fabrix2'<br>
>> > backend_data_directory1='/home/postgres/databases/fabrix2'<br>
>> > backend_flag0='ALLOW_TO_FAILOVER'<br>
>> > backend_flag1='ALLOW_TO_FAILOVER'<br>
>> > backend_hostname0='10.65.188.55'<br>
>> > backend_hostname1='10.65.188.56'<br>
>> > backend_port0=5432<br>
>> > backend_port1=5432<br>
>> > backend_weight0=1<br>
>> > backend_weight1=1<br>
>> > black_function_list='currval,lastval,nextval,setval'<br>
>> > cache_safe_memqcache_table_list=''<br>
>> > cache_unsafe_memqcache_table_list=''<br>
>> > check_temp_table=catalog<br>
>> > check_unlogged_table=on<br>
>> > child_life_time=1<br>
>> > child_max_connections=0<br>
>> > clear_memqcache_on_escalation=on<br>
>> > client_idle_limit=0<br>
>> > client_idle_limit_in_recovery=0<br>
>> > client_min_messages = debug1<br>
>> > connection_cache=on<br>
>> > connection_life_time=0<br>
>> > connect_timeout=10000<br>
>> > database_redirect_preference_list=''<br>
>> > delay_threshold=10000000<br>
>> > delegate_IP='10.65.188.59'<br>
>> > detach_false_primary=off<br>
>> > disable_load_balance_on_write='transaction'<br>
>> > dml_adaptive_object_relationship_list=''<br>
>> > enable_consensus_with_half_votes=on<br>
>> > enable_pool_hba=on<br>
>> > enable_shared_relcache=on<br>
>> > failback_command=''<br>
>> > failover_command='/etc/pgpool-II/recovery/failover.sh %d %P %H %R'<br>
>> > failover_if_affected_tuples_mismatch=off<br>
>> > failover_on_backend_error=on<br>
>> > failover_require_consensus=on<br>
>> > failover_when_quorum_exists=off<br>
>> > follow_master_command=''<br>
>> > follow_primary_command=''<br>
>> > health_check_database=''<br>
>> > health_check_max_retries=0<br>
>> > health_check_password='***************'<br>
>> > health_check_period=1<br>
>> > health_check_retry_delay=1<br>
>> > health_check_timeout=0<br>
>> > health_check_user='postgres'<br>
>> > heartbeat_device0='eth0'<br>
>> > heartbeat_device1='eth0'<br>
>> > heartbeat_hostname0='10.65.188.55'<br>
>> > heartbeat_hostname1='10.65.188.56'<br>
>> > heartbeat_port0=9694<br>
>> > heartbeat_port1=9694<br>
>> > hostname0='10.65.188.55'<br>
>> > hostname1='10.65.188.56'<br>
>> > if_cmd_path='/sbin'<br>
>> > if_down_cmd='/usr/bin/sudo /sbin/ip addr del $_IP_$/18 dev eth0'<br>
>> > if_up_cmd='/usr/bin/sudo /sbin/ip addr add $_IP_$/18 dev eth0 label eth0:pg'<br>
>> > ignore_leading_white_space=on<br>
>> > insert_lock=off<br>
>> > listen_addresses='*'<br>
>> > listen_backlog_multiplier=1<br>
>> > load_balance_mode=off<br>
>> > lobj_lock_table=''<br>
>> > log_client_messages=off<br>
>> > log_connections=off<br>
>> > log_destination='stderr'<br>
>> > log_directory='/var/log/pgpool_log'<br>
>> > logdir='/var/log'<br>
>> > log_disconnections=off<br>
>> > log_error_verbosity = verbose<br>
>> > log_filename='pgpool-%a.log'<br>
>> > logging_collector=on<br>
>> > log_hostname=off<br>
>> > log_line_prefix='%t: pid %p: '  # printf-style string to output at the beginning of each log line<br>
>> > log_min_messages = debug1<br>
>> > log_per_node_statement=off<br>
>> > log_rotation_age=1d<br>
>> > log_rotation_size=0<br>
>> > log_standby_delay='if_over_threshold'<br>
>> > log_statement=off<br>
>> > log_truncate_on_rotation=on<br>
>> > master_slave_mode=on<br>
>> > master_slave_sub_mode='stream'<br>
>> > max_pool=1<br>
>> > memory_cache_enabled=off<br>
>> > memqcache_auto_cache_invalidation=on<br>
>> > memqcache_cache_block_size=1MB<br>
>> > memqcache_expire=0<br>
>> > memqcache_maxcache=400kB<br>
>> > memqcache_max_num_cache=1000000<br>
>> > memqcache_memcached_host='localhost'<br>
>> > memqcache_memcached_port=11211<br>
>> > memqcache_method='shmem'<br>
>> > memqcache_oiddir='/var/log/pgpool/oiddir'<br>
>> > memqcache_total_size=64MB<br>
>> > num_init_children=160<br>
>> > pcp_listen_addresses='*'<br>
>> > pcp_port=9898<br>
>> > pcp_socket_dir='/var/run/postgresql'<br>
>> > pgpool_port0=9999<br>
>> > pgpool_port1=9999<br>
>> > pid_file_name='/var/run/pgpool.pid'<br>
>> > ping_path='/bin'<br>
>> > pool_passwd='pool_passwd'<br>
>> > port=9999<br>
>> > primary_routing_query_pattern_list=''<br>
>> > read_only_function_list=''<br>
>> > recovery_1st_stage_command='recovery_1st_stage'<br>
>> > recovery_2nd_stage_command=''<br>
>> > recovery_password='**************'<br>
>> > recovery_timeout=90<br>
>> > recovery_user='postgres'<br>
>> > relcache_expire=0<br>
>> > relcache_query_target=primary<br>
>> > relcache_size=256<br>
>> > replicate_select=off<br>
>> > replication_mode=off<br>
>> > replication_stop_on_mismatch=off<br>
>> > reserved_connections=0<br>
>> > reset_query_list='ABORT; DISCARD ALL'<br>
>> > search_primary_node_timeout=0<br>
>> > serialize_accept=off<br>
>> > socket_dir='/var/run/postgresql'<br>
>> > sr_check_database='postgres'<br>
>> > sr_check_password='fabrix'<br>
>> > sr_check_period=1<br>
>> > sr_check_user='****************'<br>
>> > ssl_ciphers='HIGH:MEDIUM:+3DES:!aNULL'<br>
>> > ssl_dh_params_file=''<br>
>> > ssl_ecdh_curve='prime256v1'<br>
>> > ssl=off<br>
>> > ssl_prefer_server_ciphers=off<br>
>> > statement_level_load_balance=off<br>
>> > syslog_facility='LOCAL0'<br>
>> > syslog_ident='pgpool'<br>
>> > trusted_servers=''<br>
>> > use_watchdog=on<br>
>> > wd_authkey=''<br>
>> > wd_de_escalation_command=''<br>
>> > wd_escalation_command=''<br>
>> > wd_heartbeat_deadtime=30<br>
>> > wd_heartbeat_keepalive=2<br>
>> > wd_interval=10<br>
>> > wd_ipc_socket_dir='/var/run/postgresql'<br>
>> > wd_lifecheck_dbname='template1'<br>
>> > wd_lifecheck_method='heartbeat'<br>
>> > wd_lifecheck_password=''<br>
>> > wd_lifecheck_query='SELECT 1'<br>
>> > wd_lifecheck_user='nobody'<br>
>> > wd_life_point=3<br>
>> > wd_monitoring_interfaces_list=''<br>
>> > wd_port0=9000<br>
>> > wd_port1=9000<br>
>> > wd_priority=1<br>
>> > write_function_list=''<br>
>> ><br>
>> *********************************************************************************************<br>
>> ><br>
>> > pool_hba.conf:<br>
>> ><br>
>> *********************************************************************************************<br>
>> > [root@pg-mgrdb1 pgpool_log]# cat /etc/pgpool-II/pool_hba.conf<br>
>> > # TYPE DATABASE USER CIDR-ADDRESS METHOD<br>
>> > local all all trust<br>
>> > ## IPv4 local connections<br>
>> > host all all <a href="http://127.0.0.1/32" rel="noreferrer noreferrer" target="_blank">127.0.0.1/32</a> md5<br>
>> > host all postgres <a href="http://10.65.188.55/32" rel="noreferrer noreferrer" target="_blank">10.65.188.55/32</a> md5<br>
>> > host all postgres <a href="http://10.65.188.56/32" rel="noreferrer noreferrer" target="_blank">10.65.188.56/32</a> md5<br>
>> > host all postgres <a href="http://10.65.188.59/32" rel="noreferrer noreferrer" target="_blank">10.65.188.59/32</a> md5<br>
>> > host all fabrix <a href="http://10.65.188.55/32" rel="noreferrer noreferrer" target="_blank">10.65.188.55/32</a> md5<br>
>> > host all fabrix <a href="http://10.65.188.56/32" rel="noreferrer noreferrer" target="_blank">10.65.188.56/32</a> md5<br>
>> > host all fabrix <a href="http://10.65.188.59/32" rel="noreferrer noreferrer" target="_blank">10.65.188.59/32</a> md5<br>
>> > ## Replication Hosts<br>
>> > host manager fabrix <a href="http://10.65.188.55/32" rel="noreferrer noreferrer" target="_blank">10.65.188.55/32</a> md5<br>
>> > host manager fabrix <a href="http://10.65.188.56/32" rel="noreferrer noreferrer" target="_blank">10.65.188.56/32</a> md5<br>
>> > host manager fabrix <a href="http://10.65.188.59/32" rel="noreferrer noreferrer" target="_blank">10.65.188.59/32</a> md5<br>
>> > [root@pg-mgrdb1 pgpool_log]#<br>
>> ><br>
>> *********************************************************************************************<br>
>> ><br>
>> > pool_passwd:<br>
>> ><br>
>> *********************************************************************************************<br>
>> > [root@pg-mgrdb1 pgpool_log]# cat /etc/pgpool-II/pool_passwd<br>
>> > postgres:AESdLGx72XCAdQUybSno/SFuw==<br>
>> > fabrix:AES9Zd8L9XYGiZ5RMycfSFBFw==<br>
>> > [root@pg-mgrdb1 pgpool_log]#<br>
>> ><br>
>> *********************************************************************************************<br>
>> ><br>
>> > I'd highly appreciate any ideas or hints for further investigations.<br>
>> ><br>
>> > BR<br>
>> > Igor Yurchenko<br>
>><br>
</blockquote></div></div></div>