[pgpool-general: 8875] Re: pgpool sub-processes get terminated by segment fault

Zhaoxun Yan yan.zhaoxun at gmail.com
Fri Jul 7 16:03:46 JST 2023


Hi Bo!

I changed the configuration to process_management_mode = static and
commented out everything related to dynamic mode, but the malfunction persists.
Here is the log in debug mode:
2023-07-07 14:58:48.964: main pid 20671: DEBUG:  BackendDesc: 113672 bytes requested for shared memory
2023-07-07 14:58:48.964: main pid 20671: DEBUG:  pool_coninfo_size: num_init_children (32) * max_pool (4) * MAX_NUM_BACKENDS (128) * sizeof(ConnectionInfo) (160) = 2621440 bytes requested for shared memory
2023-07-07 14:58:48.964: main pid 20671: DEBUG:  ProcessInfo: num_init_children (32) * sizeof(ProcessInfo) (48) = 1536 bytes requested for shared memory
2023-07-07 14:58:48.964: main pid 20671: DEBUG:  UserSignalSlot: 24 bytes requested for shared memory
2023-07-07 14:58:48.964: main pid 20671: DEBUG:  POOL_REQUEST_INFO: 5272 bytes requested for shared memory
2023-07-07 14:58:48.964: main pid 20671: DEBUG:  stat_shared_memory_size: 9216 bytes requested for shared memory
2023-07-07 14:58:48.964: main pid 20671: LOG:  health_check_stats_shared_memory_size: requested size: 12288
2023-07-07 14:58:48.964: main pid 20671: DEBUG:  SI_ManageInfo: 24 bytes requested for shared memory
2023-07-07 14:58:48.964: main pid 20671: LOG:  memory cache initialized
2023-07-07 14:58:48.964: main pid 20671: DETAIL:  memcache blocks :64
2023-07-07 14:58:48.964: main pid 20671: DEBUG:  shared_memory_cache_size: 67108864
2023-07-07 14:58:48.964: main pid 20671: DEBUG:  shared_memory_fsmm_size: 64
2023-07-07 14:58:48.964: main pid 20671: DEBUG:  pool_hash_size: 67108880
2023-07-07 14:58:48.964: main pid 20671: DEBUG:  POOL_QUERY_CACHE_STATS: 24 bytes requested for shared memory
2023-07-07 14:58:48.964: main pid 20671: LOG:  allocating (136981824) bytes of shared memory segment
2023-07-07 14:58:48.964: main pid 20671: LOG:  allocating shared memory segment of size: 136981824
2023-07-07 14:58:49.041: main pid 20671: DEBUG:  pool_coninfo_size: num_init_children (32) * max_pool (4) * MAX_NUM_BACKENDS (128) * sizeof(ConnectionInfo) (160) = 2621440 bytes requested for shared memory
2023-07-07 14:58:49.041: main pid 20671: LOG:  health_check_stats_shared_memory_size: requested size: 12288
2023-07-07 14:58:49.041: main pid 20671: LOG:  health_check_stats_shared_memory_size: requested size: 12288
2023-07-07 14:58:49.041: main pid 20671: LOG:  memory cache initialized
2023-07-07 14:58:49.041: main pid 20671: DETAIL:  memcache blocks :64
2023-07-07 14:58:49.041: main pid 20671: DEBUG:  shared_memory_cache_size: 67108864
2023-07-07 14:58:49.041: main pid 20671: DEBUG:  memory cache request size : 67108864
2023-07-07 14:58:49.041: main pid 20671: DEBUG:  shared_memory_fsmm_size: 64
2023-07-07 14:58:49.044: main pid 20671: LOG:  pool_discard_oid_maps: discarded memqcache oid maps
2023-07-07 14:58:49.058: main pid 20671: LOG:  unix_socket_directories[0]: /run/.s.PGSQL.9999
2023-07-07 14:58:49.059: main pid 20671: LOG:  listen address[0]: *
2023-07-07 14:58:49.059: main pid 20671: LOG:  Setting up socket for 0.0.0.0:9999
2023-07-07 14:58:49.059: main pid 20671: LOG:  Setting up socket for :::9999
2023-07-07 14:58:49.061: child pid 20680: DEBUG:  initializing backend status
2023-07-07 14:58:49.061: child pid 20679: DEBUG:  initializing backend status
2023-07-07 14:58:49.061: child pid 20678: DEBUG:  initializing backend status
2023-07-07 14:58:49.061: child pid 20681: DEBUG:  initializing backend status
2023-07-07 14:58:49.062: child pid 20677: DEBUG:  initializing backend status
2023-07-07 14:58:49.062: child pid 20682: DEBUG:  initializing backend status
2023-07-07 14:58:49.062: child pid 20683: DEBUG:  initializing backend status
2023-07-07 14:58:49.062: child pid 20676: DEBUG:  initializing backend status
2023-07-07 14:58:49.064: child pid 20684: DEBUG:  initializing backend status
2023-07-07 14:58:49.064: child pid 20685: DEBUG:  initializing backend status
2023-07-07 14:58:49.064: child pid 20686: DEBUG:  initializing backend status
2023-07-07 14:58:49.064: child pid 20675: DEBUG:  initializing backend status
2023-07-07 14:58:49.067: child pid 20697: DEBUG:  initializing backend status
2023-07-07 14:58:49.067: child pid 20698: DEBUG:  initializing backend status
2023-07-07 14:58:49.067: child pid 20699: DEBUG:  initializing backend status
2023-07-07 14:58:49.067: child pid 20700: DEBUG:  initializing backend status
2023-07-07 14:58:49.068: child pid 20674: DEBUG:  initializing backend status
2023-07-07 14:58:49.069: main pid 20671: DEBUG:  find_primary_node_repeatedly: not in streaming replication mode
2023-07-07 14:58:49.069: main pid 20671: LOG:  listen address[0]: localhost
2023-07-07 14:58:49.069: main pid 20671: LOG:  Setting up socket for ::1:9898
2023-07-07 14:58:49.069: main pid 20671: LOG:  Setting up socket for 127.0.0.1:9898
2023-07-07 14:58:49.069: child pid 20701: DEBUG:  initializing backend status
2023-07-07 14:58:49.070: child pid 20705: DEBUG:  initializing backend status
2023-07-07 14:58:49.070: child pid 20704: DEBUG:  initializing backend status
2023-07-07 14:58:49.070: sr_check_worker pid 20707: LOG:  process started
2023-07-07 14:58:49.070: pcp_main pid 20706: DEBUG:  I am PCP child with pid:20706
2023-07-07 14:58:49.070: sr_check_worker pid 20707: DEBUG:  I am 20707
2023-07-07 14:58:49.070: main pid 20671: LOG:  pgpool-II successfully started. version 4.4.3 (nurikoboshi)
2023-07-07 14:58:49.070: child pid 20703: DEBUG:  initializing backend status
2023-07-07 14:58:49.070: pcp_main pid 20706: LOG:  PCP process: 20706 started
2023-07-07 14:58:49.071: child pid 20702: DEBUG:  initializing backend status
2023-07-07 14:59:12.379: child pid 20684: DEBUG:  reading startup packet
2023-07-07 14:59:12.379: child pid 20684: DETAIL:  Protocol Major: 1234 Minor: 5679 database:  user:
2023-07-07 14:59:12.379: child pid 20684: DEBUG:  forwarding error message to frontend
2023-07-07 14:59:12.379: child pid 20684: FATAL:  pgpool is not accepting any new connections
2023-07-07 14:59:12.379: child pid 20684: DETAIL:  all backend nodes are down, pgpool requires at least one valid node
2023-07-07 14:59:12.379: child pid 20684: HINT:  repair the backend nodes and restart pgpool
2023-07-07 14:59:12.380: main pid 20671: LOG:  reaper handler
2023-07-07 14:59:12.380: main pid 20671: DEBUG:  child process with pid: 20684 exits with status 256
2023-07-07 14:59:12.380: main pid 20671: DEBUG:  fork a new child process with pid: 20711
2023-07-07 14:59:12.380: main pid 20671: LOG:  reaper handler: exiting normally
2023-07-07 14:59:12.380: child pid 20711: DEBUG:  initializing backend status

And attached is the new configuration.
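For reference, the relevant process management settings now look roughly like this (an illustrative sketch, not the actual attached file; only num_init_children = 32 and max_pool = 4 are confirmed by the log above, the other values are placeholders):

```ini
# Switched to static mode; the dynamic-mode settings are commented out.
process_management_mode = static

# Dynamic-mode settings (pgpool 4.4). Per Bo's note below, if dynamic
# mode is used, max_spare_children must not be greater than
# num_init_children, or the reported segmentation fault is triggered.
#process_management_strategy = gentle
#min_spare_children = 5
#max_spare_children = 10

num_init_children = 32
max_pool = 4
```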

On Wed, Jul 5, 2023 at 3:58 PM Bo Peng <pengbo at sraoss.co.jp> wrote:

> Hi,
>
> Thank you for sharing the configuration file.
>
> You are using "dynamic process management mode".
> It seems that if max_spare_children is greater than num_init_children,
> a segmentation fault occurs.
>
> I think it is a bug in pgpool.
> I will share your report with the developer who is in charge of dynamic
> process management mode.
>
> On Wed, 5 Jul 2023 13:39:51 +0800
> Zhaoxun Yan <yan.zhaoxun at gmail.com> wrote:
>
> > On Wed, Jul 5, 2023 at 1:00 PM Bo Peng <pengbo at sraoss.co.jp> wrote:
> >
> > > Hi,
> > >
> > > I tested 4.4.3 and it should not happen normally.
> > > Could you share your pgpool.conf?
> > >
> > > On Wed, 5 Jul 2023 12:03:29 +0800
> > > Zhaoxun Yan <yan.zhaoxun at gmail.com> wrote:
> > >
> > > > Hi guys!
> > > >
> > > > Is it a bug? I started a local postgres as backend_hostname0 = '172.17.0.2',
> > > > and it is reachable:
> > > > # psql -h 172.17.0.2 -p 5432 -U checker template1
> > > > Password for user checker:
> > > > psql (13.10)
> > > > Type "help" for help.
> > > >
> > > > template1=> \q
> > > >
> > > > Although pgpool is listening on 9999:
> > > > # ss -tlnp
> > > > State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port  Process
> > > > LISTEN  0  244  0.0.0.0:5432  0.0.0.0:*  users:(("postgres",pid=8680,fd=6))
> > > > LISTEN  0  64  127.0.0.1:9898  0.0.0.0:*  users:(("pgpool",pid=8780,fd=11),("pgpool",pid=8773,fd=11),("pgpool",pid=8720,fd=11))
> > > > LISTEN  0  64  0.0.0.0:9999  0.0.0.0:*  users:(("pgpool",pid=8780,fd=5),("pgpool",pid=8773,fd=5),("pgpool",pid=8754,fd=5),("pgpool",pid=8753,fd=5),("pgpool",pid=8752,fd=5),("pgpool",pid=8751,fd=5),("pgpool",pid=8750,fd=5),("pgpool",pid=8749,fd=5),("pgpool",pid=8748,fd=5),("pgpool",pid=8747,fd=5),("pgpool",pid=8746,fd=5),("pgpool",pid=8745,fd=5),("pgpool",pid=8744,fd=5),("pgpool",pid=8742,fd=5),("pgpool",pid=8741,fd=5),("pgpool",pid=8740,fd=5),("pgpool",pid=8739,fd=5),("pgpool",pid=8738,fd=5),("pgpool",pid=8737,fd=5),("pgpool",pid=8736,fd=5),("pgpool",pid=8735,fd=5),("pgpool",pid=8734,fd=5),("pgpool",pid=8733,fd=5),("pgpool",pid=8732,fd=5),("pgpool",pid=8731,fd=5),("pgpool",pid=8730,fd=5),("pgpool",pid=8729,fd=5),("pgpool",pid=8728,fd=5),("pgpool",pid=8727,fd=5),("pgpool",pid=8726,fd=5),("pgpool",pid=8724,fd=5),("pgpool",pid=8723,fd=5),("pgpool",pid=8720,fd=5))
> > > > LISTEN  0  128  0.0.0.0:22  0.0.0.0:*  users:(("sshd",pid=1395,fd=3))
> > > > LISTEN  0  244  [::]:5432  [::]:*  users:(("postgres",pid=8680,fd=7))
> > > > LISTEN  0  64  [::1]:9898  [::]:*  users:(("pgpool",pid=8780,fd=10),("pgpool",pid=8773,fd=10),("pgpool",pid=8720,fd=10))
> > > > LISTEN  0  64  [::]:9999  [::]:*  users:(("pgpool",pid=8780,fd=6),("pgpool",pid=8773,fd=6),("pgpool",pid=8754,fd=6),("pgpool",pid=8753,fd=6),("pgpool",pid=8752,fd=6),("pgpool",pid=8751,fd=6),("pgpool",pid=8750,fd=6),("pgpool",pid=8749,fd=6),("pgpool",pid=8748,fd=6),("pgpool",pid=8747,fd=6),("pgpool",pid=8746,fd=6),("pgpool",pid=8745,fd=6),("pgpool",pid=8744,fd=6),("pgpool",pid=8742,fd=6),("pgpool",pid=8741,fd=6),("pgpool",pid=8740,fd=6),("pgpool",pid=8739,fd=6),("pgpool",pid=8738,fd=6),("pgpool",pid=8737,fd=6),("pgpool",pid=8736,fd=6),("pgpool",pid=8735,fd=6),("pgpool",pid=8734,fd=6),("pgpool",pid=8733,fd=6),("pgpool",pid=8732,fd=6),("pgpool",pid=8731,fd=6),("pgpool",pid=8730,fd=6),("pgpool",pid=8729,fd=6),("pgpool",pid=8728,fd=6),("pgpool",pid=8727,fd=6),("pgpool",pid=8726,fd=6),("pgpool",pid=8724,fd=6),("pgpool",pid=8723,fd=6),("pgpool",pid=8720,fd=6))
> > > > LISTEN  0  128  [::]:22  [::]:*  users:(("sshd",pid=1395,fd=4))
> > > >
> > > > But it is not reachable:
> > > > # psql -h 127.0.0.1 -p 9999 -U checker template1
> > > > psql: error: server closed the connection unexpectedly
> > > >         This probably means the server terminated abnormally
> > > >         before or while processing the request.
> > > > Checking the log, every pgpool subprocess is killed by a segmentation
> > > > fault. I ran it again in debug mode, and the same thing happened.
> > > > Attached is the pgpool log. Thanks in advance.
> > >
> > >
> > > --
> > > Bo Peng <pengbo at sraoss.co.jp>
> > > SRA OSS LLC
> > > TEL: 03-5979-2701 FAX: 03-5979-2702
> > > URL: https://www.sraoss.co.jp/
> > >
>
>
> --
> Bo Peng <pengbo at sraoss.co.jp>
> SRA OSS LLC
> TEL: 03-5979-2701 FAX: 03-5979-2702
> URL: https://www.sraoss.co.jp/
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pgpool.conf
Type: application/octet-stream
Size: 52928 bytes
Desc: not available
URL: <http://www.pgpool.net/pipermail/pgpool-general/attachments/20230707/05c34d46/attachment.obj>

