[pgpool-general: 1062] pgpool-II, watchdog and segfaults
Greg Swallow
gswallow at exacttarget.com
Tue Oct 2 23:12:40 JST 2012
Hi,
I have installed pgpool-II on four Ubuntu Lucid systems, and when I enable the watchdog I get constant segfaults:
Oct 2 13:40:59 db1a kernel: [3671199.301094] pgpool[18352]: segfault at 0 ip 00007f80c71b5052 sp 00007fffe7779618 error 4 in libc-2.11.1.so[7f80c7132000+17a000]
Oct 2 13:42:39 db1a kernel: [3671299.305608] pgpool[18261]: segfault at 0 ip 00007f80c71b5052 sp 00007fffe77796a8 error 4 in libc-2.11.1.so[7f80c7132000+17a000]
Oct 2 13:44:19 db1a kernel: [3671399.310278] pgpool[18421]: segfault at 0 ip 00007f80c71b5052 sp 00007fffe7779618 error 4 in libc-2.11.1.so[7f80c7132000+17a000]
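A kernel line like the ones above can be decoded mechanically. The following is a minimal sketch in Python, not part of pgpool-II; the function name `decode_segfault` and the regular expression are assumptions made for illustration. It computes the offset of the faulting instruction inside libc (`ip` minus the mapping base) and splits the x86 page-fault error code into its three low bits.

```python
import re

# First segfault line from the kernel log, without the syslog prefix.
LINE = ("pgpool[18352]: segfault at 0 ip 00007f80c71b5052 sp 00007fffe7779618 "
        "error 4 in libc-2.11.1.so[7f80c7132000+17a000]")

# Hypothetical pattern for the x86-64 Linux "segfault at ..." message format.
PATTERN = re.compile(
    r"(?P<prog>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<lib>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

def decode_segfault(line):
    """Return the decoded fields of one kernel segfault line."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    err = int(m["err"])
    return {
        "pid": int(m["pid"]),
        "fault_addr": int(m["addr"], 16),          # address that was accessed
        "library": m["lib"],
        # Offset of the faulting instruction within the mapped library.
        "offset": int(m["ip"], 16) - int(m["base"], 16),
        "page_present": bool(err & 1),             # bit 0: protection fault
        "write": bool(err & 2),                    # bit 1: write access
        "user_mode": bool(err & 4),                # bit 2: fault in user mode
    }

info = decode_segfault(LINE)
print(hex(info["offset"]), info)
```

Read this way, "segfault at 0 ... error 4" is a user-mode read of address 0, which is consistent with a NULL pointer being passed to a libc function. All three logged faults share the same `ip`, so they share the same offset; with libc debug symbols installed, that offset could be resolved to a function name with a tool such as addr2line or gdb.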
I turned debug_level up to 255, and this is what I see around the time of a segfault:
Oct 2 13:57:32 db1a pgpool[19174]: I am 19174
Oct 2 13:57:32 db1a pgpool[19174]: pool_initialize_private_backend_status: initialize backend status
...
Oct 2 13:59:13 db1a pgpool[19174]: I am 19174 accept fd 8
Oct 2 13:59:13 db1a pgpool[19174]: Protocol Major: 1234 Minor: 5679 database: user:
Oct 2 13:59:13 db1a pgpool[19174]: SSLRequest from client
Oct 2 13:59:13 db1a pgpool[19174]: Protocol Major: 3 Minor: 0 database: template1 user: (null)
Oct 2 13:59:13 db1a pgpool[19154]: reap_handler called
Oct 2 13:59:13 db1a pgpool[19154]: reap_handler: call wait3
Oct 2 13:59:13 db1a pgpool[19154]: child 19174 exits with status 11 by signal 11
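Two details in the debug output above are easy to misread. The sketch below, which is illustrative Python and not pgpool-II code, shows that "Protocol Major: 1234 Minor: 5679" is simply the PostgreSQL SSLRequest code split into two 16-bit halves, and that signal 11 is SIGSEGV. The helper name `split_version` is hypothetical.

```python
import signal
import struct

# SSLRequest code defined by the PostgreSQL frontend/backend protocol.
SSL_REQUEST_CODE = 80877103

def split_version(code):
    """Split a 32-bit protocol code into (major, minor) 16-bit halves."""
    return code >> 16, code & 0xFFFF

# An SSLRequest packet is two big-endian 32-bit integers: length 8, then the code.
packet = struct.pack("!II", 8, SSL_REQUEST_CODE)
length, code = struct.unpack("!II", packet)

ssl_version = split_version(code)            # matches "Major: 1234 Minor: 5679"
v3_version = split_version(3 << 16)          # matches "Major: 3 Minor: 0"
sig_name = signal.Signals(11).name           # name of signal 11 on this platform

print(length, ssl_version, v3_version, sig_name)
```

So the first "Protocol" line is the client's SSL probe rather than a corrupted startup packet, and the child dies with a segmentation fault after the regular protocol 3.0 startup packet, which the log shows with user `(null)`.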
When I disable the watchdog, this behavior stops. If this is a watchdog bug, I can run keepalived for the virtual IP address instead, and I can help trace the problem with whatever steps you can guide me through. I searched the pgpool-II documentation, and it seems the child process *might* be trying to perform online recovery in a streaming replication scenario, even though I haven't configured a recovery user and password. I do not want pgpool-II to fail anything over automatically.
My config:
root@db1a:~# pcp_pool_status 60 localhost 9898 pgpool2 blah
name : listen_addresses
value: *
desc : host name(s) or IP address(es) to listen to
name : port
value: 5431
desc : pgpool accepting port number
name : socket_dir
value: /tmp
desc : pgpool socket directory
name : pcp_port
value: 9898
desc : PCP port # to bind
name : pcp_socket_dir
value: /var/run/pgpool
desc : PCP socket directory
name : enable_pool_hba
value: 1
desc : if true, use pool_hba.conf for client authentication
name : authentication_timeout
value: 20
desc : maximum time in seconds to complete client authentication
name : ssl
value: 0
desc : SSL support
name : ssl_key
value:
desc : path to the SSL private key file
name : ssl_cert
value:
desc : path to the SSL public certificate file
name : ssl_ca_cert
value:
desc : path to a single PEM format file
name : ssl_ca_cert_dir
value:
desc : directory containing CA root certificate(s)
name : num_init_children
value: 10
desc : # of children initially pre-forked
name : max_pool
value: 40
desc : max # of connection pool per child
name : child_life_time
value: 600
desc : if idle for this seconds, child exits
name : child_max_connections
value: 0
desc : if max_connections received, chile exits
name : connection_life_time
value: 0
desc : if idle for this seconds, connection closes
name : client_idle_limit
value: 0
desc : if idle for this seconds, child connection closes
name : log_destination
value: syslog
desc : logging destination
name : print_timestamp
value: 1
desc : if true print time stamp to each log line
name : log_connections
value: 1
desc : if true, print incoming connections to the log
name : log_hostname
value: 0
desc : if true, resolve hostname for ps and log print
name : log_statement
value: 0
desc : if non 0, logs all SQL statements
name : log_per_node_statement
value: 0
desc : if non 0, logs all SQL statements on each node
name : log_standby_delay
value: if_over_threshold
desc : how to log standby delay
name : syslog_facility
value: LOCAL0
desc : syslog local faclity
name : syslog_ident
value: pgpool
desc : syslog program ident string
name : debug_level
value: 255
desc : debug message level
name : pid_file_name
value: /var/run/pgpool/pgpool.pid
desc : path to pid file
name : logdir
value: /var/log/pgpool
desc : PgPool status file logging directory
name : connection_cache
value: 1
desc : if true, cache connection pool
name : reset_query_list
value: ABORT; DISCARD ALL
desc : queries issued at the end of session
name : replication_mode
value: 0
desc : non 0 if operating in replication mode
name : replicate_select
value: 0
desc : non 0 if SELECT statement is replicated
name : insert_lock
value: 1
desc : insert lock
name : lobj_lock_table
value:
desc : table name used for large object replication control
name : replication_stop_on_mismatch
value: 0
desc : stop replication mode on fatal error
name : failover_if_affected_tuples_mismatch
value: 0
desc : failover if affected tuples are mismatch
name : load_balance_mode
value: 1
desc : non 0 if operating in load balancing mode
name : ignore_leading_white_space
value: 1
desc : ignore leading white spaces
name : white_function_list
value:
desc : functions those do not write to database
name : black_function_list
value: nextval,setval
desc : functions those write to database
name : master_slave_mode
value: 1
desc : if true, operate in master/slave mode
name : master_slave_sub_mode
value: stream
desc : master/slave sub mode
name : sr_check_period
value: 10
desc : sr check period
name : sr_check_user
value: pgquery
desc : sr check user
name : delay_threshold
value: 2097152
desc : standby delay threshold
name : follow_master_command
value:
desc : follow master command
name : parallel_mode
value: 0
desc : if non 0, run in parallel query mode
name : enable_query_cache
value: 0
desc : if non 0, use query cache
name : pgpool2_hostname
value: db1a
desc : pgpool2 hostname
name : system_db_hostname
value: localhost
desc : system DB hostname
name : system_db_port
value: 5432
desc : system DB port number
name : system_db_dbname
value: pgpool
desc : system DB name
name : system_db_schema
value: pgpool_catalog
desc : system DB schema name
name : system_db_user
value: pgpool
desc : user name to access system DB
name : health_check_period
value: 15
desc : health check period
name : health_check_timeout
value: 10
desc : health check timeout
name : health_check_user
value: pgquery
desc : health check user
name : health_check_max_retries
value: 3
desc : health check max retries
name : health_check_retry_delay
value: 1
desc : health check retry delay
name : failover_command
value:
desc : failover command
name : failback_command
value:
desc : failback command
name : fail_over_on_backend_error
value: 1
desc : fail over on backend error
name : recovery_user
value:
desc : online recovery user
name : recovery_1st_stage_command
value:
desc : execute a command in first stage.
name : recovery_2nd_stage_command
value:
desc : execute a command in second stage.
name : recovery_timeout
value: 90
desc : max time in seconds to wait for the recovering node's postmaster
name : client_idle_limit_in_recovery
value: 0
desc : if idle for this seconds, child connection closes in recovery 2n
name : relcache_expire
value: 0
desc : relation cache expiration time in seconds
name : use_watchdog
value: 1
desc : non 0 if operating in use_watchdog
name : trusted_servers
value: 172.26.42.254,db1.stg.cotweet.com,db1a.stg.cotweet.com,db1b.stg.cotweet.com
desc : upper server list to observe connection
name : delegate_IP
value: 172.26.42.25
desc : delegate IP address of master pgpool
name : wd_port
value: 9000
desc : watchdog port number
name : wd_interval
value: 10
desc : life check interval (second)
name : ping_path
value: /bin
desc : path to ping command
name : ifconfig_path
value: /sbin
desc : path to ifconfig command
name : if_up_cmd
value: ifconfig eth0:0 inet $_IP_$ netmask 255.255.255.255
desc : virtual interface up command with full parameters
name : if_down_cmd
value: ifconfig eth0:0 down
desc : virtual interface down command with full parameters
name : arping_path
value: /usr/bin
desc : path to arping command
name : arping_cmd
value: arping -U $_IP_$ -w 1
desc : send ARP REQUESTi to neighbour host
name : wd_life_point
value: 3
desc : retry times of life check
name : wd_lifecheck_query
value: SELECT 1
desc : lifecheck query to pgpool from watchdog
name : memory_cache_enabled
value: 0
desc : If true, use the memory cache functionality, false by default
name : memqcache_method
value: shmem
desc : Cache store method. either shmem(shared memory) or Memcached. sh
name : memqcache_memcached_host
value: localhost
desc : Memcached host name. Mandatory if memqcache_method=memcached
name : memqcache_memcached_port
value: 11211
desc : Memcached port number. Mondatory if memqcache_method=memcached
name : memqcache_total_size
value: 67108864
desc : Total memory size in bytes for storing memory cache. Mandatory i
name : memqcache_max_num_cache
value: 1000000
desc : Total number of cache entries
name : memqcache_expire
value: 0
desc : Memory cache entry life time specified in seconds. 60 by default
name : memqcache_auto_cache_invalidation
value: 0
desc : If true, invalidation of query cache is triggered by correspondi
name : memqcache_maxcache
value: 409600
desc : Maximum SELECT result size in bytes
name : memqcache_cache_block_size
value: 1048576
desc : Cache block size in bytes. 8192 by default
name : memqcache_cache_oiddir
value: /var/log/pgpool/oiddir
desc : Tempory work directory to record table oids
name : memqcache_stats_start_time
value: Thu Jan 1 00:00:00 1970
desc : Start time of query cache stats
name : memqcache_no_cache_hits
value: 0
desc : Number of SELECTs not hitting query cache
name : memqcache_cache_hits
value: 0
desc : Number of SELECTs hitting query cache
name : white_memqcache_table_list
value:
desc : tables to memqcache
name : black_memqcache_table_list
value:
desc : tables not to memqcache
name : backend_hostname0
value: db1.stg.cotweet.com
desc : backend #0 hostname
name : backend_port0
value: 5432
desc : backend #0 port number
name : backend_weight0
value: 0.333333
desc : weight of backend #0
name : backend_data_directory0
value:
desc : data directory for backend #0
name : backend_status0
value: 1
desc : status of backend #0
name : standby_delay0
value: 0
desc : standby delay of backend #0
name : backend_flag0
value: DISALLOW_TO_FAILOVER
desc : backend #0 flag
name : backend_hostname1
value: db1a.stg.cotweet.com
desc : backend #1 hostname
name : backend_port1
value: 5432
desc : backend #1 port number
name : backend_weight1
value: 0.333333
desc : weight of backend #1
name : backend_data_directory1
value:
desc : data directory for backend #1
name : backend_status1
value: 1
desc : status of backend #1
name : standby_delay1
value: 0
desc : standby delay of backend #1
name : backend_flag1
value: DISALLOW_TO_FAILOVER
desc : backend #1 flag
name : backend_hostname2
value: db1b.stg.cotweet.com
desc : backend #2 hostname
name : backend_port2
value: 5432
desc : backend #2 port number
name : backend_weight2
value: 0.333333
desc : weight of backend #2
name : backend_data_directory2
value:
desc : data directory for backend #2
name : backend_status2
value: 3
desc : status of backend #2
name : standby_delay2
value: 0
desc : standby delay of backend #2
name : backend_flag2
value: DISALLOW_TO_FAILOVER
desc : backend #2 flag
name : other_pgpool_hostname1
value: db1b.stg.cotweet.com
desc : pgpool #1 hostname
name : other_pgpool_port1
value: 5431
desc : pgpool #1 port number
name : other_pgpool_wd_port1
value: 9000
desc : pgpool #1 watchdog port number