SHOW POOL_HEALTH_CHECK_STATS displays health check (see Section 5.8) statistics, most of which are collected by the health check process. This command helps Pgpool-II administrators study events related to health checks. For example, an administrator can easily locate a failover event in the log file by looking at the "last_failed_health_check" column. Another example is detecting an unstable connection to a backend by examining the "average_retry_count" column: if a particular node shows a higher retry count than the other nodes, there may be a problem with the connection to that backend.
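The retry-count comparison described above can be sketched in Python. This is only an illustration and not part of Pgpool-II: the function name `flag_unstable_nodes`, the `factor` threshold, and the row dictionaries are assumptions, and the rows are assumed to have been fetched already with any PostgreSQL client connected to Pgpool-II.

```python
def flag_unstable_nodes(rows, factor=2.0):
    """Return ids of nodes whose average_retry_count stands out.

    rows:   list of dicts with "node_id" and "average_retry_count" keys
            (values may be strings, as a SHOW command returns text).
    factor: hypothetical threshold; a node is flagged when its average is
            non-zero and exceeds `factor` times the mean of the other nodes.
    """
    averages = {int(r["node_id"]): float(r["average_retry_count"]) for r in rows}
    flagged = []
    for node_id, avg in averages.items():
        others = [v for k, v in averages.items() if k != node_id]
        if not others:
            continue  # a single node has nothing to be compared with
        others_mean = sum(others) / len(others)
        if avg > 0 and avg > factor * others_mean:
            flagged.append(node_id)
    return sorted(flagged)


# Values taken from the example session in this section.
rows = [
    {"node_id": "0", "average_retry_count": "0.000000"},
    {"node_id": "1", "average_retry_count": "0.230769"},
]
print(flag_unstable_nodes(rows))  # [1]
```

A flagged node is only a hint to inspect the network path or the backend itself; a suitable threshold depends on the deployment.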
Table 1 shows each column name and its description.
Table 1. Statistics data shown by pool_health_check_stats command
Column Name | Description |
---|---|
node_id | Backend node id. |
hostname | Backend hostname or UNIX domain socket path. |
port | Backend port number. |
status | Backend status. One of up, down, waiting, unused or quarantine. |
role | Role of the node. Either primary or standby in streaming replication mode. Either main or replica in other modes. |
last_status_change | Timestamp of the last backend status change. |
total_count | Total number of health checks. |
success_count | Total number of successful health checks. |
fail_count | Total number of failed health checks. |
skip_count | Total number of skipped health checks. If the node is already down, the health check skips the node. |
retry_count | Total number of health check retries. |
average_retry_count | Average number of retries per health check session. |
max_retry_count | Maximum number of retries in a health check session. |
max_duration | Maximum health check duration in milliseconds. If a health check session retries, the health check duration is the sum of the durations of each retried health check. |
min_duration | Minimum health check duration in milliseconds. If a health check session retries, the health check duration is the sum of the durations of each retried health check. |
average_duration | Average health check duration in milliseconds. If a health check session retries, the health check duration is the sum of the durations of each retried health check. |
last_health_check | Timestamp of the last health check. If no health check has been performed yet, an empty string. |
last_successful_health_check | Timestamp of the last successful health check. If no health check has succeeded yet, an empty string. |
last_skip_health_check | Timestamp of the last skipped health check. If no health check has been skipped yet, an empty string. Note that this field can be an empty string even if the status is down. In this case failover was triggered by something other than the health check process. |
last_failed_health_check | Timestamp of the last failed health check. If no health check has failed yet, an empty string. Note that this field can be an empty string even if the status is down. In this case failover was triggered by something other than the health check process. |
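The note about empty timestamp fields can be illustrated with a short Python sketch. The helper names `parse_timestamp` and `down_without_failed_check` are hypothetical, and the timestamp format is an assumption inferred from the example session in this section rather than a documented guarantee.

```python
from datetime import datetime

# Assumed format, inferred from values such as "2020-01-26 19:11:48".
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_timestamp(value):
    """Return a datetime, or None when the field is an empty string."""
    value = value.strip()
    return datetime.strptime(value, TIMESTAMP_FORMAT) if value else None


def down_without_failed_check(row):
    """True when a node is down but no failed health check was recorded,
    which suggests failover was triggered by something other than the
    health check process."""
    return (
        row["status"] == "down"
        and parse_timestamp(row["last_failed_health_check"]) is None
    )


row = {"status": "down", "last_failed_health_check": ""}
print(down_without_failed_check(row))  # True
```

Treating the empty string as `None` keeps "not yet observed" distinct from any real timestamp when the values are compared or sorted.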
Here is an example session:
test=# show pool_health_check_stats;
-[ RECORD 1 ]----------------+--------------------
node_id                      | 0
hostname                     | /tmp
port                         | 11002
status                       | up
role                         | primary
last_status_change           | 2020-01-26 19:08:45
total_count                  | 27
success_count                | 27
fail_count                   | 0
skip_count                   | 0
retry_count                  | 0
average_retry_count          | 0.000000
max_retry_count              | 0
max_duration                 | 9
min_duration                 | 2
average_duration             | 6.296296
last_health_check            | 2020-01-26 19:12:45
last_successful_health_check | 2020-01-26 19:12:45
last_skip_health_check       |
last_failed_health_check     |
-[ RECORD 2 ]----------------+--------------------
node_id                      | 1
hostname                     | /tmp
port                         | 11003
status                       | down
role                         | standby
last_status_change           | 2020-01-26 19:11:48
total_count                  | 19
success_count                | 12
fail_count                   | 1
skip_count                   | 6
retry_count                  | 3
average_retry_count          | 0.230769
max_retry_count              | 3
max_duration                 | 83003
min_duration                 | 0
average_duration             | 6390.307692
last_health_check            | 2020-01-26 19:12:48
last_successful_health_check | 2020-01-26 19:10:15
last_skip_health_check       | 2020-01-26 19:12:48
last_failed_health_check     | 2020-01-26 19:11:48
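For scripted analysis, expanded output like the session above can be split into one dictionary per record. The following Python sketch is an illustration only: `parse_expanded` and `counters_add_up` are hypothetical helpers, and it assumes the output has been captured as text with one field per line. The counter relation it checks (total_count equals the sum of success_count, fail_count and skip_count) holds for both records in the example session; whether it holds in general is an assumption.

```python
def parse_expanded(text):
    """Parse psql expanded-display text into a list of dicts, one per record."""
    records = []
    for line in text.splitlines():
        line = line.rstrip()
        if line.startswith("-[ RECORD"):
            records.append({})  # a record header starts a new dict
        elif "|" in line and records:
            name, _, value = line.partition("|")  # split on the first "|"
            records[-1][name.strip()] = value.strip()
    return records


def counters_add_up(rec):
    """Check total_count == success_count + fail_count + skip_count."""
    parts = sum(int(rec[k]) for k in ("success_count", "fail_count", "skip_count"))
    return int(rec["total_count"]) == parts


# Abbreviated copy of RECORD 2 from the example session.
sample = """\
-[ RECORD 2 ]----------------+--------------------
node_id                      | 1
status                       | down
total_count                  | 19
success_count                | 12
fail_count                   | 1
skip_count                   | 6
"""

records = parse_expanded(sample)
print(records[0]["status"], counters_add_up(records[0]))  # down True
```

All values stay strings after parsing, so numeric columns need an explicit `int` or `float` conversion before they are compared.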