In this tutorial, I explain a simple way to try out "Watchdog".
You need two Linux boxes with pgpool-II installed, and a PostgreSQL server running either on one of those machines or on a separate one. A single backend node is enough.
You can use watchdog with pgpool-II in any mode: replication mode, master/slave mode and raw mode.
I use "osspc16" as the Active node and "osspc20" as the Standby node. "someserver" in the prompts below means either of them.
Set the following parameters on both the Active and the Standby node.
First of all, set "use_watchdog" to on to enable watchdog.
use_watchdog = on # Activates watchdog
Specify the upstream servers (e.g. application servers). Leaving this blank is fine.
trusted_servers = ''
                                    # trusted server list which are used
                                    # to confirm network connection
                                    # (hostA,hostB,hostC,...)
Specify the port that watchdog listens on.
wd_port = 9000 # port number for watchdog service
Specify the virtual IP address in "delegate_IP". It must not be in use by any other host.
delegate_IP = '133.137.177.143' # delegate IP address
Next, set the parameters that differ for each pgpool.
In "other_pgpool_hostname0", "other_pgpool_port0" and "other_wd_port0", specify the pgpool and watchdog that this node should monitor.
[*] other_pgpool_hostname0 must be the value returned by the hostname command.
On the Active node (osspc16):
other_pgpool_hostname0 = 'osspc20'  # Host name or IP address to connect to for other pgpool 0
other_pgpool_port0 = 9999           # Port number for other pgpool 0
other_wd_port0 = 9000               # Port number for other watchdog 0
On the Standby node (osspc20):
other_pgpool_hostname0 = 'osspc16'  # Host name or IP address to connect to for other pgpool 0
other_pgpool_port0 = 9999           # Port number for other pgpool 0
other_wd_port0 = 9000               # Port number for other watchdog 0
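Since the two nodes' settings mirror each other, they can be generated from a single list of hosts. The following is a minimal Python sketch, not part of pgpool-II. The function name "other_pgpool_settings" and the "nodes" list are hypothetical, and the port numbers are simply the ones used in this tutorial.

```python
def other_pgpool_settings(nodes, me, pgpool_port=9999, wd_port=9000):
    """Return pgpool.conf lines describing every pgpool node except `me`."""
    if me not in nodes:
        raise ValueError(f"{me!r} is not in the node list")
    lines = []
    # Number the remaining nodes 0, 1, 2, ... in list order.
    others = [host for host in nodes if host != me]
    for index, host in enumerate(others):
        lines.append(f"other_pgpool_hostname{index} = '{host}'")
        lines.append(f"other_pgpool_port{index} = {pgpool_port}")
        lines.append(f"other_wd_port{index} = {wd_port}")
    return lines


if __name__ == "__main__":
    nodes = ["osspc16", "osspc20"]
    for me in nodes:
        print(f"# pgpool.conf on {me}")
        print("\n".join(other_pgpool_settings(nodes, me)))
```

Run as a script, this prints one fragment per node, which can be compared against the hand-written settings above.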
Start pgpool on each server as root with "-n" (non-daemon mode) and redirect the log messages to pgpool.log.
Start pgpool on the Active server.
[user@osspc16]$ su -
[root@osspc16]# {installed_dir}/bin/pgpool -n -f {installed_dir}/etc/pgpool.conf > pgpool.log 2>&1
The log messages show that pgpool has acquired the virtual IP address and started the watchdog process. The "Connection refused" errors are to be expected at this point, since the Standby pgpool is not running yet.
ERROR: wd_create_send_socket: connect() is failed(Connection refused)
LOG: wd_escalation: eslcalated to master pgpool
ERROR: wd_create_send_socket: connect() is failed(Connection refused)
LOG: wd_escalation: escaleted to delegate_IP holder
LOG: wd_init: start watchdog
LOG: pgpool-II successfully started. version 3.2beta1 (namameboshi)
Start pgpool on the Standby server.
[user@osspc20]$ su -
[root@osspc20]# {installed_dir}/bin/pgpool -n -f {installed_dir}/etc/pgpool.conf > pgpool.log 2>&1
Lifecheck starts once all the watchdogs listed in other_pgpool_hostname have started. In this case osspc16 is the only other watchdog, so lifecheck starts right away.
LOG: wd_init: start watchdog
LOG: pgpool-II successfully started. version 3.2beta1 (namameboshi)
LOG: watchdog: lifecheck started
On the Active server, lifecheck has started as well.
LOG: watchdog: lifecheck started
Confirm that you can ping the virtual IP address.
[user@someserver]$ ping 133.137.177.143
PING 133.137.177.143 (133.137.177.143) 56(84) bytes of data.
64 bytes from 133.137.177.143: icmp_seq=1 ttl=64 time=0.328 ms
64 bytes from 133.137.177.143: icmp_seq=2 ttl=64 time=0.264 ms
64 bytes from 133.137.177.143: icmp_seq=3 ttl=64 time=0.412 ms
Confirm that the Active server, which was started first, holds the virtual IP address.
[root@osspc16]# ifconfig
eth0      ...
eth0:0    inet addr:133.137.177.143 ...
lo        ...
Confirm that the Standby server, which was started second, does not hold the virtual IP address.
[root@osspc20]# ifconfig
eth0      ...
lo        ...
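The check on the ifconfig output can also be scripted. Below is a small Python sketch that reports whether a given address appears in ifconfig output. The helper name "holds_address" is hypothetical, and the pattern assumes the "inet addr:" layout shown above (it also accepts the newer "inet a.b.c.d" layout). The sample strings reuse the abbreviated output from this tutorial.

```python
import re

# Matches "inet addr:1.2.3.4" (older ifconfig) and "inet 1.2.3.4" (newer).
INET_PATTERN = re.compile(r"\binet (?:addr:)?(\d{1,3}(?:\.\d{1,3}){3})")


def holds_address(ifconfig_output, ip):
    """Return True if `ip` is assigned to any interface in the output."""
    return ip in INET_PATTERN.findall(ifconfig_output)


if __name__ == "__main__":
    active = "eth0      ...\neth0:0    inet addr:133.137.177.143 ...\nlo        ...\n"
    standby = "eth0      ...\nlo        ...\n"
    print("Active holds VIP: ", holds_address(active, "133.137.177.143"))
    print("Standby holds VIP:", holds_address(standby, "133.137.177.143"))
```

On a real machine the text would come from running ifconfig, for example through subprocess.run with capture_output=True.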
Try to connect to PostgreSQL with "psql -h delegate_IP -p port".
[user@someserver]$ psql -h 133.137.177.143 -p 9999 -l
Next, confirm how the Standby server behaves when the Active server can no longer provide its service.
Stop pgpool on the Active server.
[root@osspc16]# {installed_dir}/bin/pgpool stop
The Standby server then takes over the virtual IP address. Its log shows:
LOG: wd_escalation: eslcalated to master pgpool
ERROR: wd_create_send_socket: connect() is failed(Connection refused)
LOG: wd_escalation: escaleted to delegate_IP holder
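If you want to detect the takeover from a script, one option is to scan pgpool.log for the escalation message. The Python sketch below does that. The helper name "became_delegate_ip_holder" is hypothetical, and the matched text is taken from the 3.2beta1 log shown above, so it may differ in other versions. It matches only the "to delegate_IP holder" part so that it does not depend on the exact spelling of the rest of the message.

```python
def became_delegate_ip_holder(log_lines):
    """Return True if a wd_escalation line reports taking over delegate_IP."""
    for line in log_lines:
        if "wd_escalation" in line and "to delegate_IP holder" in line:
            return True
    return False


if __name__ == "__main__":
    # Log lines copied from the Standby server's output in this tutorial.
    sample = [
        "LOG: wd_escalation: eslcalated to master pgpool",
        "ERROR: wd_create_send_socket: connect() is failed(Connection refused)",
        "LOG: wd_escalation: escaleted to delegate_IP holder",
    ]
    print(became_delegate_ip_holder(sample))
```

To use it against a real log, pass an open file object, for example became_delegate_ip_holder(open("pgpool.log")).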
Confirm that you can still ping the virtual IP address.
[user@someserver]$ ping 133.137.177.143
PING 133.137.177.143 (133.137.177.143) 56(84) bytes of data.
64 bytes from 133.137.177.143: icmp_seq=1 ttl=64 time=0.328 ms
64 bytes from 133.137.177.143: icmp_seq=2 ttl=64 time=0.264 ms
64 bytes from 133.137.177.143: icmp_seq=3 ttl=64 time=0.412 ms
Confirm that the Active server no longer holds the virtual IP address.
[root@osspc16]# ifconfig
eth0      ...
lo        ...
Confirm that the Standby server now holds the virtual IP address.
[root@osspc20]# ifconfig
eth0      ...
eth0:0    inet addr:133.137.177.143 ...
lo        ...
Try to connect to PostgreSQL with "psql -h delegate_IP -p port" again.
[user@someserver]$ psql -h 133.137.177.143 -p 9999 -l
The following parameters control watchdog's monitoring.
Specify the check interval in "wd_interval", the retry count in "wd_life_point" and the query used for the check in "wd_lifecheck_query".
wd_interval = 10                    # lifecheck interval (sec) > 0
wd_life_point = 3                   # lifecheck retry times
wd_lifecheck_query = 'SELECT 1'     # lifecheck query to pgpool from watchdog
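To get a feel for how the interval and the retry count interact, here is a toy model in Python. It is an assumption about the behaviour, not pgpool-II's actual code: it treats "wd_life_point" as the number of consecutive failed checks tolerated before the other pgpool is considered down, with one check every "wd_interval" seconds. Both function names are hypothetical.

```python
def checks_until_down(results, life_point=3):
    """Return the 1-based number of the check at which the peer is
    declared down, or None if it never is.

    `results` is a sequence of booleans, one per lifecheck (True = OK).
    """
    remaining = life_point
    for number, ok in enumerate(results, start=1):
        if ok:
            remaining = life_point  # a success resets the counter
        else:
            remaining -= 1
            if remaining == 0:
                return number
    return None


def approx_detection_seconds(wd_interval=10, wd_life_point=3):
    """Rough time to notice a dead peer under the model above."""
    return wd_interval * wd_life_point


if __name__ == "__main__":
    print(checks_until_down([True, False, False, False]))
    print(approx_detection_seconds())
```

Under this model, the values above mean a failed peer is noticed after roughly 30 seconds. Lowering either value speeds up failover but makes it more sensitive to brief hiccups.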
The following parameters control how the virtual IP address is switched.
Specify the switching commands in "if_up_cmd" and "if_down_cmd" and the path to them in "ifconfig_path". Specify the command that sends an ARP request after switching in "arping_cmd" and the path to it in "arping_path".
ifconfig_path = '/sbin'             # ifconfig command path
if_up_cmd = 'ifconfig eth0:0 inet $_IP_$ netmask 255.255.255.0'
                                    # startup delegate IP command
if_down_cmd = 'ifconfig eth0:0 down'
                                    # shutdown delegate IP command
arping_path = '/usr/sbin'           # arping command path
arping_cmd = 'arping -U $_IP_$ -w 1'
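To see what these settings amount to on the command line, the Python sketch below expands them by hand. It rests on my reading that "$_IP_$" is replaced with the value of "delegate_IP" and that the command is looked up in the directory given by the matching path parameter. The helper name "expand_command" is hypothetical, and nothing is executed.

```python
import shlex


def expand_command(path, command, delegate_ip):
    """Substitute $_IP_$ and prefix the program with its directory.

    Returns the resulting argument list; the command is not run.
    """
    expanded = command.replace("$_IP_$", delegate_ip)
    argv = shlex.split(expanded)
    argv[0] = f"{path.rstrip('/')}/{argv[0]}"
    return argv


if __name__ == "__main__":
    delegate_ip = "133.137.177.143"
    up = expand_command(
        "/sbin", "ifconfig eth0:0 inet $_IP_$ netmask 255.255.255.0", delegate_ip
    )
    arping = expand_command("/usr/sbin", "arping -U $_IP_$ -w 1", delegate_ip)
    print(" ".join(up))
    print(" ".join(arping))
```

Running the printed commands by hand as root is a quick way to check that the interface name and netmask suit your network before letting watchdog use them.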