[pgpool-general: 4388] please help me! problem with online recovery
Alessandro - Infopix
alessandro at infopix.it
Wed Feb 3 20:16:12 JST 2016
Hi!
I have configured two servers for HA as described in this tutorial:
http://www.pgpool.net/pgpool-web/contrib_docs/watchdog_master_slave_3.3/en.html
I installed CentOS 7, pgpool-II version 3.4.3 and postgres (PostgreSQL) 9.4.
I adapted the scripts to my configuration and put them in the PostgreSQL
data directory, owned by postgres and with execute permission, but it does
not work as expected.
When my PostgreSQL primary server goes down (poweroff), the failover command
runs and the other server becomes primary (in its data directory
recovery.conf becomes recovery.done, and I can see the trigger file that my
failover script created).
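The check I do by hand after failover (recovery.conf versus recovery.done) can be sketched as a small shell function. This is only a sketch: datadir_role is a hypothetical helper name, not part of pgpool-II or PostgreSQL, and it assumes the PostgreSQL 9.4 convention that a standby has recovery.conf and a promoted standby has recovery.done.

```shell
#!/bin/bash
# Hypothetical helper (not part of pgpool-II or PostgreSQL).
# Reports how a PostgreSQL 9.4 data directory looks, based on its files:
#   recovery.conf present -> configured as a standby
#   recovery.done present -> was a standby and has been promoted
#   neither present       -> will start as a plain primary
datadir_role() {
    pgdata=$1
    if [ -f "$pgdata/recovery.conf" ]; then
        echo standby
    elif [ -f "$pgdata/recovery.done" ]; then
        echo promoted
    else
        echo primary
    fi
}
```

For example, `datadir_role /data/postgreStorage` can be run on each server after a failover to see which state its data directory is in.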
My failover script:
#!/bin/bash -x
FALLING_NODE=$1 # %d
OLDPRIMARY_NODE=$2 # %P
NEW_PRIMARY=$3 # %H
PGDATA=$4 # %R
if [ $FALLING_NODE = $OLDPRIMARY_NODE ]; then
    if [ $UID -eq 0 ]
    then
        su postgres -c "ssh -T postgres@$NEW_PRIMARY touch $PGDATA/trigger"
    else
        ssh -T postgres@$NEW_PRIMARY touch $PGDATA/trigger
    fi
    exit 0;
fi;
exit 0;
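The decision the failover script makes can be tried by hand without pgpool. Below is a minimal sketch, assuming the same two arguments (%d and %P); should_promote is a hypothetical name used only for illustration. The variables are quoted so that an empty argument compares as unequal instead of making the test fail with a syntax error.

```shell
#!/bin/bash
# Hypothetical helper: the same comparison as in the failover script.
# Succeeds (exit status 0) only when the failed node was the old primary,
# which is the case where the trigger file has to be created.
should_promote() {
    falling_node=$1   # %d: id of the node that went down
    old_primary=$2    # %P: id of the old primary node
    [ "$falling_node" = "$old_primary" ]
}
```

For example, `should_promote 0 0 && echo "would touch the trigger file"` prints the message, while `should_promote 1 0` does not succeed.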
When I start the old primary server that failed, pgpool does not start the
online recovery procedure.
My recovery_1st_stage file:
#!/bin/bash -x
PGDATA=$1
REMOTE_HOST=$2
REMOTE_PGDATA=$3
echo "I have called online recovery " >> /data/postgreStorage/logifonlinereccaptured.log
PORT=5432
PGHOME=/usr/pgsql-9.4
ARCH=/data/postgreStorage/arch
rm -rf $ARCH/*
ssh -T postgres@$REMOTE_HOST "
LD_LIBRARY_PATH=$PGHOME/lib:LD_LIBRARH_PATH;
rm -rf $REMOTE_PGDATA
echo "mypassword" | $PGHOME/bin/pg_basebackup -h $HOSTNAME -U postgres --password -D $REMOTE_PGDATA -x -c fast
rm $REMOTE_PGDATA/trigger"
ssh -T postgres@$REMOTE_HOST "rm -rf $ARCH/*"
ssh -T postgres@$REMOTE_HOST "mkdir -p $REMOTE_PGDATA/pg_xlog/archive_status"
ssh -T postgres@$REMOTE_HOST "
cd $REMOTE_PGDATA;
cp postgresql.conf postgresql.conf.bak;
sed -e 's/#*hot_standby = off/hot_standby = on/' postgresql.conf.bak > postgresql.conf;
rm -f postgresql.conf.bak;
cat > recovery.conf << EOT
standby_mode = 'on'
primary_conninfo = 'host="$HOSTNAME" port=$PORT user=replica password=mypassword'
restore_command = 'scp $HOSTNAME:$ARCH/%f %p'
trigger_file = '$PGDATA/trigger'
EOT
"
I put pgpool_remote_start in the PostgreSQL data directory, owned by
postgres and with execute permission:
#!/bin/sh
REMOTE_HOST=$1
REMOTE_PGDATA=$2
PGHOME=/usr/pgsql-9.4
ssh -T postgres@$REMOTE_HOST "
LD_LIBRARY_PATH=$PGHOME/lib:LD_LIBRARH_PATH;
$PGHOME/bin/pg_ctl -w -D $REMOTE_PGDATA start 2>/dev/null 1>/dev/null < /dev/null &"
I have configured pgpool and PostgreSQL to use md5 authentication, but this
does not seem to be the problem, because neither the PostgreSQL log nor the
pgpool log shows errors.
After the old primary starts, pcp_node_info gives these results:
pcp_node_info 5 server1 9898 user password 0 returns server1 5432 2 0.500000
pcp_node_info 5 server1 9898 user password 1 returns server2 5432 2 0.500000
pcp_node_info 5 server2 9898 user password 0 returns server1 5432 3 0.500000
pcp_node_info 5 server2 9898 user password 1 returns server1 5432 2 0.500000
If I check the PostgreSQL log on server1, I see that it started as master.
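To make the pcp_node_info results above easier to read, here is a small sketch that decodes the third field. It assumes the status codes given in the pgpool-II 3.x documentation (0 = initializing, 1 = up with no connections yet, 2 = up with connections pooled, 3 = down); describe_node_info is a hypothetical helper name, not a pgpool command.

```shell
#!/bin/bash
# Hypothetical helper: decode one line of pcp_node_info output,
# which has the form "hostname port status weight".
# The body runs in a subshell so it does not change the caller's variables.
describe_node_info() (
    # Split the line on whitespace into $1..$4.
    set -- $1
    case "$3" in
        0) state="initializing" ;;
        1) state="up, no connections yet" ;;
        2) state="up, connections pooled" ;;
        3) state="down" ;;
        *) state="unknown status $3" ;;
    esac
    echo "$1:$2 is $state (weight $4)"
)
```

For example, `describe_node_info "server1 5432 3 0.500000"` reports that the pgpool on server2 still considers node 0 (server1) to be down.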
My pgpool configuration files follow (server1 and server2 stand in for my
servers' real hostnames).
Server1, the default primary (the one I start first); only the parameters
changed from the defaults are shown:
listen_addresses = '*'
listen_backlog_multipler = 5
backend_hostname0 = 'server1'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/data/postgreStorage/'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = 'server2'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/data/postgreStorage/'
backend_flag1 = 'ALLOW_TO_FAILOVER'
enable_pool_hba = on
pool_passwd = 'pool_passwd'
log_connections = on
log_min_messages = debug1
load_balance_mode = on
master_slave_mode = on
sr_check_user = 'postgres'
sr_check_password = 'mypostgrespassword'
follow_master_command = ''
health_check_period = 5
health_check_timeout = 20
health_check_user = 'postgres'
health_check_password = 'my postgres password'
failover_command = '/data/postgreStorage/failover.sh %d %P %H %R'
failback_command = ''
recovery_user = 'postgres'
recovery_password = 'mypostgrespassword'
recovery_1st_stage_command = 'recovery_1st_stage'
recovery_2nd_stage_command = ''
recovery_timeout = 90
client_idle_limit_in_recovery = 0
use_watchdog = on
ping_path = '/usr/bin/'
wd_hostname = 'server1'
wd_port = 9000
wd_authkey = 'password'  # same value in both servers' pgpool config files
delegate_IP = ''
ifconfig_path = '/usr/sbin/'
wd_lifecheck_method = 'heartbeat'
wd_interval = 10
wd_heartbeat_port = 9694
wd_heartbeat_keepalive = 2
wd_heartbeat_deadtime = 30
heartbeat_destination0 = 'server2'
heartbeat_destination_port0 = 9694
other_pgpool_hostname0 = 'server2'
other_pgpool_port0 = 9999
other_wd_port0 = 9000
Server2, the default secondary (the one I start second, with the PostgreSQL
slave); only the parameters changed from the defaults are shown:
listen_addresses = '*'
port = 9999
pcp_listen_addresses = '*'
pcp_port = 9898
listen_backlog_multiplier = 5
backend_hostname0 = 'server1'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/data/postgreStorage/'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = 'server2'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/data/postgreStorage/'
backend_flag1 = 'ALLOW_TO_FAILOVER'
enable_pool_hba = on
pool_passwd = 'pool_passwd'
authentication_timeout = 60
log_connections = on
log_min_messages = debug1
load_balance_mode = on
master_slave_mode = on
master_slave_sub_mode = 'stream'
sr_check_period = 10
sr_check_user = 'postgres'
sr_check_password = 'mypass'
follow_master_command = ''
health_check_period = 5
health_check_timeout = 20
health_check_user = 'postgres'
health_check_password = 'mypass'
health_check_max_retries = 5
health_check_retry_delay = 2
connect_timeout = 10000
failover_command = '/data/postgreStorage/failover.sh %d %P %H %R'
failback_command = ''
fail_over_on_backend_error = on
search_primary_node_timeout = 10
recovery_user = 'postgres'
recovery_password = 'mypass'
recovery_1st_stage_command = 'recovery_1st_stage'
recovery_2nd_stage_command = ''
recovery_timeout = 90
client_idle_limit_in_recovery = 0
use_watchdog = on
ping_path = '/usr/bin/'
wd_hostname = 'server2'
wd_port = 9000
wd_authkey = 'password'  # same value in both servers' pgpool config files
delegate_IP = ''
ifconfig_path = '/usr/sbin/'
arping_path = '/usr/sbin'
wd_lifecheck_method = 'heartbeat'
wd_interval = 10
wd_heartbeat_port = 9694
wd_heartbeat_keepalive = 2
wd_heartbeat_deadtime = 30
heartbeat_destination0 = 'server1'
heartbeat_destination_port0 = 9694
other_pgpool_hostname0 = 'server1'
other_pgpool_port0 = 9999
other_wd_port0 = 9000
Can anyone help me?
I'm desperate!!!
Thanks,
A.Baccanelli