[pgpool-hackers: 219] Unexpected behavior of pgpool-II
Asif Naeem
anaeem.it at gmail.com
Mon Apr 22 05:10:32 JST 2013
Hi,
There is unexpected behavior in pgpool-II that seems to be a bug. Please find
the details below.
Phenomenon
> The pgpool-II configuration option "follow_master_command" does not expand the
> special character %H (hostname of the new master node) as expected.
> In master/slave mode, after failover the second node becomes the new master,
> but %H yields an unexpected hostname. Please find the attached pgpool.conf for details.
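For context, a follow_master_command setting that exercises %H looks roughly like the following (a sketch only; the script path here is hypothetical, and the actual configuration is in the attached pgpool.conf):

```
# Run once for each remaining node after failover. pgpool-II 3.2 substitutes:
#   %d = failed node id        %h = failed node hostname
#   %p = failed node port      %D = failed node database cluster path
#   %m = new master node id    %M = old master node id
#   %H = new master hostname (the character misbehaving in this report)
#   %P = old primary node id   %R = new master database cluster path
follow_master_command = '/path/to/follow_master.sh %d %h %p %D %m %M %H %P %R'
```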
Versions
> OS version: RHEL (64-bit)
> pgpool version: pgpool-II version 3.2.1 (namameboshi)
Details/Observations:
> At Start
> node_id | hostname | port | status | lb_weight | role
> ---------+----------+------+--------+-----------+---------
> 0 | dmp02 | 5444 | 2 | 0.333333 | primary
> 1 | dmp03 | 5444 | 2 | 0.333333 | standby
> 2 | dmp04 | 5444 | 2 | 0.333333 | standby
>
> Failover arguments when the user stopped dmp02:
> Fri Feb 8 12:44:12 JST 2013
> FAILED_NODE_ID =0
> FAILED_NODE_NAME=dmp02
> FAILED_NODE_PORT=5444
> FAILED_NODE_DATA=/opt/PostgresPlus/9.2AS/data
> NEW_MASTER_ID =1
> OLD_MASTER_ID =0
> NEW_MASTER_NAME =dmp03
> OLD_PRIMARY_ID =0
> NEW_MASTER_DATA =/opt/PostgresPlus/9.2AS/data
> 0
>
> Then the user received these arguments, indicating that the master node had been changed:
> Fri Feb 8 12:44:13 JST 2013
> FAILED_NODE_ID =0
> FAILED_NODE_NAME=dmp02
> FAILED_NODE_PORT=5444
> FAILED_NODE_DATA=/opt/PostgresPlus/9.2AS/data
> NEW_MASTER_ID =1
> OLD_MASTER_ID =0
> NEW_MASTER_NAME =dmp03
> OLD_PRIMARY_ID =0
> NEW_MASTER_DATA =/opt/PostgresPlus/9.2AS/data
> Fri Feb 8 12:44:17 JST 2013
> FAILED_NODE_ID =2
> FAILED_NODE_NAME=dmp04
> FAILED_NODE_PORT=5444
> FAILED_NODE_DATA=/opt/PostgresPlus/9.2AS/data
> NEW_MASTER_ID =1
> OLD_MASTER_ID =0
> NEW_MASTER_NAME =dmp03   <--- *** The master name is unexpected: dmp02 ***
> OLD_PRIMARY_ID =0
> NEW_MASTER_DATA =/opt/PostgresPlus/9.2AS/data
>
> node_id | hostname | port | status | lb_weight | role
> ---------+----------+------+--------+-----------+---------
> 0 | dmp02 | 5444 | 2 | 0.333333 | standby
> 1 | dmp03 | 5444 | 2 | 0.333333 | primary
> 2 | dmp04 | 5444 | 2 | 0.333333 | standby
>
> pid | usesysid | usename | application_name | client_addr |
> client_hostname | client_port | backend_start | state
> | sent_location | write_location | flush_location | replay_location |
> sync_priority | sync_state
>
> -------+----------+--------------+------------------+---------------+-----------------+-------------+----------------------------------+-----------+---------------+----------------+----------------+-----------------+---------------+------------
> 20641 | 10 | enterprisedb | dmp02 | 172.24.217.23 |
> | 16948 | 08-FEB-13 17:39:44.078173 +09:00 | streaming |
> 14/C50000E0 | 14/C50000E0 | 14/C50000E0 | 14/C50000E0 |
> 0 | async
> (1 row)
>
There is only one pg_stat_replication record because dmp04 is connecting to dmp02.
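One way to verify which primary each standby is actually attached to is to run the following on every node that might be acting as primary (a generic query, matching the pg_stat_replication output shown above; hostnames and ports are as in this report):

```sql
-- Each connected standby appears as one row; application_name is typically
-- the standby's name as set in its primary_conninfo.
SELECT application_name, client_addr, state, sync_state
FROM pg_stat_replication;
```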
The user reproduced this phenomenon. To reproduce it, we only need to create
scripts such as follow_master.sh and launch the pgpool process with -F [path to
pcp.conf].
1. Before failover
> -bash-3.2$ psql -h 10.1.1.187 -p 9999 -c "show pool_nodes"
> node_id | hostname | port | status | lb_weight | role
> ---------+----------+------+--------+-----------+---------
> 0 | slave1 | 5444 | 2 | 0.333333 | primary
> 1 | slave2 | 5444 | 2 | 0.333333 | standby
> 2 | slave3 | 5444 | 2 | 0.333333 | standby
> (3 rows)
2. We stopped the slave1 node with "pg_ctl -D /data -m immediate stop".
3. After failover
> -bash-3.2$ psql -h 10.1.1.187 -p 9999 -c "show pool_nodes"
> node_id | hostname | port | status | lb_weight | role
> ---------+----------+------+--------+-----------+---------
> 0 | slave1 | 5444 | 2 | 0.333333 | standby
> 1 | slave2 | 5444 | 2 | 0.333333 | primary
> 2 | slave3 | 5444 | 2 | 0.333333 | standby
> (3 rows)
4. After failover, the user got the following unexpected output at the end of
follow_master.sh.log, i.e.:
> NEW_MASTER_NAME =slave1
[root@master1 ~]# cat /tmp/failover.sh.log
> Wed Mar 13 11:36:28 JST 2013
> FAILED_NODE_ID =0
> FAILED_NODE_NAME=slave1
> FAILED_NODE_PORT=5444
> FAILED_NODE_DATA=/opt/PostgresPlus/9.2AS/data
> NEW_MASTER_ID =1
> OLD_MASTER_ID =0
> NEW_MASTER_NAME =slave2
> OLD_PRIMARY_ID =0
> NEW_MASTER_DATA =/opt/PostgresPlus/9.2AS/data
> 0
>
[root@master1 ~]# cat /tmp/follow_master.sh.log
> Wed Mar 13 11:36:30 JST 2013
> FAILED_NODE_ID =0
> FAILED_NODE_NAME=slave1
> FAILED_NODE_PORT=5444
> FAILED_NODE_DATA=/opt/PostgresPlus/9.2AS/data
> NEW_MASTER_ID =1
> OLD_MASTER_ID =0
> NEW_MASTER_NAME =slave2
> OLD_PRIMARY_ID =0
> NEW_MASTER_DATA =/opt/PostgresPlus/9.2AS/data
> Wed Mar 13 11:36:32 JST 2013
> FAILED_NODE_ID =2
> FAILED_NODE_NAME=slave3
> FAILED_NODE_PORT=5444
> FAILED_NODE_DATA=/opt/PostgresPlus/9.2AS/data
> NEW_MASTER_ID =1
> OLD_MASTER_ID =0
> NEW_MASTER_NAME =slave1   <--- The master name is unexpected
> OLD_PRIMARY_ID =0
> NEW_MASTER_DATA =/opt/PostgresPlus/9.2AS/data
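For reference, a follow_master.sh producing log entries like those above can be as simple as the sketch below, assuming follow_master_command passes '%d %h %p %D %m %M %H %P %R' in that order (the actual script is in the attached pgpool.zip and may differ):

```shell
#!/bin/sh
# Minimal sketch: log the arguments pgpool-II passes to follow_master_command.
LOG=${LOG:-/tmp/follow_master.sh.log}

log_failover_args() {
    {
        date
        echo "FAILED_NODE_ID =$1"
        echo "FAILED_NODE_NAME=$2"
        echo "FAILED_NODE_PORT=$3"
        echo "FAILED_NODE_DATA=$4"
        echo "NEW_MASTER_ID =$5"
        echo "OLD_MASTER_ID =$6"
        echo "NEW_MASTER_NAME =$7"   # expanded from %H -- the value in question
        echo "OLD_PRIMARY_ID =$8"
        echo "NEW_MASTER_DATA =$9"
    } >> "$LOG"
}

# Example invocation with the values pgpool passed in the second call above:
log_failover_args 2 slave3 5444 /opt/PostgresPlus/9.2AS/data \
                  1 0 slave1 0 /opt/PostgresPlus/9.2AS/data
```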
Please guide. Thanks.
Best Regards,
Asif Naeem
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pgpool.zip
Type: application/zip
Size: 7111 bytes
Desc: not available
URL: <http://www.sraoss.jp/pipermail/pgpool-hackers/attachments/20130422/d447bd42/attachment.zip>