[pgpool-general: 8176] Re: Problems taking node offline

Jon SCHEWE jon.schewe at raytheon.com
Wed May 25 05:18:12 JST 2022


>If failover_command and follow_primary_command are specified, they are executed after you run pcp_detach_node.
>
>If you run pcp_detach_node to detach the primary node,
>failover_command promotes a standby to primary and
>follow_primary_command enables the other alive nodes to follow the new primary.
>
>According to your logs, failover_command and follow_primary_command were not executed.
>Can you successfully detach a standby PostgreSQL node?

No. In fact, the test that I sent previously tried to take node_id 1 offline while node_id 0 was the primary.


>If you shut down the primary PostgreSQL node, will failover be performed correctly?

I'm hesitant to shut down the primary node right now. We have seen the follow master code work recently. Last weekend we had an issue with the servers: we stopped PostgreSQL on all nodes, started it on one, and then used pcp_recovery_node -h psql.mgmt.bbn.com -p 9897 -U pgpool -n <node id> to bring the other nodes online.
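
For reference, that recovery step can be sketched as a small shell helper. This is only an illustration built around the pcp_recovery_node invocation quoted above; the function name recover_nodes and the DRY_RUN switch are hypothetical, and the node IDs to recover are passed as arguments rather than assumed.

```shell
# Sketch: run pcp_recovery_node once per node ID given as an argument.
# Host, port and user default to the values from the command quoted above.
# DRY_RUN is a hypothetical switch: when set to 1 the commands are only printed.
PCP_HOST=${PCP_HOST:-psql.mgmt.bbn.com}
PCP_PORT=${PCP_PORT:-9897}
PCP_USER=${PCP_USER:-pgpool}

recover_nodes() {
    for node_id in "$@"; do
        cmd="pcp_recovery_node -h $PCP_HOST -p $PCP_PORT -U $PCP_USER -n $node_id"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$cmd"
        else
            # Stop at the first node that fails to recover.
            $cmd || return 1
        fi
    done
}
```

Calling recover_nodes 1 2 with DRY_RUN=1 prints the two commands without contacting pgpool, which is a cheap way to check the arguments first.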

>
>I found that the setting parameters of 4.1 and 4.2 are mixed in pgpool.conf.
>I am not sure if it affects the behavior of watchdog.

I started preparing for an upgrade; I would hope this isn't a problem.

>
>> As for testing with "use_watchdog = off", I just tried that. I changed the parameter on all 3 of my pgpool hosts and then restarted the pgpool process on all 3 hosts. I noticed that no host picked up the virtual IP; I'm assuming that is because the watchdog is off, correct?
>> I manually assigned the virtual IP to one of the hosts.
>> Before I execute any commands:
>> postgres=# show pool_nodes;
>>  node_id |       hostname       | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change  
>> ---------+----------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
>>  0       | psql-01.mgmt.bbn.com | 5432 | up     | 0.333333  | primary | 282959     | true              | 0                 |                   |                        | 2022-05-13 12:46:54
>>  1       | psql-02.mgmt.bbn.com | 5432 | up     | 0.333333  | standby | 230916     | false             | 0                 | streaming         | potential              | 2022-05-13 12:46:54
>>  2       | psql-03.mgmt.bbn.com | 5432 | up     | 0.333333  | standby | 370021     | false             | 0                 | streaming         | sync                   | 2022-05-13 12:46:54
>> (3 rows)
>>
>> I then executed:
>> $ pcp_detach_node -h psql.mgmt.bbn.com -p 9897 -U pgpool -g -n 1
>> Password:
>> pcp_detach_node -- Command Successful
>>
>> This took a long time (60 seconds) to finish.
>>
>> The node does not appear to be offline:
>> postgres=# show pool_nodes;
>>  node_id |       hostname       | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change  
>> ---------+----------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
>>  0       | psql-01.mgmt.bbn.com | 5432 | up     | 0.333333  | primary | 311652     | true              | 0                 |                   |                        | 2022-05-13 12:46:54
>>  1       | psql-02.mgmt.bbn.com | 5432 | up     | 0.333333  | standby | 248369     | false             | 0                 | streaming         | potential              | 2022-05-13 12:46:54
>>  2       | psql-03.mgmt.bbn.com | 5432 | up     | 0.333333  | standby | 410891     | false             | 0                 | streaming         | sync                   | 2022-05-13 12:46:54
>> (3 rows)
>>
>> The system log for pgpool during this time is attached.
>>
>>  What am I doing wrong here?
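
To make the before/after comparison less error-prone than reading the tables by eye, the status column can be pulled out of the "show pool_nodes" output with a small filter. This is a minimal sketch, assuming the default psql table layout shown above; the function name node_status is hypothetical.

```shell
# node_status NODE_ID
# Reads "show pool_nodes" output (default psql table format) on stdin and
# prints the status column for the given node_id, e.g. "up" or "down".
node_status() {
    awk -F'|' -v id="$1" '
        {
            # Strip padding from the node_id (1st) and status (4th) columns.
            gsub(/[ \t]/, "", $1)
            gsub(/[ \t]/, "", $4)
        }
        $1 == id { print $4 }
    '
}

# Usage (connection settings are placeholders, not taken from this thread):
#   psql -h "$PGPOOL_HOST" -p "$PGPOOL_PORT" -U postgres -c 'show pool_nodes' | node_status 1
```

After a successful pcp_detach_node -n 1, the expectation is that this prints "down" for node 1; in the second table above it would still print "up".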


