[pgpool-general: 8171] Re: Problems taking node offline
Jon SCHEWE
jon.schewe at raytheon.com
Tue May 24 01:18:52 JST 2022
Any ideas on this?
Jon Schewe
Principal Software Systems Technologist
C: +1 612.263.2718
O: +1 952.545.5720
jon.schewe at raytheon.com
Raytheon BBN
Raytheon Intelligence & Space
5775 Wayzata Blvd. Suite 630
St. Louis Park, MN 55416
RTX.com<https://www.rtx.com/> | LinkedIn<https://www.linkedin.com/company/raytheontechnologies> | Twitter<https://twitter.com/raytheontech> | Instagram<https://www.instagram.com/raytheontechnologies>
________________________________
From: pgpool-general <pgpool-general-bounces at pgpool.net> on behalf of Jon SCHEWE <jon.schewe at raytheon.com>
Sent: Friday, May 13, 2022 12:30
To: Bo Peng <pengbo at sraoss.co.jp>
Cc: pgpool-general at pgpool.net <pgpool-general at pgpool.net>
Subject: [External] [pgpool-general: 8159] Re: Problems taking node offline
I was able to do some experimentation this today.
To take a frontend offline I believe I just need to stop the pgpool process on that system and then let pgpool figure out the new primary and grab the virtual IP address. Correct?
To take a backend offline I believe I use pcp_detach_node. Correct?
As far as testing with "use_watchdog = off", I just tried that. I changed the parameter on all 3 of my pgpool hosts. I then restarted the pgpool process on all 3 hosts. I noticed that no host picked up the virtual IP, I'm assuming that is because the watchdog is off, correct?
I manually assigned the virtual IP to one of the hosts.
Before I execute any commands:
postgres=# show pool_nodes;
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+----------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
0 | psql-01.mgmt.bbn.com | 5432 | up | 0.333333 | primary | 282959 | true | 0 | | | 2022-05-13 12:46:54
1 | psql-02.mgmt.bbn.com | 5432 | up | 0.333333 | standby | 230916 | false | 0 | streaming | potential | 2022-05-13 12:46:54
2 | psql-03.mgmt.bbn.com | 5432 | up | 0.333333 | standby | 370021 | false | 0 | streaming | sync | 2022-05-13 12:46:54
(3 rows)
I then executed:
$ pcp_detach_node -h psql.mgmt.bbn.com -p 9897 -U pgpool -g -n 1
Password:
pcp_detach_node -- Command Successful
This took a long time (60 seconds) to finish.
The node does not appear to be offline:
postgres=# show pool_nodes;
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+----------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
0 | psql-01.mgmt.bbn.com | 5432 | up | 0.333333 | primary | 311652 | true | 0 | | | 2022-05-13 12:46:54
1 | psql-02.mgmt.bbn.com | 5432 | up | 0.333333 | standby | 248369 | false | 0 | streaming | potential | 2022-05-13 12:46:54
2 | psql-03.mgmt.bbn.com | 5432 | up | 0.333333 | standby | 410891 | false | 0 | streaming | sync | 2022-05-13 12:46:54
(3 rows)
The system log for pgpool during this time is attached.
What am I doing wrong here?
Jon Schewe
Principal Software Systems Technologist
C: +1 612.263.2718
O: +1 952.545.5720
jon.schewe at raytheon.com
Raytheon BBN
Raytheon Intelligence & Space
5775 Wayzata Blvd. Suite 630
St. Louis Park, MN 55416
From: pgpool-general <pgpool-general-bounces at pgpool.net> on behalf of Jon SCHEWE <jon.schewe at raytheon.com>
Sent: Thursday, May 12, 2022 16:14
To: Bo Peng <pengbo at sraoss.co.jp>
Cc: pgpool-general at pgpool.net <pgpool-general at pgpool.net>
Subject: [External] [pgpool-general: 8155] Re: Problems taking node offline
I have not been able to test with the watchdog off, however I am wondering about the proper commands to switch backends and frontends.
I see pcp_detatch_node removes a pgpool frontend.
What command can I use to tell pgpool to switch to a different primary backend? Do I just stop the postgresql process?
Jon Schewe
Principal Software Systems Technologist
C: +1 612.263.2718
O: +1 952.545.5720
jon.schewe at raytheon.com
Raytheon BBN
Raytheon Intelligence & Space
5775 Wayzata Blvd. Suite 630
St. Louis Park, MN 55416
RTX.com | LinkedIn | Twitter | Instagram
From: pgpool-general <pgpool-general-bounces at pgpool.net> on behalf of Jon SCHEWE <jon.schewe at raytheon.com>
Sent: Wednesday, April 27, 2022 13:07
To: Bo Peng <pengbo at sraoss.co.jp>
Cc: pgpool-general at pgpool.net <pgpool-general at pgpool.net>
Subject: [External] [pgpool-general: 8108] Re: Problems taking node offline
> On Tue, 26 Apr 2022 15:01:15 +0000
> Jon SCHEWE <jon.schewe at raytheon.com> wrote:
>
> > >> I want to take a backend node offline and having some trouble with it.
> > >>
> > >> I check the status of my notes:
> > >> template1=> show pool_nodes;
> > >> node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
> > >> ---------+----------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
> > >> 0 | psql-01.mgmt.bbn.com | 5432 | up | 0.333333 | standby | 646198 | false | 0 | streaming | sync | 2022-04-25 14:19:57
> > >> 1 | psql-02.mgmt.bbn.com | 5432 | up | 0.333333 | primary | 2115353 | true | 0 | | | 2022-04-25 14:16:24
> > >> 2 | psql-03.mgmt.bbn.com | 5432 | up | 0.333333 | standby | 2913 | false | 0 | streaming | potential | 2022-04-25 14:24:25
> > >> (3 rows)
> > >>
> > >> I want to take psql-02 offline.
> > >>
> > >> pcp_detach_node -h psql.mgmt.bbn.com -p 9897 -U pgpool -g -n 1
> > >> Password:
> > >> pcp_detach_node -- Command Successful
> > >>
> > >>
> > >> I check the status again:
> > >> template1=> show pool_nodes;
> > >> node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
> > >> ---------+----------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
> > >> 0 | psql-01.mgmt.bbn.com | 5432 | up | 0.333333 | standby | 718555 | true | 0 | streaming | sync | 2022-04-25 14:19:57
> > >> 1 | psql-02.mgmt.bbn.com | 5432 | up | 0.333333 | primary | 2373454 | false | 0 | | | 2022-04-25 14:16:24
> > >> 2 | psql-03.mgmt.bbn.com | 5432 | up | 0.333333 | standby | 3310 | false | 0 | streaming | potential | 2022-04-25 14:24:25
> > >> (3 rows)
> > >>
> > >>
> > >> I still see psql-02 online. Why is that?
> > >
> > >Could you share pgpool.conf
> >
> > Yes, attached.
> >
> > > and full log after running pcp_detach_node?
> >
> > The only log messages are what I sent originally.
> >
> > >Which version of Pgpool-II are you using?
> >
> > 4.1.4
>
> Thank you.
>
> I think watchdog may not be working properly.
> If you run pcp_detach_node, failover_command and follow_master_command should be executed.
> But I could not see the related logs.
>
> Could you check the watchdog status using "pcp_watchdog_info" command?
[jschewe-adm at psql-01 ~]$ pcp_watchdog_info -h psql.mgmt.bbn.com -p 9897 -U pgpool
Password:
3 YES psql-02.mgmt.bbn.com:9898 Linux psql-02 psql-02.mgmt.bbn.com
psql-02.mgmt.bbn.com:9898 Linux psql-02 psql-02.mgmt.bbn.com 9898 9000 4 MASTER
psql-01.mgmt.bbn.com:9898 Linux psql-01 psql-01.mgmt.bbn.com 9898 9000 7 STANDBY
Not_Set psql-03.mgmt.bbn.com 9898 9000 0 DEAD
[jschewe-adm at psql-01 ~]$ psql -h psql.mgmt.bbn.com -p 9898 -U postgres
Password for user postgres:
psql (13.6)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
Type "help" for help.
postgres=# show pool_nodes;
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+----------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
0 | psql-01.mgmt.bbn.com | 5432 | up | 0.333333 | primary | 30419799 | false | 0 | | | 2022-04-26 00:19:21
1 | psql-02.mgmt.bbn.com | 5432 | up | 0.333333 | standby | 20228026 | false | 0 | streaming | potential | 2022-04-26 10:57:15
2 | psql-03.mgmt.bbn.com | 5432 | up | 0.333333 | standby | 2974278 | true | 0 | streaming | sync | 2022-04-26 11:04:41
(3 rows)
postgres=#
> Does this issue occur if you disable watchdog "use_watchdog = off"?
I will give that a try when I have some downtime.
_______________________________________________
pgpool-general mailing list
pgpool-general at pgpool.net
http://www.pgpool.net/mailman/listinfo/pgpool-general
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.pgpool.net/pipermail/pgpool-general/attachments/20220523/7abe40da/attachment.htm>
More information about the pgpool-general
mailing list