[pgpool-general: 8171] Re: Problems taking node offline

Jon SCHEWE jon.schewe at raytheon.com
Tue May 24 01:18:52 JST 2022


Any ideas on this?


Jon Schewe

Principal Software Systems Technologist

C: +1 612.263.2718

O: +1 952.545.5720

jon.schewe at raytheon.com

Raytheon BBN

Raytheon Intelligence & Space

5775 Wayzata Blvd. Suite 630

St. Louis Park, MN 55416


RTX.com<https://www.rtx.com/> | LinkedIn<https://www.linkedin.com/company/raytheontechnologies> | Twitter<https://twitter.com/raytheontech> | Instagram<https://www.instagram.com/raytheontechnologies>

________________________________
From: pgpool-general <pgpool-general-bounces at pgpool.net> on behalf of Jon SCHEWE <jon.schewe at raytheon.com>
Sent: Friday, May 13, 2022 12:30
To: Bo Peng <pengbo at sraoss.co.jp>
Cc: pgpool-general at pgpool.net <pgpool-general at pgpool.net>
Subject: [External] [pgpool-general: 8159] Re: Problems taking node offline

I was able to do some experimentation today.

To take a frontend offline I believe I just need to stop the pgpool process on that system and then let pgpool figure out the new primary and grab the virtual IP address. Correct?

To take a backend offline I believe I use pcp_detach_node. Correct?

As for testing with "use_watchdog = off", I just tried that. I changed the parameter on all 3 of my pgpool hosts and then restarted the pgpool process on each of them. No host picked up the virtual IP; I'm assuming that is because the watchdog is off, correct?
I manually assigned the virtual IP to one of the hosts.
Before I execute any commands:
postgres=# show pool_nodes;
 node_id |       hostname       | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+----------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0       | psql-01.mgmt.bbn.com | 5432 | up     | 0.333333  | primary | 282959     | true              | 0                 |                   |                        | 2022-05-13 12:46:54
 1       | psql-02.mgmt.bbn.com | 5432 | up     | 0.333333  | standby | 230916     | false             | 0                 | streaming         | potential              | 2022-05-13 12:46:54
 2       | psql-03.mgmt.bbn.com | 5432 | up     | 0.333333  | standby | 370021     | false             | 0                 | streaming         | sync                   | 2022-05-13 12:46:54
(3 rows)

I then executed:
$ pcp_detach_node -h psql.mgmt.bbn.com -p 9897 -U pgpool -g -n 1
Password:
pcp_detach_node -- Command Successful

This took a long time (60 seconds) to finish.

The node does not appear to be offline:
postgres=# show pool_nodes;
 node_id |       hostname       | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+----------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0       | psql-01.mgmt.bbn.com | 5432 | up     | 0.333333  | primary | 311652     | true              | 0                 |                   |                        | 2022-05-13 12:46:54
 1       | psql-02.mgmt.bbn.com | 5432 | up     | 0.333333  | standby | 248369     | false             | 0                 | streaming         | potential              | 2022-05-13 12:46:54
 2       | psql-03.mgmt.bbn.com | 5432 | up     | 0.333333  | standby | 410891     | false             | 0                 | streaming         | sync                   | 2022-05-13 12:46:54
(3 rows)
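
For anyone repeating this check from a script rather than by eye, here is a minimal sketch that pulls the status column for one node out of "show pool_nodes" output. The function name node_status is hypothetical, and the sketch assumes the pipe-separated column layout shown above (node_id first, status fourth). As far as I understand, a successfully detached node would be expected to report "down" rather than "up".

```shell
# Print the status column for one node_id, reading "show pool_nodes"
# output on stdin. node_status is a hypothetical helper name; the column
# positions (node_id = 1st, status = 4th) are assumed from the output above.
node_status() {
    awk -F'|' -v id="$1" '
        NF > 3 {
            gsub(/[ \t]/, "", $1)   # strip padding from the node_id column
            gsub(/[ \t]/, "", $4)   # strip padding from the status column
            if ($1 == id) print $4
        }'
}

# Example usage (needs a reachable pgpool, so it is left commented out):
# psql -h psql.mgmt.bbn.com -p 9898 -U postgres -c 'show pool_nodes' | node_status 1
```

The header, separator, and "(3 rows)" lines are skipped because they either contain no "|" or do not match the requested node_id.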

The system log for pgpool during this time is attached.

What am I doing wrong here?

From: pgpool-general <pgpool-general-bounces at pgpool.net> on behalf of Jon SCHEWE <jon.schewe at raytheon.com>
Sent: Thursday, May 12, 2022 16:14
To: Bo Peng <pengbo at sraoss.co.jp>
Cc: pgpool-general at pgpool.net <pgpool-general at pgpool.net>
Subject: [External] [pgpool-general: 8155] Re: Problems taking node offline

I have not been able to test with the watchdog off yet, but I am wondering about the proper commands for switching backends and frontends.

I see pcp_detach_node removes a pgpool frontend.
What command can I use to tell pgpool to switch to a different primary backend? Do I just stop the postgresql process?



From: pgpool-general <pgpool-general-bounces at pgpool.net> on behalf of Jon SCHEWE <jon.schewe at raytheon.com>
Sent: Wednesday, April 27, 2022 13:07
To: Bo Peng <pengbo at sraoss.co.jp>
Cc: pgpool-general at pgpool.net <pgpool-general at pgpool.net>
Subject: [External] [pgpool-general: 8108] Re: Problems taking node offline

> On Tue, 26 Apr 2022 15:01:15 +0000
> Jon SCHEWE <jon.schewe at raytheon.com> wrote:
>
> > >> I want to take a backend node offline and am having some trouble with it.
> > >>
> > >> I check the status of my nodes:
> > >> template1=> show pool_nodes;
> > >>  node_id |       hostname       | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
> > >> ---------+----------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
> > >>  0       | psql-01.mgmt.bbn.com | 5432 | up     | 0.333333  | standby | 646198     | false             | 0                 | streaming         | sync                   | 2022-04-25 14:19:57
> > >>  1       | psql-02.mgmt.bbn.com | 5432 | up     | 0.333333  | primary | 2115353    | true              | 0                 |                   |                        | 2022-04-25 14:16:24
> > >>  2       | psql-03.mgmt.bbn.com | 5432 | up     | 0.333333  | standby | 2913       | false             | 0                 | streaming         | potential              | 2022-04-25 14:24:25
> > >> (3 rows)
> > >>
> > >> I want to take psql-02 offline.
> > >>
> > >> pcp_detach_node -h psql.mgmt.bbn.com -p 9897 -U pgpool -g -n 1
> > >> Password:
> > >> pcp_detach_node -- Command Successful
> > >>
> > >>
> > >> I check the status again:
> > >> template1=> show pool_nodes;
> > >>  node_id |       hostname       | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
> > >> ---------+----------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
> > >>  0       | psql-01.mgmt.bbn.com | 5432 | up     | 0.333333  | standby | 718555     | true              | 0                 | streaming         | sync                   | 2022-04-25 14:19:57
> > >>  1       | psql-02.mgmt.bbn.com | 5432 | up     | 0.333333  | primary | 2373454    | false             | 0                 |                   |                        | 2022-04-25 14:16:24
> > >>  2       | psql-03.mgmt.bbn.com | 5432 | up     | 0.333333  | standby | 3310       | false             | 0                 | streaming         | potential              | 2022-04-25 14:24:25
> > >> (3 rows)
> > >>
> > >>
> > >> I still see psql-02 online. Why is that?
> > >
> > >Could you share pgpool.conf
> >
> > Yes, attached.
> >
> > > and full log after running pcp_detach_node?
> >
> > The only log messages are what I sent originally.
> >
> > >Which version of Pgpool-II are you using?
> >
> > 4.1.4
>
> Thank you.
>
> I think watchdog may not be working properly.
> If you run pcp_detach_node, failover_command and follow_master_command should be executed.
> But I could not see the related logs.
>
> Could you check the watchdog status using "pcp_watchdog_info" command?


[jschewe-adm at psql-01 ~]$ pcp_watchdog_info -h psql.mgmt.bbn.com -p 9897 -U pgpool
Password:
3 YES psql-02.mgmt.bbn.com:9898 Linux psql-02 psql-02.mgmt.bbn.com

psql-02.mgmt.bbn.com:9898 Linux psql-02 psql-02.mgmt.bbn.com 9898 9000 4 MASTER
psql-01.mgmt.bbn.com:9898 Linux psql-01 psql-01.mgmt.bbn.com 9898 9000 7 STANDBY
Not_Set psql-03.mgmt.bbn.com 9898 9000 0 DEAD
[jschewe-adm at psql-01 ~]$ psql -h psql.mgmt.bbn.com -p 9898 -U postgres
Password for user postgres:
psql (13.6)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
Type "help" for help.

postgres=# show pool_nodes;
 node_id |       hostname       | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+----------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0       | psql-01.mgmt.bbn.com | 5432 | up     | 0.333333  | primary | 30419799   | false             | 0                 |                   |                        | 2022-04-26 00:19:21
 1       | psql-02.mgmt.bbn.com | 5432 | up     | 0.333333  | standby | 20228026   | false             | 0                 | streaming         | potential              | 2022-04-26 10:57:15
 2       | psql-03.mgmt.bbn.com | 5432 | up     | 0.333333  | standby | 2974278    | true              | 0                 | streaming         | sync                   | 2022-04-26 11:04:41
(3 rows)

postgres=#
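
Note that the pcp_watchdog_info output above lists psql-03 as DEAD. To make that easier to spot, here is a rough sketch that reduces the non-verbose output to "hostname state" pairs. The function name wd_states is hypothetical, and the field layout (hostname, pgpool port, watchdog port, numeric status code, status name as the last five fields of each node line) is an assumption based only on the output shown here, not on a documented format guarantee.

```shell
# Summarize non-verbose pcp_watchdog_info output (read from stdin) as
# "hostname state" pairs. wd_states is a hypothetical helper name.
# Assumed layout: the last five fields of each node line are
#   hostname  pgpool_port  watchdog_port  status_code  status_name
# The leading summary line is skipped because its second-to-last field
# is not a numeric status code.
wd_states() {
    awk 'NF >= 6 && $(NF-1) ~ /^[0-9]+$/ { print $(NF-4), $NF }'
}

# Example usage (prompts for the PCP password, so it is left commented out):
# pcp_watchdog_info -h psql.mgmt.bbn.com -p 9897 -U pgpool | wd_states
```

Counting fields from the end of the line is deliberate: the node name itself contains spaces in this output, so counting from the front would not be reliable.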


> Does this issue occur if you disable watchdog "use_watchdog = off"?

I will give that a try when I have some downtime.