[pgpool-hackers: 3921] Re: [pgpool-general: 7588] Re: Promote specific backend to primary in streaming replication

Thu Jun 10 21:03:12 JST 2021

Hi,
Isn't this scenario called a switchover, i.e. a planned switch from
primary to standby (or vice versa)? IMO that is the industry-accepted
term for this scenario.

I would suggest naming the new flag '-s' '--switchover'. If we don't want
to call it switchover, we could call it '-f' '--force' or
'-f' '--force-promote' instead of 'really-promote'.

Regards
Umar Hayat

On Thu, Jun 10, 2021 at 4:28 PM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:

> Sorry for the duplicate post. I feel I need to discuss this with the
> developers on pgpool-hackers.
>
> Previously I posted the PoC patch. Now I have decided to create a
> committable patch.
> Differences from the PoC patch are:
>
> - The user interface is now pcp_promote_node, rather than a hacked
>   version of pcp_detach_node.
>
> - I added a new parameter "-r, --really-promote" to the
>   pcp_promote_node command; a separate flag is needed to keep backward
>   compatibility. If the option is specified, pcp_promote_node will
>   detach the current primary and kick off the failover command, with the
>   main node id set to the node id specified to pcp_promote_node. A
>   follow primary command will run if it is set.
>
> - To pass the new argument to the pcp process, I tweaked a number of
>   internal functions used in the pcp system.
>
> - Doc patch added.
>
> Here is the sample session of the modified pcp_promote_node.
>
> $ pcp_promote_node --help
> pcp_promote_node - promote a node as new main from pgpool-II
> Usage:
> pcp_promote_node [OPTION...] [node-id]
> Options:
>   -U, --username=NAME    username for PCP authentication
>   -h, --host=HOSTNAME    pgpool-II host
>   -p, --port=PORT        PCP port number
>   -w, --no-password      never prompt for password
>   -W, --password         force password prompt (should happen automatically)
>   -n, --node-id=NODEID   ID of a backend node
>   -g, --gracefully       promote gracefully(optional)
>   -r, --really-promote   really promote(optional) <-- added
>   -d, --debug            enable debug message (optional)
>   -v, --verbose          output verbose messages
>   -?, --help             print this help
>
> # start with 4-node cluster. Primary is node 0.
>
> $ psql -p 11000 -c "show pool_nodes" test
>  node_id | hostname | port  | status | pg_status | lb_weight |  role   | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
> ---------+----------+-------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
>  0       | /tmp     | 11002 | up     | up        | 0.250000  | primary | primary | 0          | false             | 0                 |                   |                        | 2021-06-10 20:17:41
>  1       | /tmp     | 11003 | up     | up        | 0.250000  | standby | standby | 0          | false             | 0                 | streaming         | async                  | 2021-06-10 20:17:41
>  2       | /tmp     | 11004 | up     | up        | 0.250000  | standby | standby | 0          | true              | 0                 | streaming         | async                  | 2021-06-10 20:17:41
>  3       | /tmp     | 11005 | up     | up        | 0.250000  | standby | standby | 0          | false             | 0                 | streaming         | async                  | 2021-06-10 20:17:41
> (4 rows)
>
> # promote node 2.
> $ pcp_promote_node -p 11001 -w -r 2
> pcp_promote_node -- Command Successful
>
> [a little time has passed]
>
> # see if node 2 becomes the new primary. Note that the other standbys are in
> # sync with the new primary because a follow primary command is set.
>
> $ psql -p 11000 -c "show pool_nodes" test
>  node_id | hostname | port  | status | pg_status | lb_weight |  role   | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
> ---------+----------+-------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
>  0       | /tmp     | 11002 | up     | up        | 0.250000  | standby | standby | 0          | false             | 0                 | streaming         | async                  | 2021-06-10 20:20:31
>  1       | /tmp     | 11003 | up     | up        | 0.250000  | standby | standby | 0          | false             | 0                 | streaming         | async                  | 2021-06-10 20:20:31
>  2       | /tmp     | 11004 | up     | up        | 0.250000  | primary | primary | 0          | true              | 0                 |                   |                        | 2021-06-10 20:18:20
>  3       | /tmp     | 11005 | up     | up        | 0.250000  | standby | standby | 0          | false             | 0                 | streaming         | async                  | 2021-06-10 20:20:31
> (4 rows)
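
The "[a little time has passed]" step in the session above could be automated by checking the "show pool_nodes" output for the expected primary. The following is only a minimal sketch, not part of the patch: it assumes the psql output has been captured as text with one table row per line, and the function names and the abbreviated SAMPLE table are hypothetical.

```python
def parse_pool_nodes(text):
    """Parse aligned psql output of "show pool_nodes" into a list of dicts.

    Assumption: each table row sits on a single line (no mail-client wrapping).
    """
    # Only the header and data rows contain "|"; the "---+---" separator
    # and the "(N rows)" footer do not, so they are skipped here.
    lines = [ln for ln in text.splitlines() if "|" in ln]
    if not lines:
        return []
    header = [col.strip() for col in lines[0].split("|")]
    rows = []
    for ln in lines[1:]:
        cells = [cell.strip() for cell in ln.split("|")]
        if len(cells) == len(header):
            rows.append(dict(zip(header, cells)))
    return rows


def primary_node_id(rows):
    """Return the node id whose role is "primary", or None if not exactly one."""
    primaries = [int(r["node_id"]) for r in rows if r["role"] == "primary"]
    return primaries[0] if len(primaries) == 1 else None


# Abbreviated, hypothetical sample with only three of the real columns.
SAMPLE = """\
 node_id | status |  role
---------+--------+---------
 0       | up     | standby
 1       | up     | standby
 2       | up     | primary
 3       | up     | standby
(4 rows)
"""

if __name__ == "__main__":
    print(primary_node_id(parse_pool_nodes(SAMPLE)))
```

A caller could run the psql command in a loop and stop once primary_node_id() returns the node id that was passed to pcp_promote_node.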
>
> Patch attached.
>
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp
>
> > Hi Nathan,
> >
> > I have revisited this and am leaning toward the idea of modifying an
> > existing pcp command so that we can specify which node is to be promoted.
> >
> > My idea is to use the node-detach protocol of the pcp command. Currently
> > the protocol specifies the node to be detached. I add a new request
> > detail flag bit "REQ_DETAIL_PROMOTE". If the flag is set, the protocol
> > treats the requested node id as the node to be promoted, rather than
> > the node to be detached. In this case the node to be detached will be
> > the current primary node. The node to be promoted is passed to the
> > failover script as the existing argument "new main node" (normally the
> > live node with the smallest node id, not counting the current primary
> > node). Existing failover scripts usually treat the "new main node" as
> > the node to be promoted.
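
The node selection described above can be modelled in a few lines. This is a simplified sketch for illustration only, not the pgpool-II implementation: the bit value given to REQ_DETAIL_PROMOTE, the function name plan_failover, and the dict-based node table are all assumptions.

```python
REQ_DETAIL_PROMOTE = 0x10  # hypothetical bit value, chosen for illustration


def plan_failover(nodes, primary_id, requested_id, flags):
    """Decide which node is detached and which becomes "new main node".

    nodes: dict mapping node id -> True if the node is up (hypothetical model).
    Returns a dict with the keys "detach" and "new_main".
    """
    if flags & REQ_DETAIL_PROMOTE:
        # Flag set: the requested node is promoted and the current
        # primary is the node that gets detached.
        if requested_id == primary_id or not nodes.get(requested_id):
            raise ValueError("requested node must be a live standby")
        return {"detach": primary_id, "new_main": requested_id}

    # Flag not set: the requested node is detached, and the new main node
    # is the live node with the smallest id, excluding the detached one.
    live = sorted(n for n, up in nodes.items() if up and n != requested_id)
    return {"detach": requested_id, "new_main": live[0] if live else None}


if __name__ == "__main__":
    cluster = {0: True, 1: True, 2: True, 3: True}
    print(plan_failover(cluster, 0, 2, REQ_DETAIL_PROMOTE))
    print(plan_failover(cluster, 0, 0, 0))
```

In both branches the failover script receives the same kind of argument, which is why an existing script that promotes the "new main node" would not need to change.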
> >
> > Attached is the PoC patch for this. After applying the patch, the -g
> > argument (detach gracefully) of pcp_detach_node specifies the
> > "REQ_DETAIL_PROMOTE" flag and the node id argument now represents the
> > node id to be promoted.  For example:
> >
> > # create 4-node cluster.
> > $ pgpool_setup -n 4
> >
> > # primary node is 0.
> > $ psql -p 11000 -c "show pool_nodes" test
> >  node_id | hostname | port  | status | pg_status | lb_weight |  role   | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
> > ---------+----------+-------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
> >  0       | /tmp     | 11002 | up     | up        | 0.250000  | primary | primary | 0          | false             | 0                 |                   |                        | 2021-06-07 11:02:25
> >  1       | /tmp     | 11003 | up     | up        | 0.250000  | standby | standby | 0          | false             | 0                 | streaming         | async                  | 2021-06-07 11:02:25
> >  2       | /tmp     | 11004 | up     | up        | 0.250000  | standby | standby | 0          | true              | 0                 | streaming         | async                  | 2021-06-07 11:02:25
> >  3       | /tmp     | 11005 | up     | up        | 0.250000  | standby | standby | 0          | false             | 0                 | streaming         | async                  | 2021-06-07 11:02:25
> > (4 rows)
> >
> > # promote node 2.
> > $ pcp_detach_node -g -p 11001 -w 2
> > $ psql -p 11000 -c "show pool_nodes" test
> >  node_id | hostname | port  | status | pg_status | lb_weight |  role   | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
> > ---------+----------+-------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
> >  0       | /tmp     | 11002 | up     | up        | 0.250000  | standby | standby | 0          | false             | 0                 | streaming         | async                  | 2021-06-07 11:03:17
> >  1       | /tmp     | 11003 | up     | up        | 0.250000  | standby | standby | 0          | false             | 0                 | streaming         | async                  | 2021-06-07 11:03:17
> >  2       | /tmp     | 11004 | up     | up        | 0.250000  | primary | primary | 0          | true              | 0                 |                   |                        | 2021-06-07 11:02:43
> >  3       | /tmp     | 11005 | up     | up        | 0.250000  | standby | standby | 0          | false             | 0                 | streaming         | async                  | 2021-06-07 11:03:27
> > (4 rows)
> >
> > (Note that node 3 may go into down status after the execution of
> > pcp_detach_node, but that is caused by another issue. See "[pgpool-hackers:
> > 3915] ERROR: failed to process PCP request at the moment" for more
> > details.)
> >
> > One of the merits of this method is that existing failover scripts need
> > not be changed.  (Of course we could modify the existing
> > pcp_promote_node so that it does the job.)
> >
> > However, I see difficulties with modifying existing pcp commands
> > (either pcp_detach_node or pcp_promote_node). The protocol used by
> > pcp commands is very limited in that the only argument that can be
> > passed is the node id. So how does pcp_detach_node implement the -g
> > flag? It uses two kinds of pcp protocol: one for requests without -g
> > and the other for requests with -g. (I think we should eventually
> > overhaul the existing pcp protocol to overcome these limitations, but
> > that is another story.)
> >
> > Probably we need to invent new pcp protocol and use it for this
> > purpose.
> >
> > Comments and suggestions are welcome.
> > --
> > Tatsuo Ishii
> > SRA OSS, Inc. Japan
> > English: http://www.sraoss.co.jp/index_en.php
> > Japanese:http://www.sraoss.co.jp
> >
> >> Hello!
> >>
> >>> On 7/04/2021, at 12:57 AM, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> >>>
> >>> Hi,
> >>>
> >>>> Thanks for your time on this so far!
> >>>>
> >>>> Would it be possible to allow these to be changed at runtime without
> >>>> a config change + reload? Perhaps a more generic pcp command, to allow
> >>>> any “reloadable” configuration item to be changed? I’m not sure how many
> >>>> of these there are, and if any of them require deeper changes. Maybe we
> >>>> need a 3rd “class” of configuration item, to allow it to be changed
> >>>> without changing config.
> >>>
> >>> Not sure. PostgreSQL does not have such a 3rd "class".
> >>
> >> That’s true, though you can modify a special configuration file without
> >> editing files directly with ALTER SYSTEM and then reload the configuration
> >> - i.e. you can make changes to the running server without editing
> >> configuration files directly.
> >>
> >> Pgpool could somewhat mirror postgres functionality here, if there was
> >> an include directive, then a reload is required. If we enabled
> >> configuration to be added to such a file with pcp, that would be useful,
> >> so it could be done from a remote system perhaps?
> >>
> >> Just some thoughts.. these seem like bigger changes, and I’m new here :-)
> >>
> >>>> We (and I am sure many others) use config management tools to keep
> >>>> our config in sync. In particular, we use Puppet. I have a module which
> >>>> I’m hoping to be able to publish soon, in fact.
> >>>>
> >>>> Usually, environments which use config management like this have
> >>>> policies to not modify configuration which is managed.
> >>>>
> >>>> Switching to a different backend would mean we have to manually
> >>>> update that managed configuration (on the pgpool primary, presumably) and
> >>>> reload, and then run pcp_detach_node against the primary backend. In the
> >>>> time between updating the configuration, and detaching the node, our
> >>>> automation might run and undo our changes which would mean we may get an
> >>>> outcome we don’t expect.
> >>>> An alternative would be to update the configuration in our management
> >>>> system and deploy it - but in many environments making these changes can
> >>>> be quite involved. I don’t think it’s good to require operational changes
> >>>> (i.e. “move the primary function to this backend”) to require
> >>>> configuration updates.
> >>>>
> >>>> A work-around to this would be to have an include directive in the
> >>>> configuration and include a local, non-managed file for these sorts of
> >>>> temporary changes, but pgpool doesn’t have this at the moment I believe?
> >>>
> >>> No, pgpool does not have. However I think it would be a good idea to
> >>> have "include" feature like PostgreSQL.
> >>
> >> I would certainly use that. My puppet module writes the configuration
> >> data twice:
> >> 1) In to the main pgpool.conf file.
> >> 2) In to one of two files, either a “reloadable” config, or a “restart
> >>    required” config - so that I can use puppet notifications for changes
> >>    to those files to trigger either a reload or a restart.
> >> It then does some checks to make sure the pgpool.conf is older than the
> >> reloadable/restart required config, etc. etc. - it’s quite messy!
> >>
> >> An include directive would make that all a lot easier.. and of course
> >> would enable this use case.
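
To make the "include" idea discussed above concrete, here is a rough sketch of how a loader could merge a managed main file with a local, non-managed override file. It is purely illustrative: pgpool is stated above not to have this feature, the include 'file' syntax is borrowed from PostgreSQL, and the function name load_config and the simplistic "key = value" parsing are assumptions.

```python
import os
import re

_INCLUDE = re.compile(r"^include\s+'([^']+)'\s*$")
_SETTING = re.compile(r"^(\w+)\s*=\s*(.*)$")


def load_config(path, _seen=None):
    """Read "key = value" settings, following include 'file' directives.

    Later settings override earlier ones, so a file included at the end of
    the main file can override the managed values. Simplification: "#" always
    starts a comment, even inside a quoted value.
    """
    seen = _seen if _seen is not None else set()
    real = os.path.realpath(path)
    if real in seen:
        raise ValueError("include loop detected at %s" % path)
    seen.add(real)

    settings = {}
    with open(real) as fh:
        for raw in fh:
            line = raw.split("#", 1)[0].strip()
            if not line:
                continue
            inc = _INCLUDE.match(line)
            if inc:
                # Relative include paths are resolved against the including file.
                target = os.path.join(os.path.dirname(real), inc.group(1))
                settings.update(load_config(target, seen))
                continue
            setting = _SETTING.match(line)
            if setting:
                settings[setting.group(1)] = setting.group(2).strip().strip("'")

    seen.discard(real)  # the same file may be included again outside a loop
    return settings
```

With a main file that ends in include 'local.conf', config management would keep owning the main file while an operator edits only local.conf and then reloads.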
> >>
> >> --
> >> Nathan Ward
> >>
> >>