[pgpool-hackers: 3938] Re: Problem with detach_false_primary/follow_primary_command

Muhammad Usama m.usama at gmail.com
Fri Jun 18 16:42:15 JST 2021


Hi Ishii-San,

Thanks for the confirmation. I will commit the patch after doing some
more testing.

Best regards
Muhammad Usama

On Thu, Jun 17, 2021 at 6:24 PM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:

> Hi Usama,
>
> Thank you for updating the patch. The patch applied cleanly and all
> the regression tests including 018.detach_primary passed.
>
> > Hi Ishii-San
> >
> >
> >
> > On Wed, Jun 16, 2021 at 3:15 PM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> >
> >> Hi Usama,
> >>
> >> Unfortunately the patch did not apply cleanly on the current master
> >> branch:
> >>
> >
> > Sorry, apparently I hadn't created the patch from the current master head.
> > Attached is the rebased version.
> >
> >>
> >> $ git apply ~/wd_coordinating_follow_and_detach__primary.patch
> >> error: patch failed: src/include/pool.h:426
> >> error: src/include/pool.h: patch does not apply
> >>
> >> So I have not actually tested the patch, but it seems the idea of
> >> locking at the watchdog level is more robust than my idea (executing
> >> the false primary check only on the coordinator node).
> >>
> >
> > Thanks for the confirmation. I have confirmed that the regression tests
> > pass with the patch, but I think I need to do some more testing before I
> > can commit it.
> >
> > Best regards
> > Muhammad Usama
> >
> >
> >> Best regards,
> >> --
> >> Tatsuo Ishii
> >> SRA OSS, Inc. Japan
> >> English: http://www.sraoss.co.jp/index_en.php
> >> Japanese:http://www.sraoss.co.jp
> >>
> >> > Hi Ishii-San
> >> >
> >> > As discussed over Slack, I have cooked up a POC patch implementing the
> >> > follow_primary locking over the watchdog channel.
> >> >
> >> > The idea is that just before executing follow_primary during the
> >> > failover process, we direct all standby watchdog nodes to acquire the
> >> > same lock on their respective nodes, so that they stop false primary
> >> > detection while follow_primary is being executed on the watchdog
> >> > coordinator node.
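> >> >
> >> > Roughly, the coordinator-side flow is like the sketch below (simplified
> >> > C pseudocode; wd_broadcast_lock_request() and FOLLOW_PRIMARY_LOCK are
> >> > illustrative names, not the actual symbols in the patch):
> >> >
> >> >     /* On the watchdog coordinator, before follow_primary_command runs */
> >> >     static bool
> >> >     acquire_follow_primary_lock_cluster_wide(void)
> >> >     {
> >> >         /* take the lock on the local node first */
> >> >         if (!pool_acquire_follow_primary_lock(true))
> >> >             return false;
> >> >         /* then direct every standby watchdog node to take the same
> >> >          * lock, so their false primary detection stays disabled
> >> >          * until the coordinator releases it */
> >> >         wd_broadcast_lock_request(FOLLOW_PRIMARY_LOCK);
> >> >         return true;
> >> >     }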
> >> >
> >> > Moreover, to avoid keeping the watchdog process blocked while waiting
> >> > for the lock, I have introduced a pending remote lock mechanism, so that
> >> > remote locks can be acquired in the background after the in-flight
> >> > replication checks complete.
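> >> >
> >> > The pending-lock handling on a standby node is roughly as follows (a
> >> > simplified sketch; the helper names are illustrative, not the actual
> >> > patch code):
> >> >
> >> >     static bool follow_primary_lock_pending = false;
> >> >
> >> >     /* called when a lock request arrives from the coordinator */
> >> >     static void
> >> >     handle_remote_lock_request(void)
> >> >     {
> >> >         if (replication_check_is_in_flight())
> >> >             follow_primary_lock_pending = true;  /* acquire later */
> >> >         else
> >> >             pool_acquire_follow_primary_lock(true);
> >> >     }
> >> >
> >> >     /* called when the in-flight replication check completes */
> >> >     static void
> >> >     replication_check_done(void)
> >> >     {
> >> >         if (follow_primary_lock_pending)
> >> >         {
> >> >             follow_primary_lock_pending = false;
> >> >             pool_acquire_follow_primary_lock(true);
> >> >         }
> >> >     }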
> >> >
> >> > Finally, I have removed the REQ_DETAIL_CONFIRMED flag from the
> >> > degenerate_backend_set() request that gets issued to detach the false
> >> > primary. That means all quorum and consensus rules will need to be
> >> > satisfied for the detach to happen.
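> >> >
> >> > In other words, the change amounts to something like the following (an
> >> > illustrative fragment; the exact call site and the remaining flags may
> >> > differ in the actual code):
> >> >
> >> >     /* node_id is the backend id of the detected false primary */
> >> >     /* before: the detach request bypassed quorum/consensus checks */
> >> >     degenerate_backend_set(&node_id, 1,
> >> >                            REQ_DETAIL_SWITCHOVER | REQ_DETAIL_CONFIRMED);
> >> >
> >> >     /* after: without REQ_DETAIL_CONFIRMED, the request must satisfy
> >> >      * the quorum and consensus rules before the node is detached */
> >> >     degenerate_backend_set(&node_id, 1, REQ_DETAIL_SWITCHOVER);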
> >> >
> >> > I haven't done rigorous testing or run the regression suite with the
> >> > patch; I am sharing this initial version with you to get your consensus
> >> > on the basic idea and design.
> >> >
> >> > Could you kindly take a look and see whether you agree with the approach?
> >> >
> >> > Thanks
> >> > Best regards
> >> > Muhammad Usama
> >> >
> >> >
> >> > On Fri, May 7, 2021 at 9:47 AM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> >> >
> >> >> I am going to commit/push the patches to the branches from master down
> >> >> to 4.0 stable (detach_false_primary was introduced in 4.0) if there's
> >> >> no objection.
> >> >>
> >> >> Best regards,
> >> >> --
> >> >> Tatsuo Ishii
> >> >> SRA OSS, Inc. Japan
> >> >> English: http://www.sraoss.co.jp/index_en.php
> >> >> Japanese:http://www.sraoss.co.jp
> >> >>
> >> >> From: Tatsuo Ishii <ishii at sraoss.co.jp>
> >> >> Subject: [pgpool-hackers: 3893] Re: Problem with
> >> >> detach_false_primary/follow_primary_command
> >> >> Date: Tue, 04 May 2021 13:09:23 +0900 (JST)
> >> >> Message-ID: <20210504.130923.644768896074013686.t-ishii at gmail.com>
> >> >>
> >> >> > In the previous mail I explained the problem and proposed a patch
> >> >> > for the issue.
> >> >> >
> >> >> > However, the original reporter also said the problem occurs in a
> >> >> > more complex way if the watchdog is enabled.
> >> >> >
> >> >> >
> >> >> > https://www.pgpool.net/pipermail/pgpool-general/2021-April/007590.html
> >> >> >
> >> >> > In summary, it seems multiple pgpool nodes perform
> >> >> > detach_false_primary concurrently, and this is the cause of the
> >> >> > problem. I think there's no reason to perform detach_false_primary
> >> >> > on multiple pgpool nodes concurrently. Rather, we should perform
> >> >> > detach_false_primary only on the leader node. If this is correct, we
> >> >> > also should not perform detach_false_primary when the quorum is
> >> >> > absent, because there is no leader without a quorum. Attached is the
> >> >> > patch to introduce this check in addition to the v2 patch.
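> >> >> >
> >> >> > The guard amounts to something like this (a simplified sketch of the
> >> >> > idea; wd_quorum_exists() and wd_is_leader_node() are illustrative
> >> >> > names, not the actual functions):
> >> >> >
> >> >> >     /* in the streaming replication check process, before running
> >> >> >      * the false primary detection */
> >> >> >     if (pool_config->use_watchdog)
> >> >> >     {
> >> >> >         if (!wd_quorum_exists())      /* no quorum means no leader */
> >> >> >             return;                   /* skip detach_false_primary */
> >> >> >         if (!wd_is_leader_node())     /* only the leader runs it */
> >> >> >             return;
> >> >> >     }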
> >> >> >
> >> >> > I would like to hear opinions from other pgpool developers on
> >> >> > whether we should apply the v3 patch to the existing branches. I am
> >> >> > asking because currently we perform detach_false_primary even if the
> >> >> > quorum is absent, and the change may be a "change of user-visible
> >> >> > behavior", which we usually avoid on stable branches. However, since
> >> >> > the current detach_false_primary apparently does not work in
> >> >> > environments where the watchdog is enabled, I think patching the
> >> >> > back branches is an exceptionally reasonable choice.
> >> >> >
> >> >> > I have also added the regression test patch.
> >> >> >
> >> >> >> In the posting:
> >> >> >>
> >> >> >> [pgpool-general: 7525] Strange behavior on switchover with detach_false_primary enabled
> >> >> >>
> >> >> >> it is reported that detach_false_primary and follow_primary_command
> >> >> >> could conflict with each other, and pgpool goes into an unwanted
> >> >> >> state. We can reproduce the issue by using pgpool_setup to create a
> >> >> >> 3-node configuration.
> >> >> >>
> >> >> >> $ pgpool_setup -n 3
> >> >> >>
> >> >> >> echo "detach_false_primary" >> etc/pgpool.conf
> >> >> >> echo "sr_check_period = 1" >> etc/pgpool.conf
> >> >> >>
> >> >> >> The latter may not be mandatory, but running the streaming
> >> >> >> replication check frequently will reliably reproduce the problem,
> >> >> >> because detach_false_primary is executed in the streaming
> >> >> >> replication check process.
> >> >> >>
> >> >> >> The initial state is as follows:
> >> >> >>
> >> >> >> psql -p 11000 -c "show pool_nodes" test
> >> >> >>  node_id | hostname | port  | status | pg_status | lb_weight | role    | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
> >> >> >> ---------+----------+-------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
> >> >> >>  0       | /tmp     | 11002 | up     | up        | 0.333333  | primary | primary | 0          | true              | 0                 |                   |                        | 2021-05-04 11:12:01
> >> >> >>  1       | /tmp     | 11003 | up     | up        | 0.333333  | standby | standby | 0          | false             | 0                 | streaming         | async                  | 2021-05-04 11:12:01
> >> >> >>  2       | /tmp     | 11004 | up     | up        | 0.333333  | standby | standby | 0          | false             | 0                 | streaming         | async                  | 2021-05-04 11:12:01
> >> >> >> (3 rows)
> >> >> >>
> >> >> >> Execute pcp_detach_node against node 0.
> >> >> >>
> >> >> >> $ pcp_detach_node -w -p 11001 0
> >> >> >>
> >> >> >> This will put the primary into down status and promote node 1.
> >> >> >>
> >> >> >> 2021-05-04 12:12:14: pcp_child pid 31449: LOG:  received degenerate backend request for node_id: 0 from pid [31449]
> >> >> >> 2021-05-04 12:12:14: main pid 31221: LOG:  Pgpool-II parent process has received failover request
> >> >> >> 2021-05-04 12:12:14: main pid 31221: LOG:  starting degeneration. shutdown host /tmp(11002)
> >> >> >> 2021-05-04 12:12:14: pcp_main pid 31260: LOG:  PCP process with pid: 31449 exit with SUCCESS.
> >> >> >> 2021-05-04 12:12:14: pcp_main pid 31260: LOG:  PCP process with pid: 31449 exits with status 0
> >> >> >> 2021-05-04 12:12:14: main pid 31221: LOG:  Restart all children
> >> >> >> 2021-05-04 12:12:14: main pid 31221: LOG:  execute command: /home/t-ishii/work/Pgpool-II/current/x/etc/failover.sh 0 /tmp 11002 /home/t-ishii/work/Pgpool-II/current/x/data0 1 0 /tmp 0 11003 /home/t-ishii/work/Pgpool-II/current/x/data1
> >> >> >>
> >> >> >> However, detach_false_primary found that the just-promoted node 1
> >> >> >> is not good, because it does not have any follower standby node:
> >> >> >> follow_primary_command had not completed yet.
> >> >> >>
> >> >> >> 2021-05-04 12:12:14: sr_check_worker pid 31261: LOG:  verify_backend_node_status: primary 1 does not connect to standby 2
> >> >> >> 2021-05-04 12:12:14: sr_check_worker pid 31261: LOG:  verify_backend_node_status: primary 1 owns only 0 standbys out of 1
> >> >> >> 2021-05-04 12:12:14: sr_check_worker pid 31261: LOG:  pgpool_worker_child: invalid node found 1
> >> >> >>
> >> >> >> And detach_false_primary sent a failover request for node 1.
> >> >> >>
> >> >> >> 2021-05-04 12:12:14: sr_check_worker pid 31261: LOG:  received degenerate backend request for node_id: 1 from pid [31261]
> >> >> >>
> >> >> >> Moreover, every second detach_false_primary tries to detach node 1.
> >> >> >>
> >> >> >> 2021-05-04 12:12:15: sr_check_worker pid 31261: LOG:  verify_backend_node_status: primary 1 does not connect to standby 2
> >> >> >> 2021-05-04 12:12:15: sr_check_worker pid 31261: LOG:  verify_backend_node_status: primary 1 owns only 0 standbys out of 1
> >> >> >> 2021-05-04 12:12:15: sr_check_worker pid 31261: LOG:  pgpool_worker_child: invalid node found 1
> >> >> >> 2021-05-04 12:12:15: sr_check_worker pid 31261: LOG:  received degenerate backend request for node_id: 1 from pid [31261]
> >> >> >>
> >> >> >> This confuses the whole follow_primary_command process, and in the
> >> >> >> end we have:
> >> >> >>
> >> >> >> psql -p 11000 -c "show pool_nodes" test
> >> >> >>  node_id | hostname | port  | status | pg_status | lb_weight | role    | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
> >> >> >> ---------+----------+-------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
> >> >> >>  0       | /tmp     | 11002 | down   | down      | 0.333333  | standby | unknown | 0          | false             | 0                 |                   |                        | 2021-05-04 12:12:16
> >> >> >>  1       | /tmp     | 11003 | up     | up        | 0.333333  | standby | standby | 0          | false             | 0                 |                   |                        | 2021-05-04 12:22:28
> >> >> >>  2       | /tmp     | 11004 | up     | up        | 0.333333  | standby | standby | 0          | true              | 0                 |                   |                        | 2021-05-04 12:22:28
> >> >> >> (3 rows)
> >> >> >>
> >> >> >> Of course, this is a totally unwanted result.
> >> >> >>
> >> >> >> I think the root cause of the problem is that detach_false_primary
> >> >> >> and follow_primary_command are allowed to run concurrently. To
> >> >> >> solve the problem we need a lock, so that if detach_false_primary
> >> >> >> is already running, follow_primary_command waits for its
> >> >> >> completion, and vice versa.
> >> >> >>
> >> >> >> For this purpose I propose the attached patch
> >> >> >> detach_false_primary_v2.diff. The patch introduces two new
> >> >> >> functions, pool_acquire_follow_primary_lock(bool block) and
> >> >> >> pool_release_follow_primary_lock(void), which are responsible for
> >> >> >> acquiring and releasing the lock. These functions are used in 3
> >> >> >> places (a sketch of the intended usage follows the list):
> >> >> >>
> >> >> >> 1) find_primary_node
> >> >> >>
> >> >> >> This function is called upon startup and failover in the main
> >> >> >> pgpool process to find the new primary node.
> >> >> >>
> >> >> >> 2) failover
> >> >> >>
> >> >> >> This function is called in the follow_primary_command subprocess
> >> >> >> forked off by the pgpool main process to execute the
> >> >> >> follow_primary_command script. The lock should be held until all
> >> >> >> follow_primary_command executions are completed.
> >> >> >>
> >> >> >> 3) streaming replication check
> >> >> >>
> >> >> >> Before starting verify_backend_node_status, which is the workhorse
> >> >> >> of detach_false_primary, the lock must be acquired. If the
> >> >> >> acquisition fails, just skip that streaming replication check cycle.
> >> >> >>
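> >> >> >> Roughly, the intended usage looks like this (a simplified sketch
> >> >> >> based on the description above, not the actual diff):
> >> >> >>
> >> >> >>     /* in failover(): block until the lock is ours and hold it
> >> >> >>      * until all follow_primary_command executions are done */
> >> >> >>     pool_acquire_follow_primary_lock(true);
> >> >> >>     /* ... run follow_primary_command for each standby ... */
> >> >> >>     pool_release_follow_primary_lock();
> >> >> >>
> >> >> >>     /* in the sr_check worker: try the lock without blocking */
> >> >> >>     if (!pool_acquire_follow_primary_lock(false))
> >> >> >>         return;   /* skip this replication check cycle */
> >> >> >>     verify_backend_node_status(/* arguments omitted */);
> >> >> >>     pool_release_follow_primary_lock();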
> >> >> >>
> >> >> >> The user who made the initial report and I confirmed that the
> >> >> >> patch works well.
> >> >> >>
> >> >> >> Unfortunately that is not the whole story, but this mail is
> >> >> >> already too long. I will continue in the next mail.
> >> >> >>
> >> >> >> Best regards,
> >> >> >> --
> >> >> >> Tatsuo Ishii
> >> >> >> SRA OSS, Inc. Japan
> >> >> >> English: http://www.sraoss.co.jp/index_en.php
> >> >> >> Japanese:http://www.sraoss.co.jp
> >> >> _______________________________________________
> >> >> pgpool-hackers mailing list
> >> >> pgpool-hackers at pgpool.net
> >> >> http://www.pgpool.net/mailman/listinfo/pgpool-hackers
> >> >>
> >>
>

