[pgpool-hackers: 3893] Re: Problem with detach_false_primary/follow_primary_command
Tatsuo Ishii
ishii at sraoss.co.jp
Tue May 4 13:09:23 JST 2021
In the previous mail I have explained the problem and proposed a patch
for the issue.
However the original reporter also said the problem will occur in more
complex way if watchdog is enabled.
https://www.pgpool.net/pipermail/pgpool-general/2021-April/007590.html
In summary it seems multiple pgpool nodes perform detach_false_primary
concurrently and this is the cause of the problem. I think there's no
reason to perform detach_false_primary in multiple pgpool nodes
concurrently. Rather we should perform detach_false_primary only on
the leader node. If this is correct, we also should not perform
detach_false_primary if the quorum is absent because there's no leader
if the quorum is absent. Attached is the patch to introduce the check
in addition to the v2 patch.
I would like to hear opinion from other pgpool developers on that
whether we should apply the v3 patch to existing branches. I am asking
because currently we perform detach_false_primary even if the quorum
is absent and the change may be "change of user visible behavior"
which we usually avoid on stable branches. However the current
detach_false_primary apparently does not work on the environment where
watchdog is enabled, I think patching to back branches are exceptionally
reasonable choice.
Also I have added the regression test patch.
> In the posting:
>
> [pgpool-general: 7525] Strange behavior on switchover with detach_false_primary enabled
>
> it is reported that detach_false_primary and follow_primary_command
> could conflict each other and pgpool goes into unwanted state. We can
> reproduce the issue by using pgpool_setup to create 3 node
> configuration.
>
> $ pgpool_setup -n 3
>
> echo "detach_false_primary" >> etc/pgpool.conf
> echo "sr_check_period = 1" >> etc/pgpool.conf
>
> The latter may not be mandatory but making the streaming replication
> check frequently will reliably reproduce the problem because
> detach_false_primary is executed in the streaming replication check
> process.
>
> The initial state is as follows:
>
> psql -p 11000 -c "show pool_nodes" test
> node_id | hostname | port | status | pg_status | lb_weight | role | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
> ---------+----------+-------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
> 0 | /tmp | 11002 | up | up | 0.333333 | primary | primary | 0 | true | 0 | | | 2021-05-04 11:12:01
> 1 | /tmp | 11003 | up | up | 0.333333 | standby | standby | 0 | false | 0 | streaming | async | 2021-05-04 11:12:01
> 2 | /tmp | 11004 | up | up | 0.333333 | standby | standby | 0 | false | 0 | streaming | async | 2021-05-04 11:12:01
> (3 rows)
>
> Execute pcp_detatch_node against node 0.
>
> $ pcp_detach_node -w -p 11001 0
>
> This will let the primary be in down status and this will promote node 1.
>
> 2021-05-04 12:12:14: pcp_child pid 31449: LOG: received degenerate backend request for node_id: 0 from pid [31449]
> 2021-05-04 12:12:14: main pid 31221: LOG: Pgpool-II parent process has received failover request
> 2021-05-04 12:12:14: main pid 31221: LOG: starting degeneration. shutdown host /tmp(11002)
> 2021-05-04 12:12:14: pcp_main pid 31260: LOG: PCP process with pid: 31449 exit with SUCCESS.
> 2021-05-04 12:12:14: pcp_main pid 31260: LOG: PCP process with pid: 31449 exits with status 0
> 2021-05-04 12:12:14: main pid 31221: LOG: Restart all children
> 2021-05-04 12:12:14: main pid 31221: LOG: execute command: /home/t-ishii/work/Pgpool-II/current/x/etc/failover.sh 0 /tmp 11002 /home/t-ishii/work/Pgpool-II/current/x/data0 1 0 /tmp 0 11003 /home/t-ishii/work/Pgpool-II/current/x/data1
>
> However detach_false_primary found that the just promoted node 1 is
> not good because it does not have any follower standby node because
> follow_primary_command did not completed yet.
>
> 2021-05-04 12:12:14: sr_check_worker pid 31261: LOG: verify_backend_node_status: primary 1 does not connect to standby 2
> 2021-05-04 12:12:14: sr_check_worker pid 31261: LOG: verify_backend_node_status: primary 1 owns only 0 standbys out of 1
> 2021-05-04 12:12:14: sr_check_worker pid 31261: LOG: pgpool_worker_child: invalid node found 1
>
> And detach_false_primary sent failover request for node 1.
>
> 2021-05-04 12:12:14: sr_check_worker pid 31261: LOG: received degenerate backend request for node_id: 1 from pid [31261]
>
> Moreover every 1 second detach_false_primary tries to detach node 1.
>
> 2021-05-04 12:12:15: sr_check_worker pid 31261: LOG: verify_backend_node_status: primary 1 does not connect to standby 2
> 2021-05-04 12:12:15: sr_check_worker pid 31261: LOG: verify_backend_node_status: primary 1 owns only 0 standbys out of 1
> 2021-05-04 12:12:15: sr_check_worker pid 31261: LOG: pgpool_worker_child: invalid node found 1
> 2021-05-04 12:12:15: sr_check_worker pid 31261: LOG: received degenerate backend request for node_id: 1 from pid [31261]
>
> The confuses the whole follow_primary_command and in the end we have:
>
> psql -p 11000 -c "show pool_nodes" test
> node_id | hostname | port | status | pg_status | lb_weight | role | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
> ---------+----------+-------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
> 0 | /tmp | 11002 | down | down | 0.333333 | standby | unknown | 0 | false | 0 | | | 2021-05-04 12:12:16
> 1 | /tmp | 11003 | up | up | 0.333333 | standby | standby | 0 | false | 0 | | | 2021-05-04 12:22:28
> 2 | /tmp | 11004 | up | up | 0.333333 | standby | standby | 0 | true | 0 | | | 2021-05-04 12:22:28
> (3 rows)
>
> Of course this is totally unwanted result.
>
> I think the root cause of the problem is, detach_false_primary and
> follow_primary_command are allowed to run concurrently. To solve the
> problem we need to have a lock so that if detach_false_primary already
> runs, follow_primary_command should wait for it's completion or vice
> versa.
>
> For this purpose I propose attached patch
> detach_false_primary_v2.diff. In the patch new function
> pool_acquire_follow_primary_lock(bool block) and
> pool_release_follow_primary_lock(void) are introduced. They are
> responsible for acquiring or releasing the lock. There are 3 places
> where those functions are used:
>
> 1) find_primary_node
>
> This function is called upon startup and failover in the main pgpool
> process to find new primary node.
>
> 2) failover
>
> This function is called in the follow_primary_command subprocess
> forked off by pgpool main process to execute follow_primary_command
> script. The lock should be help until all follow_primary_command are
> completed.
>
> 3) streaming replication check
>
> Before starting verify_backend_node, which is the work horse of
> detach_false_primary, the lock must be acquired. If it fails, just
> skip the streaming replication check cycle.
>
>
> I and the user who made the initial report confirmed that tha patch
> works well.
>
> Unfortunately the story is not the all. However the mail is already
> too long. I will continue to the next mail.
>
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp
-------------- next part --------------
A non-text attachment was scrubbed...
Name: detach_false_primary_v3.diff
Type: text/x-patch
Size: 10345 bytes
Desc: not available
URL: <http://www.pgpool.net/pipermail/pgpool-hackers/attachments/20210504/b6b3c514/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: detach_false_primary_test_V4_2_v2.diff
Type: text/x-patch
Size: 1518 bytes
Desc: not available
URL: <http://www.pgpool.net/pipermail/pgpool-hackers/attachments/20210504/b6b3c514/attachment-0003.bin>
More information about the pgpool-hackers
mailing list