[pgpool-hackers: 3896] Re: Problem with detach_false_primary/follow_primary_command
Tatsuo Ishii
ishii at sraoss.co.jp
Fri May 7 13:47:14 JST 2021
I am going to commit/push the patches to the branches from master down
to 4.0 stable (detach_false_primary was introduced in 4.0) if there is
no objection.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
From: Tatsuo Ishii <ishii at sraoss.co.jp>
Subject: [pgpool-hackers: 3893] Re: Problem with detach_false_primary/follow_primary_command
Date: Tue, 04 May 2021 13:09:23 +0900 (JST)
Message-ID: <20210504.130923.644768896074013686.t-ishii at gmail.com>
> In the previous mail I explained the problem and proposed a patch
> for the issue.
>
> However, the original reporter also said the problem occurs in a
> more complex way when watchdog is enabled.
>
> https://www.pgpool.net/pipermail/pgpool-general/2021-April/007590.html
>
> In summary, it seems that multiple pgpool nodes perform
> detach_false_primary concurrently, and this is the cause of the
> problem. I think there is no reason to perform detach_false_primary
> on multiple pgpool nodes concurrently. Rather, we should perform
> detach_false_primary only on the leader node. If this is correct, we
> also should not perform detach_false_primary while the quorum is
> absent, because there is no leader without a quorum. Attached is a
> patch that introduces this check on top of the v2 patch.
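>
> To make the intended guard concrete, here is a minimal sketch. The
> helper function and its arguments are hypothetical (the actual patch
> reads pgpool's watchdog state internally); it only illustrates the
> rule stated above:
>
> #include <stdbool.h>
>
> /* Decide whether this pgpool node may run detach_false_primary.
>  * Without watchdog there is nothing to coordinate, so the check may
>  * always run; with watchdog, only the leader runs it, and only
>  * while the quorum exists. */
> bool
> should_run_detach_false_primary(bool watchdog_enabled,
>                                 bool quorum_exists,
>                                 bool i_am_leader)
> {
>     if (!watchdog_enabled)
>         return true;
>     if (!quorum_exists)
>         return false;       /* no quorum means no leader to decide */
>     return i_am_leader;
> }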
>
> I would like to hear opinions from other pgpool developers on whether
> we should apply the v3 patch to the existing branches. I am asking
> because currently we perform detach_false_primary even if the quorum
> is absent, and the change may be a "change of user-visible behavior",
> which we usually avoid on stable branches. However, since the current
> detach_false_primary apparently does not work in an environment where
> watchdog is enabled, I think patching the back branches is an
> exceptionally reasonable choice.
>
> Also I have added the regression test patch.
>
>> In the posting:
>>
>> [pgpool-general: 7525] Strange behavior on switchover with detach_false_primary enabled
>>
>> it is reported that detach_false_primary and follow_primary_command
>> could conflict with each other, leaving pgpool in an unwanted state.
>> We can reproduce the issue by using pgpool_setup to create a 3-node
>> configuration.
>>
>> $ pgpool_setup -n 3
>>
>> echo "detach_false_primary" >> etc/pgpool.conf
>> echo "sr_check_period = 1" >> etc/pgpool.conf
>>
>> The latter may not be mandatory, but making the streaming
>> replication check run frequently reproduces the problem reliably,
>> because detach_false_primary is executed in the streaming
>> replication check process.
>>
>> The initial state is as follows:
>>
>> psql -p 11000 -c "show pool_nodes" test
>> node_id | hostname | port | status | pg_status | lb_weight | role | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
>> ---------+----------+-------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
>> 0 | /tmp | 11002 | up | up | 0.333333 | primary | primary | 0 | true | 0 | | | 2021-05-04 11:12:01
>> 1 | /tmp | 11003 | up | up | 0.333333 | standby | standby | 0 | false | 0 | streaming | async | 2021-05-04 11:12:01
>> 2 | /tmp | 11004 | up | up | 0.333333 | standby | standby | 0 | false | 0 | streaming | async | 2021-05-04 11:12:01
>> (3 rows)
>>
>> Execute pcp_detach_node against node 0.
>>
>> $ pcp_detach_node -w -p 11001 0
>>
>> This puts the primary into down status, which in turn promotes node 1.
>>
>> 2021-05-04 12:12:14: pcp_child pid 31449: LOG: received degenerate backend request for node_id: 0 from pid [31449]
>> 2021-05-04 12:12:14: main pid 31221: LOG: Pgpool-II parent process has received failover request
>> 2021-05-04 12:12:14: main pid 31221: LOG: starting degeneration. shutdown host /tmp(11002)
>> 2021-05-04 12:12:14: pcp_main pid 31260: LOG: PCP process with pid: 31449 exit with SUCCESS.
>> 2021-05-04 12:12:14: pcp_main pid 31260: LOG: PCP process with pid: 31449 exits with status 0
>> 2021-05-04 12:12:14: main pid 31221: LOG: Restart all children
>> 2021-05-04 12:12:14: main pid 31221: LOG: execute command: /home/t-ishii/work/Pgpool-II/current/x/etc/failover.sh 0 /tmp 11002 /home/t-ishii/work/Pgpool-II/current/x/data0 1 0 /tmp 0 11003 /home/t-ishii/work/Pgpool-II/current/x/data1
>>
>> However, detach_false_primary found that the just-promoted node 1 is
>> not healthy: it did not yet have any follower standby nodes, because
>> follow_primary_command had not completed.
>>
>> 2021-05-04 12:12:14: sr_check_worker pid 31261: LOG: verify_backend_node_status: primary 1 does not connect to standby 2
>> 2021-05-04 12:12:14: sr_check_worker pid 31261: LOG: verify_backend_node_status: primary 1 owns only 0 standbys out of 1
>> 2021-05-04 12:12:14: sr_check_worker pid 31261: LOG: pgpool_worker_child: invalid node found 1
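>>
>> The rule visible in these log lines is, roughly, that a primary
>> which no live standby is streaming from is considered a false
>> primary. A simplified sketch of that counting rule follows (an
>> assumption for illustration; the real verify_backend_node_status
>> does considerably more work):
>>
>> #include <stdbool.h>
>>
>> /* "primary 1 owns only 0 standbys out of 1" above corresponds to
>>  * n_connected_standbys == 0 and n_live_standbys == 1. */
>> bool
>> is_false_primary(int n_connected_standbys, int n_live_standbys)
>> {
>>     return n_live_standbys > 0 && n_connected_standbys == 0;
>> }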
>>
>> And detach_false_primary sent a failover request for node 1.
>>
>> 2021-05-04 12:12:14: sr_check_worker pid 31261: LOG: received degenerate backend request for node_id: 1 from pid [31261]
>>
>> Moreover, detach_false_primary tries to detach node 1 every second.
>>
>> 2021-05-04 12:12:15: sr_check_worker pid 31261: LOG: verify_backend_node_status: primary 1 does not connect to standby 2
>> 2021-05-04 12:12:15: sr_check_worker pid 31261: LOG: verify_backend_node_status: primary 1 owns only 0 standbys out of 1
>> 2021-05-04 12:12:15: sr_check_worker pid 31261: LOG: pgpool_worker_child: invalid node found 1
>> 2021-05-04 12:12:15: sr_check_worker pid 31261: LOG: received degenerate backend request for node_id: 1 from pid [31261]
>>
>> This confuses the whole follow_primary_command processing, and in
>> the end we have:
>>
>> psql -p 11000 -c "show pool_nodes" test
>> node_id | hostname | port | status | pg_status | lb_weight | role | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
>> ---------+----------+-------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
>> 0 | /tmp | 11002 | down | down | 0.333333 | standby | unknown | 0 | false | 0 | | | 2021-05-04 12:12:16
>> 1 | /tmp | 11003 | up | up | 0.333333 | standby | standby | 0 | false | 0 | | | 2021-05-04 12:22:28
>> 2 | /tmp | 11004 | up | up | 0.333333 | standby | standby | 0 | true | 0 | | | 2021-05-04 12:22:28
>> (3 rows)
>>
>> Of course, this is a totally unwanted result.
>>
>> I think the root cause of the problem is that detach_false_primary
>> and follow_primary_command are allowed to run concurrently. To solve
>> the problem we need a lock, so that if detach_false_primary is
>> already running, follow_primary_command waits for its completion,
>> and vice versa.
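>>
>> The following self-contained sketch illustrates the semantics such a
>> lock pair needs. The POSIX named semaphore is only a stand-in chosen
>> so the sketch compiles on its own; the actual patch (described next)
>> implements the lock inside pgpool:
>>
>> #include <fcntl.h>
>> #include <semaphore.h>
>> #include <stdbool.h>
>>
>> static sem_t *follow_primary_sem;
>>
>> /* Called once at startup; the semaphore name is arbitrary here. */
>> bool
>> follow_primary_lock_init(void)
>> {
>>     follow_primary_sem = sem_open("/follow_primary_demo", O_CREAT, 0600, 1);
>>     return follow_primary_sem != SEM_FAILED;
>> }
>>
>> /* block == true: wait until the lock becomes free.
>>  * block == false: return false immediately if the lock is busy. */
>> bool
>> pool_acquire_follow_primary_lock(bool block)
>> {
>>     if (block)
>>         return sem_wait(follow_primary_sem) == 0;
>>     return sem_trywait(follow_primary_sem) == 0;
>> }
>>
>> void
>> pool_release_follow_primary_lock(void)
>> {
>>     sem_post(follow_primary_sem);
>> }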
>>
>> For this purpose I propose the attached patch
>> detach_false_primary_v2.diff. The patch introduces exactly such a
>> pair of functions, pool_acquire_follow_primary_lock(bool block) and
>> pool_release_follow_primary_lock(void), which are responsible for
>> acquiring and releasing the lock. They are used in 3 places:
>>
>> 1) find_primary_node
>>
>> This function is called upon startup and failover in the pgpool main
>> process to find the new primary node.
>>
>> 2) failover
>>
>> This function is called in the follow_primary_command subprocess
>> forked off by the pgpool main process to execute the
>> follow_primary_command script. The lock should be held until all
>> follow_primary_command executions are completed.
>>
>> 3) streaming replication check
>>
>> Before starting verify_backend_node, which is the workhorse of
>> detach_false_primary, the lock must be acquired. If that fails, the
>> streaming replication check cycle is simply skipped, as sketched
>> after this list.
>>
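>> Putting cases 2) and 3) together, the intended control flow looks
>> roughly like this (inferred from the description above, not copied
>> from the patch; run_follow_primary_command() and
>> verify_backend_node() are hypothetical stand-ins):
>>
>> #include <stdbool.h>
>>
>> extern bool pool_acquire_follow_primary_lock(bool block);
>> extern void pool_release_follow_primary_lock(void);
>> extern void run_follow_primary_command(int node_id);   /* hypothetical */
>> extern void verify_backend_node(void);                 /* hypothetical */
>>
>> /* Case 2: failover path. Block until the lock is free, run
>>  * follow_primary_command for every standby, then release. */
>> void
>> failover_and_follow_primary(int *standby_nodes, int n_standbys)
>> {
>>     int i;
>>
>>     pool_acquire_follow_primary_lock(true);
>>     for (i = 0; i < n_standbys; i++)
>>         run_follow_primary_command(standby_nodes[i]);
>>     pool_release_follow_primary_lock();
>> }
>>
>> /* Case 3: streaming replication check. Non-blocking attempt; if
>>  * the lock is busy, skip detach_false_primary for this cycle. */
>> void
>> sr_check_cycle(void)
>> {
>>     if (!pool_acquire_follow_primary_lock(false))
>>         return;             /* failover in progress: skip this cycle */
>>     verify_backend_node();
>>     pool_release_follow_primary_lock();
>> }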
>>
>> Both I and the user who made the initial report confirmed that the
>> patch works well.
>>
>> Unfortunately, that is not the whole story; however, this mail is
>> already too long, so I will continue in the next mail.
>>
>> Best regards,
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese:http://www.sraoss.co.jp