[pgpool-hackers: 3554] Re: Build farm failure starting on 2020/3/15 in master branch

Tatsuo Ishii ishii at sraoss.co.jp
Mon Mar 16 11:31:27 JST 2020


> We have experienced massive build farm failures starting on 2020/3/15
> in the master branch.
> 
> From: buildfarm at pgpool.net
> Subject: [pgpool-buildfarm: 554] pgpool-II buildfarm results CentOS8
> Date: Sun, 15 Mar 2020 11:50:21 +0900
> Message-ID: <5e6d97ed.+sRtHRaZxHljR39S%buildfarm at pgpool.net>
> 
>> =========================================================================
>> * master  PostgreSQL 11  CentOS8
>> testing 001.load_balance...failed.
>> testing 003.failover...failed.
> [snip]
> 
> I have looked into this and confirmed that the cause is this commit:
> 
> ----------------------------------------------------------------
> Subject: [pgpool-committers: 6625] pgpool: Add support for SSL CRL (Certificate Revocation List).
> From: Tatsuo Ishii <ishii at sraoss.co.jp>
> To: pgpool-committers at pgpool.net
> Date: Sat, 14 Mar 2020 03:21:16 +0000
> Sender: pgpool-committers-bounces at pgpool.net
> X-Mew: tab/spc characters on Subject: are simplified.
> 
> Add support for SSL CRL (Certificate Revocation List).
> ----------------------------------------------------------------
> 
> However, I have also found that the root cause of the failure is not
> this commit itself. In fact, any attempt to add a new configuration
> parameter could cause the failure. The commit just hit a hidden bug.
> 
> I am going to explain why the build farm failure happened.
> 
> 1) The config process (src/config/pool_config_variables.c) sorts the
> config parameters by the string length of their names (see
> sort_config_vars()).
> 
> 2) A new parameter was added by the commit.
> 
> 3) Since the config parameter "backend_flag*" has the same string
> length (12) as the new parameter "ssl_crl_file", the order in which
> ALLOW_TO_FAILOVER and ALWAYS_MASTER are processed has changed since
> the commit. Before: ALWAYS_MASTER, ALLOW_TO_FAILOVER. Now:
> ALLOW_TO_FAILOVER, ALWAYS_MASTER.
> 
> 4) The built-in default values for backend_flag are ALLOW_TO_FAILOVER
> and ALWAYS_MASTER.
> 
> 5) Before the commit, ALWAYS_MASTER was processed first and then
> replaced by ALLOW_TO_FAILOVER. So the resulting flag was
> ALLOW_TO_FAILOVER.
> 
> 6) After the commit, ALLOW_TO_FAILOVER is processed first and then
> replaced by ALWAYS_MASTER. So the resulting flag is now ALWAYS_MASTER.
> 
> 7) Since both backend 0 and backend 1 now have the ALWAYS_MASTER flag,
> pgpool is confused and mistakenly treats backend 1 as the primary. So
> DDL/DML are sent to backend 1 and fail. This makes almost all
> regression tests fail (the only surviving regression tests use a
> single backend and are not affected by the problem).
> 
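
To illustrate steps 1) to 3): a comparator that looks only at the name
length treats every pair of equal-length names as a tie, and the C
standard does not specify how qsort() orders tied elements. The sketch
below is not pgpool code. demo_var, cmp_by_name_length() and
sort_and_count_ties() are hypothetical names, and whether
sort_config_vars() uses qsort() internally is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one config table entry (not pgpool's struct). */
typedef struct
{
    const char *name;
    const char *default_value;
} demo_var;

/* Compare two entries by the length of their names only. */
static int cmp_by_name_length(const void *a, const void *b)
{
    size_t la = strlen(((const demo_var *) a)->name);
    size_t lb = strlen(((const demo_var *) b)->name);

    return (la > lb) - (la < lb);
}

/*
 * Sort the table by name length and return the number of adjacent
 * entries that compare as equal. The relative order of such tied
 * entries is unspecified for qsort(), so it may change whenever an
 * entry of the same length is added.
 */
static int sort_and_count_ties(demo_var *vars, size_t n)
{
    int ties = 0;
    size_t i;

    qsort(vars, n, sizeof(vars[0]), cmp_by_name_length);
    for (i = 1; i < n; i++)
    {
        if (cmp_by_name_length(&vars[i - 1], &vars[i]) == 0)
            ties++;
    }
    return ties;
}
```

With two entries named "backend_flag" and one named "ssl_crl_file" in
the table, all three tie at length 12, so nothing in the comparator
pins down which backend_flag default is processed last.
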
> So what should we do?
> 
> I think the current implementation of pgpool's configuration
> processing of backend flags (src/config/pool_config_variables.c) has
> multiple issues.
> 
> 1) The default value for the backend flag is ALWAYS_MASTER. This
> should be "" (empty string).
> 
> 2) The final value of the backend flag is whichever default was
> processed last. This is plain wrong (see BackendFlagsAssignFunc()).
> The resulting value should be the OR of all default values, since
> backend_flag is bit data.
> 
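
The difference described in point 2) can be sketched as follows. This
is not the code in BackendFlagsAssignFunc(). The DEMO_* bit values and
both helper functions are hypothetical and only contrast "last value
wins" with OR'ing.

```c
#include <stddef.h>

/* Hypothetical bit values; pgpool's real constants may differ. */
#define DEMO_ALLOW_TO_FAILOVER 0x0001u
#define DEMO_ALWAYS_MASTER     0x0004u

/* Each default overwrites the previous one, so the result depends on order. */
static unsigned int assign_last_wins(const unsigned int *defaults, size_t n)
{
    unsigned int flag = 0;
    size_t i;

    for (i = 0; i < n; i++)
        flag = defaults[i];
    return flag;
}

/* Each default is OR'ed into the result, so the order does not matter. */
static unsigned int assign_or(const unsigned int *defaults, size_t n)
{
    unsigned int flag = 0;
    size_t i;

    for (i = 0; i < n; i++)
        flag |= defaults[i];
    return flag;
}
```

Given the two defaults in either order, assign_or() returns the same
bits, while assign_last_wins() returns only the one processed last.
OR'ing removes the dependence on order, but it still sets ALWAYS_MASTER
whenever that flag is among the defaults, which is why point 1) is
listed as a separate issue.
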
> I am going to fix the issue as soon as possible.

Done.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

