[pgpool-general: 7506] Possible pgpool 4.1.4 failover auto_failback race condition
Nathan Ward
lists+pgpool at daork.net
Wed Apr 14 14:54:13 JST 2021
Hi,
I believe I’ve found a race condition with auto_failback.
In my test environment I have 3 servers each running both pgpool and postgres.
I simulate a network failure with iptables rules on one node.
I start the test with the following state:
pgpool primary: node 2
postgres primary: node 0
When I fail node 0 to trigger failover with follow_master (this is still 4.1.x), I find that most of the time node 2 is reattached before follow_master gets a chance to run for that node. Node 2 is of course set to CON_DOWN when the failover is triggered, and I would expect it to stay in that state until follow_master reattaches it.
I believe, though I’m not 100% certain, that the early reattach sometimes comes from node 1.
Is this likely a configuration problem, or is it a bug of some kind? I had a quick look at the code and don’t see any changes since 4.1.4 that would affect this, but I am of course happy to be wrong about that!
We have the following set:
auto_failback = on
auto_failback_interval = 60
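
To make the suspected ordering concrete, here is a toy model in Python. It is only a sketch of the sequence described above, not pgpool code: the event names, the simulate() helper and the choice of node 1 as the new primary are all assumptions made for illustration.

```python
# Toy model of the suspected race. Illustrative only: these names are
# hypothetical and do not correspond to pgpool internals.
CON_UP, CON_DOWN = "CON_UP", "CON_DOWN"


def simulate(events, new_primary=1):
    """Apply (event, node) pairs in order.

    Returns the nodes that were already CON_UP when follow_master ran
    for them, i.e. the nodes that something else reattached first.
    new_primary=1 is an assumption; the report does not say which node
    gets promoted.
    """
    status = {0: CON_UP, 1: CON_UP, 2: CON_UP}
    raced = []
    for event, node in events:
        if event == "failover":
            # 'node' is the failed primary. Every node except the new
            # primary is marked down until follow_master handles it.
            for n in status:
                status[n] = CON_UP if n == new_primary else CON_DOWN
        elif event == "auto_failback":
            # Assumed behaviour: a down node that looks healthy is reattached.
            if status[node] == CON_DOWN:
                status[node] = CON_UP
        elif event == "follow_master":
            if status[node] == CON_UP:
                raced.append(node)  # reattached before follow_master ran
            status[node] = CON_UP
    return raced


# Ordering I expect: node 2 stays down until follow_master runs for it.
expected = [("failover", 0), ("follow_master", 2)]
# Ordering I observe most of the time: auto_failback gets there first.
observed = [("failover", 0), ("auto_failback", 2), ("follow_master", 2)]

print(simulate(expected))
print(simulate(observed))
```

The first call reports no raced nodes; the second reports node 2, which is the situation I am seeing.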
--
Nathan Ward