[pgpool-general: 9372] Why do we drop the replication slot?
Adam Blomeke
adam.blomeke at volanno.com
Thu Feb 20 06:05:46 JST 2025
So I have failover and failback working properly on a two-node PostgreSQL cluster. I can bring down my primary database, pgpool fails over to the secondary, and I can run pcp_recovery_node on the old primary and fail back. Fantastic. My users don't skip a beat. When I bring down my secondary node, though, the failover script drops the replication slot. As a result, when I run pcp_attach_node, the logs show that the nodes are not in sync:
2025-02-19 16:22:07.231: sr_check_worker pid 1076194: LOG: get_query_result failed: status: -2
2025-02-19 16:22:07.231: sr_check_worker pid 1076194: CONTEXT: while checking replication time lag
2025-02-19 16:22:17.262: sr_check_worker pid 1076194: LOG: get_query_result failed: status: -2
2025-02-19 16:22:17.262: sr_check_worker pid 1076194: CONTEXT: while checking replication time lag
If I query pg_replication_slots on the primary database, I get nothing there:
# select * from pg_replication_slots ;
slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status | safe_wal_size | two_phase
-----------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------+-----------
(0 rows)
Now of course the obvious fix is to re-add the replication slot, but once the slot has been dropped we're falling back on the wal_keep_size parameter and hoping that the primary hasn't already recycled WAL files the secondary still needs. If it has, we'll need to run pcp_recovery_node. Shouldn't the script keep the replication slot around if the secondary fails and only drop it if the primary is the node that went down? It seems like I'm not getting the full benefit of replication slots, because the whole point of a slot is to retain WAL files until the standby confirms it has received them.
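To make the proposal concrete, here is a minimal sketch of the decision I have in mind. It is not the stock failover script: the function name slot_action is hypothetical, and I'm assuming the two arguments are the failed node id and the old primary node id that pgpool passes to failover_command as %d and %P.

```shell
# Hypothetical helper, not part of the pgpool sample scripts.
# $1 = failed node id        (assumed to come from %d in failover_command)
# $2 = old primary node id   (assumed to come from %P in failover_command)
#
# Prints "drop" when the failed node is the old primary, and "keep" when
# only a standby went down, so the primary keeps retaining WAL for it.
slot_action() {
    if [ "$1" = "$2" ]; then
        echo "drop"
    else
        echo "keep"
    fi
}

# Example: node 1 (standby) failed while node 0 was the primary.
slot_action 1 0
```

A failover script could branch on this result and skip the pg_drop_replication_slot() call in the "keep" case. One caveat I'm aware of: a slot that is kept while the standby is down retains WAL on the primary for as long as the standby stays away, so disk usage needs watching (max_slot_wal_keep_size can cap it on PostgreSQL 13 and later).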
Thanks for the help!
Adam Blomeke, PSD | Developer/Analyst |
Adam.Blomeke at volanno.com | 202.455.4781 ext. 109
www.volanno.com | Certified WOSB, ISO 9001:2015