[pgpool-hackers: 3675] Re: Inconsistency in native replication mode
Tatsuo Ishii
ishii at sraoss.co.jp
Sat Jun 27 17:18:06 JST 2020
>>> (4) Deal with extended query protocol
Done. (along with 030 test addition for extended query protocol case)
>>> (2) regression test
>
> I have added 030 test for this.
>
>> I have committed/pushed this.
>>
>> https://git.postgresql.org/gitweb/?p=pgpool2.git;a=commit;h=22930861c4006d230361be3779a84da3d0d24b66
>>
>>> Here's a work-in-progress patch to introduce the Pangea algorythm [1].
>>> I deicded to add new clustering mode called "snapshot isolation mode"
>>> for this. What I have done are:
>>>
>>> - Implement Pangea algorythm for r/w transactions.
>>> - Create docs for the new clustering mode.
>>>
>>> Please note that to use the new clustering mode, PostgreSQL's default
>>> transaction isolation mode must be "REPEATABLE READ" (not READ
>>> COMMITTED nor SERIALIZABLE). At this point I tend to think that the
>>> new feature should be marked as "experimental" since I am not
>>> convinced that the new mode does not have any defect at the when 4.2
>>> is released.
>>>
>>> BTW, I have added new section "Bibliography" to give a credit to
>>> Pangea. However it seems the rendering of it is different from what
>>> PostgreSQL has. Can someone please help?
>>>
>>> "Takeshi Mishima and Hiroshi Nakamura, "Pangea: An Eager Database
>>> Replication Middleware guaranteeing Snapshot Isolation without
>>> modification of Database Servers", Proc. VLDB Conference, Aug. 2009."
>>>
>>> I was expecting something like this:
>>>
>>> "[mishima2009] Pangea: An Eager Database Replication Middleware
>>> guaranteeing Snapshot Isolation without modification of Database
>>> Servers", "Takeshi Mishima and Hiroshi Nakamura, Proc. VLDB
>>> Conference, Aug. 2009."
>>>
>>> What I have not done are:
>>>
>>> (1) Check if the transaction is read only and if so, then load balance
>>> it.
>>> (2) regression test
>>>
>>> (3) How to deal with SERIAL/sequence.
>>>
>>> (4) Deal with extended query protocol
>>>
>>> For (1), maybe we could have following rules:
>>>
>>> - If it's a read only SELECT and it's not inside an explicit
>>> transaction, we can load balance it.
>>>
>>> - If it's a read only SELECT and it's inside an explicit transaction,
>>> we can not load balance it.
>>>
>>> The latter is different from native replication mode.
>>>
>>> For (2) maybe we need new testing framework to give a control the
>>> timing to give queries to backend. If you have an idea, please let me
>>> know.
>>>
>>> For (3) I have no idea. Please help.
>>>
>>> Best regards,
>>> --
>>> Tatsuo Ishii
>>> SRA OSS, Inc. Japan
>>> English: http://www.sraoss.co.jp/index_en.php
>>> Japanese:http://www.sraoss.co.jp
>>>
>>>> Hi,
>>>>
>>>> I have studied $subject issue. Suppose we have two sessions S1, S2 and
>>>> two servers N1 (master), N2 (slave). S1 updates table t1 in an
>>>> explicit transaction. S2 read t1. Below is a demonstration how we find
>>>> an inconsistency with native replication mode. "S1:N1 SQL statement"
>>>> denotes an SQL statement is executed on node N1 in session S1. Table
>>>> t1 has an integer column i. There is only on row in t1 and the initial
>>>> value is 0. Time is increasing from top to bottom.
>>>>
>>>> S1/N1: BEGIN;
>>>> S1/N2: BEGIN;
>>>> S1/N1: UPDATE t1 SET i = i + 1; -- i = 1
>>>> S1/N2: UPDATE t1 SET i = i + 1; -- i = 1
>>>> S1/N1: COMMIT;
>>>> S2/N1: SELECT i FROM t1; -- i = 1
>>>> S2/N2: SELECT i FROM t1; -- i = 0
>>>> S1/N2: COMMIT;
>>>>
>>>> So inconsistent values of i of t1 are observed in session S2.
>>>>
>>>> To avoid the inconsistency, we can control the order of concurrent
>>>> COMMIT and the first SQL in a transaction, which acquires the
>>>> snapshot, so that they are never executed concurrently. (This is a
>>>> proposed algorithm in [1]). Note that this requires transactions run
>>>> in REPEATABLE READ (or SERIALIZABLE), not default READ COMMITTED.
>>>>
>>>> Suppose we have a mutex lock M. To execute that first SQL in a
>>>> transaction, Pgpool-II needs to acquire M before it. Also it needs to
>>>> acquire M before issuing COMMIT. After hiring this we have:
>>>>
>>>> S1/N1: BEGIN;
>>>> S1/N2: BEGIN;
>>>> S1/N1: UPDATE t1 SET i = i + 1; -- i = 1
>>>> S1/N2: UPDATE t1 SET i = i + 1; -- i = 1
>>>> S1: try to acquire M and succeeded
>>>> S1/N1: COMMIT;
>>>> S2: try to acquire M and failed
>>>> S1/N2: COMMIT;
>>>> S1: release M
>>>> S2: wake up
>>>> S2/N1: SELECT i FROM t1; -- i = 1
>>>> S2/N2: SELECT i FROM t1; -- i = 1
>>>>
>>>> or:
>>>>
>>>> S1/N1: BEGIN;
>>>> S1/N2: BEGIN;
>>>> S1/N1: UPDATE t1 SET i = i + 1; -- i = 1 in session S1
>>>> S1/N2: UPDATE t1 SET i = i + 1;
>>>> S1: try to acquire M and failed
>>>> S2: try to acquire M and succeeded
>>>> S2/N1: SELECT i FROM t1; -- i = 0 in session S2
>>>> S2/N2: SELECT i FROM t1; -- i = 0 in session S2
>>>> S2: release M
>>>> S1: wake up
>>>> S1/N1: COMMIT;
>>>> S2/N1: COMMIT;
>>>>
>>>> As you see, we can now avoid the inconsistency.
>>>>
>>>> Also [1] claims that only read only transaction can distribute SELECT
>>>> (load balance). Currently Pgpool-II load balance read only SELECT
>>>> whenever possible. i.e. if SELECT is followed by write query, start
>>>> transaction (if no transaction is started).
>>>>
>>>> [1] http://www.vldb.org/pvldb/vol2/vldb09-694.pdf
>>>>
>>>> Comments and suggestions are welcome.
>>>> --
>>>> Tatsuo Ishii
>>>> SRA OSS, Inc. Japan
>>>> English: http://www.sraoss.co.jp/index_en.php
>>>> Japanese:http://www.sraoss.co.jp
>>>> _______________________________________________
>>>> pgpool-hackers mailing list
>>>> pgpool-hackers at pgpool.net
>>>> http://www.pgpool.net/mailman/listinfo/pgpool-hackers
>> _______________________________________________
>> pgpool-hackers mailing list
>> pgpool-hackers at pgpool.net
>> http://www.pgpool.net/mailman/listinfo/pgpool-hackers
> _______________________________________________
> pgpool-hackers mailing list
> pgpool-hackers at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-hackers
More information about the pgpool-hackers
mailing list