[pgpool-hackers: 3662] Re: Inconsistency in native replication mode
Tatsuo Ishii
ishii at sraoss.co.jp
Sat Jun 20 10:09:14 JST 2020
>> (2) regression test
I have added 030 test for this.
> I have committed/pushed this.
>
> https://git.postgresql.org/gitweb/?p=pgpool2.git;a=commit;h=22930861c4006d230361be3779a84da3d0d24b66
>
>> Here's a work-in-progress patch to introduce the Pangea algorythm [1].
>> I deicded to add new clustering mode called "snapshot isolation mode"
>> for this. What I have done are:
>>
>> - Implement Pangea algorythm for r/w transactions.
>> - Create docs for the new clustering mode.
>>
>> Please note that to use the new clustering mode, PostgreSQL's default
>> transaction isolation mode must be "REPEATABLE READ" (not READ
>> COMMITTED nor SERIALIZABLE). At this point I tend to think that the
>> new feature should be marked as "experimental" since I am not
>> convinced that the new mode does not have any defect at the when 4.2
>> is released.
>>
>> BTW, I have added new section "Bibliography" to give a credit to
>> Pangea. However it seems the rendering of it is different from what
>> PostgreSQL has. Can someone please help?
>>
>> "Takeshi Mishima and Hiroshi Nakamura, "Pangea: An Eager Database
>> Replication Middleware guaranteeing Snapshot Isolation without
>> modification of Database Servers", Proc. VLDB Conference, Aug. 2009."
>>
>> I was expecting something like this:
>>
>> "[mishima2009] Pangea: An Eager Database Replication Middleware
>> guaranteeing Snapshot Isolation without modification of Database
>> Servers", "Takeshi Mishima and Hiroshi Nakamura, Proc. VLDB
>> Conference, Aug. 2009."
>>
>> What I have not done are:
>>
>> (1) Check if the transaction is read only and if so, then load balance
>> it.
>> (2) regression test
>>
>> (3) How to deal with SERIAL/sequence.
>>
>> (4) Deal with extended query protocol
>>
>> For (1), maybe we could have following rules:
>>
>> - If it's a read only SELECT and it's not inside an explicit
>> transaction, we can load balance it.
>>
>> - If it's a read only SELECT and it's inside an explicit transaction,
>> we can not load balance it.
>>
>> The latter is different from native replication mode.
>>
>> For (2) maybe we need new testing framework to give a control the
>> timing to give queries to backend. If you have an idea, please let me
>> know.
>>
>> For (3) I have no idea. Please help.
>>
>> Best regards,
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese:http://www.sraoss.co.jp
>>
>>> Hi,
>>>
>>> I have studied $subject issue. Suppose we have two sessions S1, S2 and
>>> two servers N1 (master), N2 (slave). S1 updates table t1 in an
>>> explicit transaction. S2 read t1. Below is a demonstration how we find
>>> an inconsistency with native replication mode. "S1:N1 SQL statement"
>>> denotes an SQL statement is executed on node N1 in session S1. Table
>>> t1 has an integer column i. There is only on row in t1 and the initial
>>> value is 0. Time is increasing from top to bottom.
>>>
>>> S1/N1: BEGIN;
>>> S1/N2: BEGIN;
>>> S1/N1: UPDATE t1 SET i = i + 1; -- i = 1
>>> S1/N2: UPDATE t1 SET i = i + 1; -- i = 1
>>> S1/N1: COMMIT;
>>> S2/N1: SELECT i FROM t1; -- i = 1
>>> S2/N2: SELECT i FROM t1; -- i = 0
>>> S1/N2: COMMIT;
>>>
>>> So inconsistent values of i of t1 are observed in session S2.
>>>
>>> To avoid the inconsistency, we can control the order of concurrent
>>> COMMIT and the first SQL in a transaction, which acquires the
>>> snapshot, so that they are never executed concurrently. (This is a
>>> proposed algorithm in [1]). Note that this requires transactions run
>>> in REPEATABLE READ (or SERIALIZABLE), not default READ COMMITTED.
>>>
>>> Suppose we have a mutex lock M. To execute that first SQL in a
>>> transaction, Pgpool-II needs to acquire M before it. Also it needs to
>>> acquire M before issuing COMMIT. After hiring this we have:
>>>
>>> S1/N1: BEGIN;
>>> S1/N2: BEGIN;
>>> S1/N1: UPDATE t1 SET i = i + 1; -- i = 1
>>> S1/N2: UPDATE t1 SET i = i + 1; -- i = 1
>>> S1: try to acquire M and succeeded
>>> S1/N1: COMMIT;
>>> S2: try to acquire M and failed
>>> S1/N2: COMMIT;
>>> S1: release M
>>> S2: wake up
>>> S2/N1: SELECT i FROM t1; -- i = 1
>>> S2/N2: SELECT i FROM t1; -- i = 1
>>>
>>> or:
>>>
>>> S1/N1: BEGIN;
>>> S1/N2: BEGIN;
>>> S1/N1: UPDATE t1 SET i = i + 1; -- i = 1 in session S1
>>> S1/N2: UPDATE t1 SET i = i + 1;
>>> S1: try to acquire M and failed
>>> S2: try to acquire M and succeeded
>>> S2/N1: SELECT i FROM t1; -- i = 0 in session S2
>>> S2/N2: SELECT i FROM t1; -- i = 0 in session S2
>>> S2: release M
>>> S1: wake up
>>> S1/N1: COMMIT;
>>> S2/N1: COMMIT;
>>>
>>> As you see, we can now avoid the inconsistency.
>>>
>>> Also [1] claims that only read only transaction can distribute SELECT
>>> (load balance). Currently Pgpool-II load balance read only SELECT
>>> whenever possible. i.e. if SELECT is followed by write query, start
>>> transaction (if no transaction is started).
>>>
>>> [1] http://www.vldb.org/pvldb/vol2/vldb09-694.pdf
>>>
>>> Comments and suggestions are welcome.
>>> --
>>> Tatsuo Ishii
>>> SRA OSS, Inc. Japan
>>> English: http://www.sraoss.co.jp/index_en.php
>>> Japanese:http://www.sraoss.co.jp
>>> _______________________________________________
>>> pgpool-hackers mailing list
>>> pgpool-hackers at pgpool.net
>>> http://www.pgpool.net/mailman/listinfo/pgpool-hackers
> _______________________________________________
> pgpool-hackers mailing list
> pgpool-hackers at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-hackers
More information about the pgpool-hackers
mailing list