[pgpool-general: 9004] Re: Massive performance and latency issues when remote replica is attached to cluster.
Tatsuo Ishii
ishii at sraoss.co.jp
Thu Jan 25 15:34:27 JST 2024
> Brian,
>
> I have looked into the log and found some queries were sent to remote node (node id 2).
> This is pretty strange because you said:
>> Load-balancing is turned off.
>
> In this case every queries should be sent to primary node only (node
> id 1).
>
> Are you sure that you set load_balance_node to off?
~~~~~~~~~~~~~~~~
I meant "load_balance_mode".
> Best reagards,
> --
> Tatsuo Ishii
> SRA OSS LLC
> English: http://www.sraoss.co.jp/index_en/
> Japanese:http://www.sraoss.co.jp
>
>> Tatsuo,
>>
>> As requested, I re-ran pgbench with the log_per_node_statement option
>> turned on.
>> I included just a small excerpt, but everything I saw looked to be the same
>> pattern.
>>
>> $ pgbench -h 10.1.1.1 -p1234 -U pgbench -c 75 -j 12 -t 500 -P 15 -r
>> pgbench1 -M extended
>> starting vacuum...end.
>> progress: 15.0 s, 1044.2 tps, lat 69.690 ms stddev 7.344
>> progress: 30.0 s, 1060.1 tps, lat 69.940 ms stddev 7.635
>> transaction type: <builtin: TPC-B (sort of)>
>> scaling factor: 100
>> query mode: extended
>> number of clients: 75
>> number of threads: 12
>> number of transactions per client: 500
>> number of transactions actually processed: 37500/37500
>> latency average = 69.725 ms
>> latency stddev = 7.464 ms
>> tps = 1003.125524 (including connections establishing)
>> tps = 1004.224113 (excluding connections establishing)
>> statement latencies in milliseconds:
>> 0.001 \set aid random(1, 100000 * :scale)
>> 0.001 \set bid random(1, 1 * :scale)
>> 0.000 \set tid random(1, 10 * :scale)
>> 0.000 \set delta random(-5000, 5000)
>> 11.958 BEGIN;
>> 8.933 UPDATE pgbench_accounts SET abalance = abalance + :delta
>> WHERE aid = :aid;
>> 8.921 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
>> 9.230 UPDATE pgbench_tellers SET tbalance = tbalance + :delta
>> WHERE tid = :tid;
>> 10.102 UPDATE pgbench_branches SET bbalance = bbalance + :delta
>> WHERE bid = :bid;
>> 8.907 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime)
>> VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
>> 11.671 END;
>>
>>
>>
>>
>> -----------------------
>>
>> Looped through the logfile to grab the connections coming from pgbench and
>> then find those:
>>
>> for i in `cat pgpool-Tue.log | grep "connecting\ host" | grep
>> "10\.1\.1\.150"|awk '{print $4}'|sed "s/://"`;do cat pgpool-Tue.log |grep
>> $i;done|less
>>
>> A snippet from there:
>> NOTE: node_id 0 and 1 are the "local" db_nodes
>> node_id 2 is the "remote" db_node
>> node_id 1 is the current primary.
>>
>>
>>
>> 2024-01-23 10:10:03: pid 4507: LOG: new connection received
>> 2024-01-23 10:10:03: pid 4507: DETAIL: connecting host=10.1.1.150
>> port=52802
>> 2024-01-23 10:10:03: pid 4507: LOCATION: child.c:1860
>> 2024-01-23 10:10:03: pid 4507: LOG: statement: select count(*) from
>> pgbench_branches
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:224
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: SELECT pg_catalog.version()
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: SELECT count(*) FROM pg_catalog.pg_proc AS p,
>> pg_catalog.pg_namespace AS n WHERE p.proname = 'count' AND n.oid =
>> p.pronamespace AND n.nspname ~ '.*' AND p.provolatile = 'v'
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: select count(*) from pgbench_branches
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: statement: vacuum pgbench_branches
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:224
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: vacuum pgbench_branches
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: statement: vacuum pgbench_tellers
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:224
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: vacuum pgbench_tellers
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: statement: truncate pgbench_history
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:224
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: truncate pgbench_history
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: statement: DISCARD ALL
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:224
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 0 backend pid: 19865
>> statement: DISCARD ALL
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: DISCARD ALL
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 2 backend pid: 17091
>> statement: DISCARD ALL
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: frontend disconnection: session time:
>> 0:00:00.215 user=pgbench database=pgbench1 host=10.1.1.150 port=52802
>> 2024-01-23 10:10:03: pid 4507: LOCATION: child.c:2076
>> 2024-01-23 10:10:03: pid 4507: LOG: new connection received
>> 2024-01-23 10:10:03: pid 4507: DETAIL: connecting host=10.1.1.150
>> port=52828
>> 2024-01-23 10:10:03: pid 4507: LOCATION: child.c:1860
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 0 backend pid: 19865
>> statement: Parse: BEGIN;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: Parse: BEGIN;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 2 backend pid: 17091
>> statement: Parse: BEGIN;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 0 backend pid: 19865
>> statement: Bind: BEGIN;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: Bind: BEGIN;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 2 backend pid: 17091
>> statement: Bind: BEGIN;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 0 backend pid: 19865
>> statement: D message
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: D message
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 2 backend pid: 17091
>> statement: D message
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: statement: BEGIN;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:914
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 0 backend pid: 19865
>> statement: Execute: BEGIN;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: Execute: BEGIN;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 2 backend pid: 17091
>> statement: Execute: BEGIN;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: Parse: UPDATE pgbench_accounts SET abalance = abalance + $1
>> WHERE aid = $2;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: Bind: UPDATE pgbench_accounts SET abalance = abalance + $1 WHERE
>> aid = $2;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: D message
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: statement: UPDATE pgbench_accounts SET
>> abalance = abalance + $1 WHERE aid = $2;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:914
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: Execute: UPDATE pgbench_accounts SET abalance = abalance + $1
>> WHERE aid = $2;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: Parse: SELECT abalance FROM pgbench_accounts WHERE aid = $1;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: Bind: SELECT abalance FROM pgbench_accounts WHERE aid = $1;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: D message
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: statement: SELECT abalance FROM
>> pgbench_accounts WHERE aid = $1;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:914
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: Execute: SELECT abalance FROM pgbench_accounts WHERE aid = $1;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: Parse: UPDATE pgbench_tellers SET tbalance = tbalance + $1 WHERE
>> tid = $2;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: Bind: UPDATE pgbench_tellers SET tbalance = tbalance + $1 WHERE
>> tid = $2;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: D message
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: statement: UPDATE pgbench_tellers SET
>> tbalance = tbalance + $1 WHERE tid = $2;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:914
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: Execute: UPDATE pgbench_tellers SET tbalance = tbalance + $1
>> WHERE tid = $2;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: Parse: UPDATE pgbench_branches SET bbalance = bbalance + $1
>> WHERE bid = $2;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: Bind: UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE
>> bid = $2;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: D message
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: statement: UPDATE pgbench_branches SET
>> bbalance = bbalance + $1 WHERE bid = $2;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:914
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: Execute: UPDATE pgbench_branches SET bbalance = bbalance + $1
>> WHERE bid = $2;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: Parse: INSERT INTO pgbench_history (tid, bid, aid, delta, mtime)
>> VALUES ($1, $2, $3, $4, CURRENT_TIMESTAMP);
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: Bind: INSERT INTO pgbench_history (tid, bid, aid, delta, mtime)
>> VALUES ($1, $2, $3, $4, CURRENT_TIMESTAMP);
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: D message
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: statement: INSERT INTO pgbench_history
>> (tid, bid, aid, delta, mtime) VALUES ($1, $2, $3, $4, CURRENT_TIMESTAMP);
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:914
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: Execute: INSERT INTO pgbench_history (tid, bid, aid, delta,
>> mtime) VALUES ($1, $2, $3, $4, CURRENT_TIMESTAMP);
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 0 backend pid: 19865
>> statement: Parse: END;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: Parse: END;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 2 backend pid: 17091
>> statement: Parse: END;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 0 backend pid: 19865
>> statement: Bind: END;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: Bind: END;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 2 backend pid: 17091
>> statement: Bind: END;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 0 backend pid: 19865
>> statement: D message
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: D message
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 2 backend pid: 17091
>> statement: D message
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: statement: END;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:914
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 0 backend pid: 19865
>> statement: Execute: END;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 1 backend pid: 6857
>> statement: Execute: END;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>> 2024-01-23 10:10:03: pid 4507: LOG: DB node id: 2 backend pid: 17091
>> statement: Execute: END;
>> 2024-01-23 10:10:03: pid 4507: LOCATION: pool_proto_modules.c:3596
>>
>> --
>> Brian Ismay
>> jismay at cenic.org
>>
>> On Sat, Jan 20, 2024 at 10:39 PM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
>>
>>> > Any recommendations would be greatly appreciated. So far, after making
>>> > multiple tuning and settings changes there have been no meaningful
>>> changes.
>>> >
>>> > What we are highly suspicious of is that the latency numbers seem to go
>>> > from around .2 ms to around 8 ms. This is very close to the
>>> round-trip-time
>>> > between our datacenters.
>>>
>>> > statement latencies in milliseconds:
>>> > 0.003 \set aid random(1, 100000 * :scale)
>>> > 0.001 \set bid random(1, 1 * :scale)
>>> > 0.001 \set tid random(1, 10 * :scale)
>>> > 0.001 \set delta random(-5000, 5000)
>>> > 8.844 BEGIN;
>>> > 9.695 UPDATE pgbench_accounts SET abalance = abalance + :delta
>>> > WHERE aid = :aid;
>>> > 8.878 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
>>> > 685.873 UPDATE pgbench_tellers SET tbalance = tbalance + :delta
>>> > WHERE tid = :tid;
>>> > 92.049 UPDATE pgbench_branches SET bbalance = bbalance + :delta
>>> > WHERE bid = :bid;
>>> > 8.826 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime)
>>> > VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
>>> > 8.847 END;
>>>
>>> If BEGIN and END were sent to the remote standby node, then the
>>> latency number makes sense. But theoretically it should not happen
>>> because you turn off load balance mode. In this case everything should
>>> be sent to primary except the connection establishing. But even if
>>> load balance mode is off, pgpool establish connections to both primary
>>> and standbys. However the default option of pgbench is establishing
>>> connection only once at the beginning of session, meaning the
>>> connection overhead should only happen only once in the session. Thus
>>> I can not think of plausible reason for the latency numbers.
>>>
>>> Anyway, can you turn on log_per_node_statement and see everything is
>>> sent to standby? I want to confirm that first.
>>>
>>> Best reagards,
>>> --
>>> Tatsuo Ishii
>>> SRA OSS LLC
>>> English: http://www.sraoss.co.jp/index_en/
>>> Japanese:http://www.sraoss.co.jp
>>>
> _______________________________________________
> pgpool-general mailing list
> pgpool-general at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-general
More information about the pgpool-general
mailing list