[pgpool-hackers: 4300] Re: Occasional 005.jdbc test failure

Thu Apr 6 12:55:29 JST 2023

>> We occasionaly see 005.jdbc test failure. Typical error is something
>> like this:
>> 
>> 2023-02-22 08:51:47.704: PostgreSQL JDBC Driver pid 12420: LOG:  DB node id: 0 backend pid: 12488 statement: Parse: COMMIT
>> 2023-02-22 08:51:47.705: PostgreSQL JDBC Driver pid 12420: LOG:  pool_send_and_wait: Error or notice message from backend: : DB node id: 0 backend pid: 12488 statement: "COMMIT" message: "prepared statement "S_1" already exists"
>> 2023-02-22 08:51:47.705: PostgreSQL JDBC Driver pid 12420: LOG:  Parse: Error or notice message from backend: : DB node id: 0 backend pid: 12488 statement: "COMMIT" message: "prepared statement "S_1" already exists"
>> 
>> I found possible cause of the error two seconds ago:
>> 
>> 2023-02-22 08:51:45.241: PostgreSQL JDBC Driver pid 12420: LOG:  Sync message from frontend.
>> 2023-02-22 08:51:45.241: PostgreSQL JDBC Driver pid 12420: LOG:  Terminate message from frontend.
>> 2023-02-22 08:51:45.241: PostgreSQL JDBC Driver pid 12420: LOG:  DB node id: 0 backend pid: 12488 statement: DISCARD ALL
>> 2023-02-22 08:51:45.242: PostgreSQL JDBC Driver pid 12420: LOG:  pool_send_and_wait: Error or notice message from backend: : DB node id: 0 backend pid: 12488 statement: "DISCARD ALL" message: "DISCARD ALL cannot be executed within a pipeline"
>> 
>> "DISCARD ALL" was generated by pgpool (reset_query_list) to discard
>> some objects including prepared statements created in the
>> session. Since DISCARD ALL failed, the prepared statement S_1 was not
>> removed. Thus the next session failed because S_1 already existed.
>> 
>> Question is why "DISCARD ALL cannot be executed within a pipeline"
>> error?
>> 
>> So I consult PostgreSQL document:
>> https://www.postgresql.org/docs/15/protocol-flow.html#PROTOCOL-FLOW-PIPELINING
>> 
>> "However, there are a few DDL commands (such as CREATE DATABASE) that
>> cannot be executed inside a transaction block. If one of these is
>> executed in a pipeline, it will fail unless it is the first command in
>> the pipeline."
>> 
>> "DISCARD ALL" is on the same boat as CREATE DATABASE (i.e. cannot be
>> executed inside a transaction block). It seems we have to wait for
>> pipeline being closed before issuing DISCARD ALL. How? Our PostgreSQL
>> doc says:
>> 
>> "When using this method, completion of the pipeline must be determined
>> by counting ReadyForQuery messages and waiting for that to reach the
>> number of Syncs sent."
>> 
>> This means that we should not issue DISCARD ALL before receiving at
>> least one ReadyForQuery from backend.
>> 
>> Waiting for ReadyForQuery is not a big deal. But pgpool cannot
>> unconditionaly wait for ReadyForQuery. What if ReadyForQuery was
>> already sent? I am going to study this issue.
> 
> After studying the issue I came to a conclusion that there's no
> reliable way to wait or not wait for ReadyForQuery in a reliable
> way. So attached is a patch trying to attack the issue in completely
> different way: instead of reading ReadyForQuery, discard the
> connection pool if pipleline or any other error occurred.  For this
> purpose a global variable "reset_query_error" is introduced. In the
> reset query loop, SimpleQuery() is called, then it calls
> pool_send_and_wait(), and it calls
> per_node_error_log(). per_node_error_log() checks whether backend
> returns error response. If so, reset_query_error" is set to true in
> per_node_error_log(). If it is set to true, backend_cleanup() discards
> the connection pool. This is a little bit ugly but I cannot think of
> better implementation at this point.
> 
> Comments/suggestions are welcome.

I have pushed slightly modified version of the patch. For now I pushed
only to master branch. If it works, I will back patch to all supported
branches.

Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp