[pgpool-hackers: 3243] Re: Some performance improvements for Pgpool-II

Mon Feb 18 13:32:37 JST 2019

> Hi Ishii San
> 
> Can you have a look at the attached patch which tries to extract some
> performance in the area of query parsing and query analysis for routing
> decisions. Most of the performance gains from the changes in the patch can
> be observed in large data INSERT statements.
> 
> Patch contains the following changes
> ==========
> 1-- The idea here is since Pgpool-II only needs a very little information
> about the queries especially for the insert queries to decide where it
> needs to send the query,
> for example: For the INSERT queries we only need the type of query and the
> relation name.
> But since the parser we use in Pgpool-II is taken from PostgreSQL source
> which parses the complete query including the value lists ( which is not
> required by Pgpool).
> This parsing of value part seems very harmless in small statements but in
> case of INSERTs with lots of column values and large data in each value
> item, this becomes significant.
> So this patch adds a smaller bison grammar rule to short circuit the INSERT
> statement parsing when it gets the enough information required for
> Pgpool-II.
> 
> 2-- The patch also re-arranges some of the if statements in
> pool_where_to_send() function and tries to make sure the pattern_compare
> and pool_has_function_call calls should only be made when they are
> absolutely necessary.
> 
> 3--Another thing this patch does is, it tries to save the raw_parser()
> calls in case of un-recognised queries. Instead of invoking the parser of
> "dummy read" and "dummy write" queries in case of syntax error in original
> query, the patch adds the functions to get pre-built parse_trees for these
> dummy queries.
> 
> 4-- strlen() call is removed from scanner_init() function and is passed in
> as an argument to it, and the reason is we already have the query length in
> most cases before invoking the parser so why waste CPU cycles on it. Again
> this becomes significant in case of large query strings.
> 
> Finally the patch tries to remove the unnecessary calls of
> pool_is_likely_select()
> 
> As mentioned above the area of improvements in this patch are mostly around
> writing queries and for the testing purpose I used a INSERT query with
> large binary insert data and I am getting a very huge performance gains
> with this patch
> 
> *Current Master Branch*
> ================
> usama=# \i sample.sql
>  id
> -----
>  104
> (1 row)
> 
> INSERT 0 1
> Time: *2059.807* ms (00:02.060)
> 
> *WITH PATCH*
> ===============
> usama=# \i sample.sql
>  id
> -----
>  102
> (1 row)
> 
> INSERT 0 1
> Time: *314.237* ms
> 
> 
> Performance gain* 655.50 %*
> 
> 
> Comments and suggestions?
> 
> Please let me know if you also want the test data I used for the INSERT test

Yes, please share it with me.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp