[pgpool-hackers: 3243] Re: Some performance improvements for Pgpool-II
Tatsuo Ishii
ishii at sraoss.co.jp
Mon Feb 18 13:32:37 JST 2019
> Hi Ishii San
>
> Could you have a look at the attached patch? It tries to squeeze out some
> performance in query parsing and in the query analysis used for routing
> decisions. Most of the gain from these changes shows up in INSERT
> statements that carry large amounts of data.
>
> Patch contains the following changes
> ==========
> 1-- Pgpool-II needs very little information about a query to decide where
> to send it. For an INSERT, for example, it only needs the query type and
> the relation name.
> However, the parser used in Pgpool-II is taken from the PostgreSQL source
> and parses the complete query, including the value lists, which Pgpool-II
> does not need.
> Parsing the values is harmless for small statements, but it becomes
> significant for INSERTs with many column values and large data in each
> value item.
> So this patch adds a smaller bison grammar rule that short-circuits INSERT
> statement parsing as soon as it has the information Pgpool-II requires.
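As a rough illustration of the short-circuit idea in item 1 (not the bison rule from the patch, which is not shown in this message), here is a hand-written prefix scan that stops as soon as it has the relation name and never looks at the VALUES list. All demo_* names are hypothetical, and the sketch deliberately ignores quoted identifiers, comments and WITH clauses.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Skip leading whitespace. */
static const char *demo_skip_ws(const char *p)
{
    while (*p != '\0' && isspace((unsigned char) *p))
        p++;
    return p;
}

/*
 * Case-insensitively match keyword kw at p. The keyword must be followed
 * by whitespace. Returns the position just after the keyword, or NULL.
 */
static const char *demo_match_kw(const char *p, const char *kw)
{
    while (*kw != '\0') {
        if (tolower((unsigned char) *p) != tolower((unsigned char) *kw))
            return NULL;
        p++;
        kw++;
    }
    return isspace((unsigned char) *p) ? p : NULL;
}

/*
 * Hypothetical helper: copy the target relation of "INSERT INTO <name> ..."
 * into out and return true. The scan ends at the first character after the
 * name, so the cost does not depend on the size of the value list.
 */
bool demo_insert_target(const char *query, char *out, size_t outsz)
{
    assert(query != NULL && out != NULL && outsz > 0);

    const char *p = demo_match_kw(demo_skip_ws(query), "insert");
    if (p == NULL)
        return false;
    p = demo_match_kw(demo_skip_ws(p), "into");
    if (p == NULL)
        return false;
    p = demo_skip_ws(p);

    size_t n = 0;
    while (isalnum((unsigned char) *p) || *p == '_' || *p == '.') {
        if (n + 1 >= outsz)
            return false;       /* name does not fit in out */
        out[n++] = *p++;
    }
    out[n] = '\0';
    return n > 0;
}
```

A grammar rule has the advantage of sharing the real lexer, so quoting and comments are handled the same way as in the full parser; the scan above only shows why stopping early makes the cost independent of the data size.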
>
> 2-- The patch also rearranges some of the if statements in the
> pool_where_to_send() function so that pattern_compare and
> pool_has_function_call are called only when they are absolutely
> necessary.
>
> 3-- The patch also tries to save raw_parser() calls for unrecognised
> queries. Instead of invoking the parser on the "dummy read" and "dummy
> write" queries when the original query has a syntax error, the patch adds
> functions that return pre-built parse trees for these dummy queries.
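One way to picture item 3 is a lazily built, cached tree: the parser runs once for the dummy query and every later caller gets the same result back. This is only a sketch under that assumption. The demo_* names, the struct and the "SELECT 1" placeholder are hypothetical, and the patch may build its trees differently.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical parse-tree node; the real tree is a PostgreSQL node list. */
typedef struct demo_tree {
    const char *query;
} demo_tree;

/* Counts how often the stand-in parser actually runs. */
int demo_parse_calls = 0;

/* Stand-in for an expensive raw_parser()-style call. */
demo_tree *demo_parse(const char *query)
{
    demo_tree *t = malloc(sizeof *t);
    assert(t != NULL);
    t->query = query;
    demo_parse_calls++;
    return t;
}

/* Build the dummy read tree on first use, then reuse it for the process lifetime. */
const demo_tree *demo_get_dummy_read_tree(void)
{
    static demo_tree *cached = NULL;

    if (cached == NULL)
        cached = demo_parse("SELECT 1");   /* placeholder dummy query */
    return cached;
}
```

Callers must treat the returned tree as read-only, since every caller shares the same object.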
>
> 4-- The strlen() call is removed from the scanner_init() function, and the
> length is passed in as an argument instead. In most cases we already have
> the query length before invoking the parser, so there is no reason to
> spend CPU cycles recomputing it. Again, this becomes significant for large
> query strings.
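The change in item 4 amounts to the difference between the two initialisers below. The struct and function names are hypothetical stand-ins, not the actual scanner_init() signature.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the scanner state. */
typedef struct {
    const char *buf;    /* start of the query text */
    size_t      len;    /* bytes in buf, excluding the terminating NUL */
} demo_scanner;

/* Old shape: the callee walks the whole string to find its length. */
void demo_scanner_init_strlen(demo_scanner *s, const char *query)
{
    assert(s != NULL && query != NULL);
    s->buf = query;
    s->len = strlen(query);     /* O(n) in the query size */
}

/* New shape: the caller already knows the length and passes it in. */
void demo_scanner_init_len(demo_scanner *s, const char *query, size_t len)
{
    assert(s != NULL && query != NULL);
    s->buf = query;
    s->len = len;               /* O(1), no rescan */
}
```

The second form relies on the caller passing a length that matches the buffer, which holds when the length comes from the protocol message that delivered the query.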
>
> Finally, the patch tries to remove the unnecessary calls to
> pool_is_likely_select().
>
> As mentioned above, the improvements in this patch are mostly around write
> queries. For testing I used an INSERT query with large binary data, and I
> am seeing a very large performance gain with this patch.
>
> *Current Master Branch*
> ================
> usama=# \i sample.sql
>  id
> -----
>  104
> (1 row)
>
> INSERT 0 1
> Time: *2059.807* ms (00:02.060)
>
> *WITH PATCH*
> ===============
> usama=# \i sample.sql
>  id
> -----
>  102
> (1 row)
>
> INSERT 0 1
> Time: *314.237* ms
>
>
> Performance gain: *655.50 %* (2059.807 ms / 314.237 ms, i.e. about 6.55 times faster)
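For reference, the two timings above can be expressed either as a speedup ratio or as the share of elapsed time saved. The helpers below are hypothetical names used only to show the arithmetic.

```c
#include <assert.h>

/* Speedup factor: how many times faster the new timing is. */
double demo_speedup(double before_ms, double after_ms)
{
    assert(before_ms > 0.0 && after_ms > 0.0);
    return before_ms / after_ms;
}

/* Percentage of the original elapsed time that was saved. */
double demo_time_saved_pct(double before_ms, double after_ms)
{
    assert(before_ms > 0.0 && after_ms > 0.0);
    return (before_ms - after_ms) / before_ms * 100.0;
}
```

With 2059.807 ms before and 314.237 ms after, the ratio is about 6.55 and the time saved is about 84.7 %.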
>
>
> Comments and suggestions?
>
> Please let me know if you also want the test data I used for the INSERT test.
Yes, please share it with me.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp