[pgpool-hackers: 3124] New feature proposal: shared relation cache
Tatsuo Ishii
ishii at sraoss.co.jp
Thu Nov 15 13:20:44 JST 2018
* Proposal background
Unlike PostgreSQL Pgpool-II cannot directly access system catalogs, it
issues SQL to PostgreSQL system catalog whenever necessary. For
example to execute a simple query
select count(*) from pgbench_accounts;
Pgpool-II issues 9 SELECTs to system catalogs:
2018-11-15 12:57:34: pid 14885: LOG: DB node id: 0 backend pid: 14904 statement: SELECT count(*) FROM pg_catalog.pg_proc AS p WHERE p.proname = 'count' AND p.provolatile = 'i'
2018-11-15 12:57:34: pid 14885: LOG: DB node id: 0 backend pid: 14904 statement: SELECT count(*) FROM pg_catalog.pg_class AS c, pg_attribute AS a WHERE c.relname = 'pg_class' AND a.attrelid = c.oid AND a.attname = 'relistemp'
2018-11-15 12:57:34: pid 14885: LOG: DB node id: 0 backend pid: 14904 statement: SELECT count(*) FROM pg_catalog.pg_class AS c, pg_namespace AS n WHERE c.relname = 'pgbench_accounts' AND c.relnamespace = n.oid AND n.nspname ~ '^pg_temp_'
2018-11-15 12:57:34: pid 14885: LOG: DB node id: 0 backend pid: 14904 statement: SELECT count(*) from (SELECT has_function_privilege('t-ishii', 'pg_catalog.to_regclass(cstring)', 'execute') WHERE EXISTS(SELECT * FROM pg_catalog.pg_proc AS p WHERE p.proname = 'to_regclass')) AS s
2018-11-15 12:57:34: pid 14885: LOG: DB node id: 0 backend pid: 14904 statement: SELECT count(*) FROM pg_catalog.pg_class AS c WHERE c.relname = 'pg_namespace'
2018-11-15 12:57:34: pid 14885: LOG: DB node id: 0 backend pid: 14904 statement: SELECT count(*) FROM pg_class AS c, pg_namespace AS n WHERE c.oid = pg_catalog.to_regclass('"pgbench_accounts"') AND c.relnamespace = n.oid AND n.nspname = 'pg_catalog'
2018-11-15 12:57:34: pid 14885: LOG: DB node id: 0 backend pid: 14904 statement: SELECT count(*) FROM pg_catalog.pg_class AS c WHERE c.oid = pg_catalog.to_regclass('"pgbench_accounts"') AND (c.relkind = 'v' OR c.relkind = 'm')
2018-11-15 12:57:34: pid 14885: LOG: DB node id: 0 backend pid: 14904 statement: SELECT count(*) FROM pg_catalog.pg_class AS c, pg_catalog.pg_attribute AS a WHERE c.relname = 'pg_class' AND a.attrelid = c.oid AND a.attname = 'relpersistence'
2018-11-15 12:57:34: pid 14885: LOG: DB node id: 0 backend pid: 14904 statement: SELECT count(*) FROM pg_catalog.pg_class AS c WHERE c.oid = pg_catalog.to_regclass('"pgbench_accounts"') AND c.relpersistence = 'u'
Of course this is only done for the first time the table (in this case
pgbench_accounts) is accessed. The query result is cached and
subsequent queries use the cache.
However the cache is in pgpool child process local memory. That means:
1) the cache is not shared among pgpool child process
2) if the child process dies, newly created pgpool child process needs
to re-select the system catalogs.
Especially #1 is a headache if num_init_children is large since each
pgpool child needs to issue query. For example if num_init_children
is 100, as many as 900 queries will be sent to PostgreSQL for the
first time.
And I only show the data for single table. If user needs to access 100
different tables, 900*1000 = 90,000 SELECTs are necessary.
* The proposal
So I propose to move the relation cache (relcache) to shared
memory. This should eliminate the problem #1 since now the cache can
be shared different pgpool child process. Moreover the cache remains
even after the pgpool child exits, and problem #2 will be eliminated
together. In the example above the number of SELECTs issued with this
approch will reduce from 90,000 to 9.
* Implementation
Instead of re-invent a relcache in shared memory, I propose to use
existing infrastructure, namely the in memory query cache. By adding a
few low level API, the relcache data can be easily placed on query
cache storage. A bonus of this plan is, the relache can be on
memcached without any additional coding.
Comments and/or suggestuons are welcome.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
More information about the pgpool-hackers
mailing list