In this Tutrial, I explain a simple way to try "On memory query cache" with shared memory.
What you need is a Linux box in which PostgreSQL and pgpool-II are installed. You can use on memory query cache with pgpool in any mode: replication mode, master/slave mode and raw mode.
Set "memory_cache_enabled" on to enable the functionality of on memory query cache.
# If on, use the memory cache functionality, off by default memory_cache_enabled = on
Pgpool creates "OID map" files which indicates that each cache are for which database and table. Specify the directory for it as you like.
# Temporary work directory to record table oids memqcache_oiddir = '/var/log/pgpool/oiddir'
The parameter "memqcache_method" is the choice of the cache storage, and it defaults to "shmem".
# Cache storage method. either 'shmem'(shared memory) or 'memcached'. 'shmem' by default memqcache_method = 'shmem'
The parameter "log_per_node_statement" helps you to know in which backend node a query is executed. Set this parameter on.
log_per_node_statement = on # Log all statements # with node and backend informations
Start pgpool-II with "-n" (which means not-daemon mode) and redirect log messages into pgpool.log.
$ {installed_dir}/bin/pgpool -n -f {installed_dir}/etc/pgpool.conf > pgpool.log 2>&1
Create a database and a table.
$ createdb -p 9999 test $ psql -p 9999 test test=> CREATE TABLE table_cache_ok (a INT); CREATE TABLE test=> INSERT INTO table_cache_ok VALUES (1), (2), (3); INSERT 0 3 test=> SELECT * FROM table_cache_ok ORDER BY a; a --- 1 2 3 (3 rows)
Execute a SELECT.
test=> SELECT * FROM table_cache_ok WHERE a = 1; a --- 1 (1 row)
The output of log_per_node_statement shows that the SELECT is executed exactly.
LOG: DB node id: 0 backend pid: 11203 statement: SELECT * FROM table_cache_ok WHERE a = 1;
Execute the same SELECT.
test=> SELECT * FROM table_cache_ok WHERE a = 1; a --- 1 (1 row)
This time the log message shows that the result of SELECT is from caches.
LOG: query result fetched from cache. statement: SELECT * FROM table_cache_ok WHERE a = 1;
When a table is updated, the cached results of SELECT from the table can be outdated. Pgpool discards all the cache of that table automatically by default.
Specify the action by "memqcache_auto_cache_invalidation". It's set "on" by default.
# If on, invalidation of query cache is triggered by corresponding # DDL/DML/DCL(and memqcache_expire). If off, it is only triggered # by memqcache_expire. on by default. memqcache_auto_cache_invalidation = on
Restart pgpool after changing the parameter's value.
Confirm that you can get the result of a SELECT from cache.
test=> SELECT * FROM table_cache_ok WHERE a = 1; LOG: query result fetched from cache. statement: SELECT * FROM table_cache_ok WHERE a = 1;
Execute INSERT INTO the table.
test=> INSERT INTO table_cache_ok VALUES (5); INSERT 0 1
Execute the same SELECT, and the SELECT is executed exactly.
test=> SELECT * FROM table_cache_ok WHERE a = 1; LOG: DB node id: 0 backend pid: 11203 statement: SELECT * FROM table_cache_ok WHERE a = 1;
Pgpool discards caches older than the specified expiration period.
Specify the expiration period by "memqcache_expire". Default is 0 seconds, but set 5 seconds to confirm for trial this time.
# Memory cache entry life time specified in seconds. # 0 means infinite life time. 0 by default. memqcache_expire = 5
Restart pgpool after changing the parameter's value.
Confirm that you can get the result of a SELECT from cache.
test=> SELECT * FROM table_cache_ok WHERE a = 1; LOG: query result fetched from cache. statement: SELECT * FROM table_cache_ok WHERE a = 1;
Wait in 5 seconds...
Execute the same SELECT, and the SELECT is executed exactly.
test=> SELECT * FROM table_cache_ok WHERE a = 1; LOG: DB node id: 0 backend pid: 11203 statement: SELECT * FROM table_cache_ok WHERE a = 1;
If you want to cache the result of only some tabeles, specify the tables by "white_memqcache_table_list".
You can use regular expression (to which added automatically ^ and $).
white_memqcache_table_list = '.*_cache_ok' # Comma separated list of table names to memcache # that don't write to database # Regexp are accepted
Reload pgpool after changing the parameter's value.
$ {installed_dir}/bin/pgpool reload
Create a table not to cache.
$ psql -p 9999 test test=> CREATE TABLE table_cache_ng (a INT); CREATE TABLE test=> INSERT INTO table_cache_ng VALUES (1), (2), (3); INSERT 0 3
Confirm if the SELECT FROM a table in white_memqcache_table_list is cached.
test=> SELECT * FROM table_cache_ok WHERE a = 1; LOG: DB node id: 0 backend pid: 11203 statement: SELECT oid FROM pg_database WHERE datname = 'test' test=> SELECT * FROM table_cache_ok WHERE a = 1; LOG: query result fetched from cache. statement: SELECT * FROM table_cache_ok WHERE a = 1;
Confirm if the SELECT FROM a table NOT in white_memqcache_table_list is NOT cached.
test=> SELECT * FROM table_cache_ng WHERE a = 1; LOG: DB node id: 0 backend pid: 11203 statement: SELECT * FROM table_cache_ok WHERE a = 1; test=> SELECT * FROM table_cache_ng WHERE a = 1; LOG: DB node id: 0 backend pid: 11203 statement: SELECT * FROM table_cache_ok WHERE a = 1;
If you don't want to cache the result of some tabeles, specify the tables by "black_memqcache_table_list".
You can use regular expression (to which added automatically ^ and $).
black_memqcache_table_list = '.*_cache_ng' # Comma separated list of table names not to memcache # that don't write to database # Regexp are accepted
Reload pgpool after changing the parameter's value.
$ {installed_dir}/bin/pgpool reload
Confirm if the SELECT FROM a table in black_memqcache_table_list is NOT cached.
test=> SELECT * FROM table_cache_ng WHERE a = 1; LOG: DB node id: 0 backend pid: 11203 statement: SELECT * FROM table_cache_ok WHERE a = 1; test=> SELECT * FROM table_cache_ng WHERE a = 1; LOG: DB node id: 0 backend pid: 11203 statement: SELECT * FROM table_cache_ok WHERE a = 1;
Confirm if the SELECT FROM a table NOT in black_memqcache_table_list is cached.
test=> SELECT * FROM table_cache_ok WHERE a = 1; LOG: DB node id: 0 backend pid: 11203 statement: SELECT oid FROM pg_database WHERE datname = 'test' test=> SELECT * FROM table_cache_ok WHERE a = 1; LOG: query result fetched from cache. statement: SELECT * FROM table_cache_ok WHERE a = 1;
The size of each SELECT result is limited by "memqcache_maxcache".
# Maximum SELECT result size in bytes. # Must be smaller than memqcache_cache_block_size. Defaults to 400KB. memqcache_maxcache = 409600
If a result is bigger than memqcache_maxcache, the log message tells us that. This is an example when memqcache_maxcache is set 1024 byte and SELECTed table is bigger than 10 MB.
LOG: DB node id: 0 backend pid: 17749 statement: SELECT * FROM pgbench_accounts ; LOG: pool_add_temp_query_cache: data size exceeds memqcache_maxcache. current:983 requested:110 memq_maxcache:1024
If white list and black list are not enough for you, put the text "/* NO QUERY CACHE */" at the head of the queries.
LOG: statement: SELECT * FROM table_cache_ok WHERE a = 1; LOG: query result fetched from cache. statement: SELECT * FROM table_cache_ok WHERE a = 1; LOG: statement: /* NO QUERY CACHE */ SELECT * FROM table_cache_ok WHERE a = 1; LOG: DB node id: 0 backend pid: 18070 statement: /* NO QUERY CACHE */ SELECT * FROM table_cache_ok WHERE a = 1; LOG: statement: /* NO QUERY CACHE */ SELECT * FROM table_cache_ok WHERE a = 1; LOG: DB node id: 0 backend pid: 18070 statement: /* NO QUERY CACHE */ SELECT * FROM table_cache_ok WHERE a = 1;
This way requires to change your program of course.