5.12. In Memory Query Cache

In memory query cache can be used with all modes of Pgpool-II. The in memory query cache keeps the SELECT results and the results are reused. When the underlying table gets updated, corresponding cache entries are deleted (in case memqcache_auto_cache_invalidation is on. This parameter is on by default), thus restarting Pgpool-II is not necessary.

In memory cache saves the pair of user name, database name, SELECT statement and its result (along with the Bind parameters, if the SELECT is an extended query). If the same user name, database name and SELECTs comes in, Pgpool-II returns the value from cache. Since no SQL parsing nor access to PostgreSQL are involved, the serving of results from the in memory cache is extremely fast.

Note: Basically following SELECTs will not be cached:

    SELECTs including non immutable functions
    SELECTs including temp tables, unlogged tables
    SELECTs including TIMESTAMP WITH TIMEZONE or TIME WITH TIMEZONE
    SELECTs including CAST to TIMESTAMP WITH TIMEZONE or TIME WITH TIMEZONE
    SELECTs including SQLValueFunction (CURRENT_TIME, CURRENT_USER etc.)
    SELECT result is too large (memqcache_maxcache)
    SELECT FOR SHARE/UPDATE
    SELECT starting with "/*NO QUERY CACHE*/" comment
    SELECT including system catalogs
    SELECT uses TABLESAMPLE
    SELECT uses row security enabled tables
   

However, VIEWs and SELECTs accessing unlogged tables can be cached by specifying in the white_memqcache_table_list.

Note: Since consistency between the query cache and database content can be lost by following commands, the query cache is all deleted if they are executed:

    ALTER DATABASE
    ALTER ROLE
    ALTER TABLE
    REVOKE
   

Moreover, if SET ROLE or SET SESSION AUTHORIZATION are executed, the query cache will not be used and new cache will not be created in the session. Pgpool-II assumes that two query cache are different if their session users are different. On the other hand, PostgreSQL manages access control depending on current user. Thus these commands may break the consistency between these two access controls. However, cache invalidation by update commands are performed in the same way in that these commands are not used.

On the other hand, it might be slower than the normal path in some cases, because it adds some overhead to store cache. Moreover when a table is updated, Pgpool-II automatically deletes all the caches related to the table. Therefore, the performance will be degraded by a system with a lot of updates. If the query cache hit ratio (it can be checked by using SHOW POOL_CACHE) is lower than 70%, you might want to disable in memory cache.

A SELECT result is basically registered when SELECT normally finishes. If an explicit transaction is executing, SELECT result is not registered until the transaction is committed. Also in extended query mode, the timing of cache registration varies depending on clustering mode. In streaming replication mode and logical replication mode, cache is registered when Sync message is sent from frontend, and the response (Ready for query message) is returned from backend. Thus even if commands are sent from frontend, the second Execute (SELECT 1) will not use query cache.

   Parse (SELECT 1)
   Bind (SELECT 1)
   Execute (SELECT 1)
   Parse (SELECT 1)
   Bind (SELECT 1)
   Execute (SELECT 1)
   Sync
  

On the other hand, in other clustering mode, since the result of first Execute (SELECT 1) is registered, the second Execute (SELECT 1) will use the query cache.

5.12.1. Enabling in memory query cache

memory_cache_enabled (boolean)

Setting to on enables the memory cache. Default is off.

This parameter can only be set at server start.

Note: The query cache will also be used by shared relation cache if enable_shared_relcache is set to on. Moreover the query cache is used even if memory_cache_enabled parameter is set to off. See Section 5.15 for more details to relation cache.

5.12.2. Choosing cache storage

memqcache_method (string)

Specifies the storage type to be used for the cache. Below table contains the list of all valid values for the parameter.

Table 5-10. Memqcache method options

ValueDescription
'shmem'Use shared memory
'memcached'Use memcached

In general shmem is faster than memcached because the former does not involve any network access. However, the upper limit size of memqcache_total_size is limited by the total shared memory size defined by the system. The size is large enough in modern Linux systems, but in other systems there maybe smaller upper limit. In this case you need to change the system settings regarding the maximum shared memory size. There's a good explanation about shared memory in PostgreSQL documentation.

If you are not sure which memqcache_method to be used, start with shmem.

Default is 'shmem'.

This parameter can only be set at server start.

5.12.3. Common configurations

These below parameter are valid for both shmem and memcached type query cache.

memqcache_expire (integer)

Specifies the life time of query cache in seconds. Default is 0. which means no cache expiration and cache remains valid until the table is updated.

This parameter can only be set at server start.

Note: memqcache_expire and memqcache_auto_cache_invalidation are orthogonal to each other.

memqcache_auto_cache_invalidation (boolean)

Setting to on, automatically deletes the cache related to the updated tables. When off, cache is not deleted.

Default is on.

Note: This parameters memqcache_auto_cache_invalidation and memqcache_expire are orthogonal to each other.

This parameter can be changed by reloading the Pgpool-II configurations.

memqcache_maxcache (integer)

Specifies the maximum size in bytes of the SELECT query result to be cached. The result with data size larger than this value will not be cached by Pgpool-II. When the caching of data is rejected because of the size constraint the following message is shown.

       LOG:   pid 13756: pool_add_temp_query_cache: data size exceeds memqcache_maxcache. current:4095 requested:111 memq_maxcache:4096
      

Note: For the shared memory query('shmem') cache the memqcache_maxcache must be set lower than memqcache_cache_block_size and for 'memcached' it must be lower than the size of slab (default is 1 MB).

This parameter can only be set at server start.

white_memqcache_table_list (string)

Specifies a comma separated list of table names whose SELECT results should be cached by Pgpool-II. This parameter only applies to VIEWs and SELECTs accessing unlogged tables. Regular tables can be cached unless specified by black_memqcache_table_list.

You can use regular expression into the list to match table name (to which ^ and $ are automatically added).

Note: If the queries can refer the table with and without the schema qualification then you must add both entries(with and without schema name) in the list.

	#For example:
	#If the queries sometime use "table1" and other times "public.table1"
	#to refer the table1 then the white_memqcache_table_list
	#would be configured as follows.

	white_memqcache_table_list = "table1,public.table1"

       

This parameter can be changed by reloading the Pgpool-II configurations.

black_memqcache_table_list (string)

Specifies a comma separated list of table names whose SELECT results should NOT be cached by the Pgpool-II.

You can use regular expression into the list to match table name (to which ^ and $ are automatically added),

Note: If the queries can refer the table with and without the schema qualification then you must add both entries(with and without schema name) in the list.

	#For example:
	#If the queries sometime use "table1" and other times "public.table1"
	#to refer the table1 then the black_memqcache_table_list
	#would be configured as follows.

	black_memqcache_table_list = "table1,public.table1"

       

This parameter can be changed by reloading the Pgpool-II configurations.

Note: black_memqcache_table_list precedence over white_memqcache_table_list

memqcache_oiddir (string)

Specifies the full path to the directory for storing the oids of tables used by SELECT queries.

memqcache_oiddir directory contains the sub directories for the databases. The directory name is the OID of the database. In addition, each database directory contains the files for each table used by SELECT statement. Again the name of the file is the OID of the table. These files contains the pointers to query cache which are used as key for deleting the caches.

Note: Normal restart of Pgpool-II does not clear the contents of memqcache_oiddir.

This parameter can be changed by reloading the Pgpool-II configurations.

5.12.4. Configurations to use shared memory

These are the parameters used with shared memory as the cache storage.

memqcache_total_size (integer)

Specifies the shared memory cache size. The default is 64MB.

Each cache is stored in fixed-size blocks specified by memqcache_cache_block_size. The number of blocks can be calculated by memqcache_total_size / memqcache_cache_block_size. Any decimal places are truncated so that the number of blocks is an integer. If the number becomes 0, an error is raised. That is, memqcache_total_size must be greater than memqcache_cache_block_size. Query results and their management data are not stored across multiple blocks, so if the query result data length + 64 bytes is greater than memqcache_cache_block_size, it cannot be stored in a block and will not be cached.

When one block is filled with cache, the next block is used. When all blocks are full, the oldest block is flushed and reused (Note that the oldest block does not necessarily holds the oldest cache data because new cache data is registered whenever the first fitting space is found). While smaller memqcache_total_size does not raise an error, performance decreases because the cache hit ratio decreases. The cache hit ratio can be checked by consulting cache_hit_ratio of SHOW POOL_CACHE.

This parameter can only be set at server start.

memqcache_max_num_cache (integer)

Specifies the number of cache entries. The default is 1,000,000.

If you attempt to register more than memqcache_max_num_cache caches, a block containing cached data is flushed and reused. While smaller memqcache_max_num_cache does not raise an error, performance decreases because the cache hit ratio decreases. The cache hit ratio can be checked by consulting cache_hit_ratio of SHOW POOL_CACHE.

Note: The cache is managed by a hash table in shared memory for fast access. The hash table space size can be calculated by: memqcache_max_num_cache * 64 bytes. Number of hash entries can be found in used_hash_entries of SHOW POOL_CACHE. Number of the hash table entries usually matches memqcache_max_num_cache, but may not match if memqcache_max_num_cache is not a power of 2. In this case it is rounded to a power of 2 greater than memqcache_max_num_cache. Number of hash entries in use can be found at used_hash_entries.

This parameter can only be set at server start.

memqcache_cache_block_size (integer)

Specifies the cache block size. The default is 1MB.

Query results and their management data are not stored across multiple blocks, so if the query result data length + 64 bytes is greater than memqcache_cache_block_size, it cannot be stored in a block and will not be cached.

memqcache_cache_block_size must be set to at least 512.

This parameter can only be set at server start.

5.12.5. Configurations to use memcached

These are the parameters used with memcached as the cache storage.

memqcache_memcached_host (string)

Specifies the host name or the IP address on which memcached works. You can use 'localhost' if memcached and Pgpool-II resides on same server.

This parameter can only be set at server start.

memqcache_memcached_port (integer)

Specifies the port number of memcached. Default is 11211.

This parameter can only be set at server start.