[pgpool-general-jp: 934] Re: On PostgreSQL 9.0 (SR environment) with load_balance_mode = true, SELECT currval after INSERT sometimes does not read the master's sequence

Tatsuo Ishii ishii @ sraoss.co.jp
Tue Apr 26 20:27:37 JST 2011


This is Ishii.

> Ishii-san,
> 
> Thank you, as always, for your help with pgpool.
> It has helped us a great deal, and we are grateful that you provide software like this.
> 
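> Regarding the issue below: a detailed investigation showed that the cause is different from what I reported. I apologize.
> 
> Originally, it appears that the data could not be read because of the asynchronous lag in Streaming Replication: the SELECT ran before the row had reached the standby.
> 
> I pointed this out to our developer and had the code changed so that BEGIN and END are issued explicitly around the queries,
> but the queries still do not seem to be sent to the master.
> 
> I have attached a log. The SELECT immediately after the INSERT is fine, but the SELECT that uses the seq obtained with currval,
> although it is wrapped in BEGIN ... END, appears to be executed on DB node id 2 rather than id 0.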

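Let me confirm: the SELECT on currval is sent only to the primary (node
id 0), correct? The problem is that the subsequent SELECT, which does not
contain currval, goes to a node other than the primary, and this happens
even when it is enclosed in an explicit transaction.

I'm sorry, but this is the result of an "improvement" in pgpool-II 3.x.
In 3.x, SELECTs inside an explicit transaction are also load balanced.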

As a workaround, add an SQL comment to the problematic SELECT (the one
near the end, SELECT id, name FROM public.test WHERE id = '72') so that
it reads:

/*NO LOAD BALANCE*/ SELECT id, name FROM public.test WHERE id = '72'

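Incidentally, the comment does not have to be NO LOAD BALANCE; FOO, BAR,
or anything else will do. The point is that the SQL statement must not
begin with "SELECT". This forces the query to be executed on the primary.
There is no need to use an explicit transaction.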
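
As a rough illustration of this workaround on the application side, the
comment could be added just before the query is sent. The sketch below
uses Python purely for illustration; the helper name force_primary and
its tag parameter are hypothetical and are not part of pgpool-II.

```python
def force_primary(sql: str, tag: str = "NO LOAD BALANCE") -> str:
    """Prefix an SQL comment so the statement no longer begins with SELECT.

    Hypothetical helper: it only rewrites the query string. Whether the
    statement is then routed to the primary depends on pgpool-II.
    """
    if "*/" in tag:
        # A tag containing the comment terminator would close the comment early.
        raise ValueError("tag must not contain '*/'")
    return f"/*{tag}*/ {sql.lstrip()}"


# The query from this thread, rewritten before it is sent through pgpool.
query = force_primary("SELECT id, name FROM public.test WHERE id = '72'")
print(query)
```

Only the SELECTs that must see rows just written need this prefix; other
SELECTs can be left as they are so that they continue to be load balanced.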
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> Thank you in advance.
> 
> 
> On April 24, 2011 at 14:16, Tatsuo Ishii <ishii @ sraoss.co.jp> wrote:
>> This is Ishii.
>>
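>> I cannot reproduce this here... First, could you restart pgpool with the
>> debug options (-d -n) and show me the log up to the point where currval
>> was sent to the standby (slave)? It will probably be fairly large, so it
>> would be better to send it by private email.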
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese: http://www.sraoss.co.jp
>>
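>>> Nice to meet you.
>>> My name is Yamashita, from Cyberstar.
>>>
>>> Regarding the subject: we are trying to balance SELECTs with PostgreSQL 9.0.1 SR (3 servers) + pgpool-II 3.0.3,
>>> but SELECT currval after an INSERT sometimes seems to be sent to a slave.
>>>
>>> Could you advise whether there is a problem with our configuration or something similar?
>>> If you need logs or anything else, I will provide them; please let me know.
>>>
>>> We have tried various things by trial and error, such as setting
>>> black_function_list = 'nextval,setval,currval,lastval' and enabling insert lock,
>>> but have not been able to resolve it, so we would greatly appreciate your help.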
>>>
>>> ◆Frequency
>>> About 30% (roughly once in every three attempts)
>>>
>>> ◆SQL executed after the INSERT
>>> SELECT currval('step_id_seq') AS step_id;
>>>
>>> ◆Error message
>>> Warning: pg_exec() [function.pg-exec]: Query failed: ERROR: currval of
>>> sequence "step_id_seq" is not yet defined in this session in
>>> /home/public_html/test.php on line xxx
>>>
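>>> ◆Environment
>>> * We have observed the behavior in other environments as well and confirmed that it is not environment-dependent.
>>>
>>> ■PostgreSQL (9.0.1) x 3 servers
>>> CentOS5.5 x86_64
>>>
>>> pgpool-recovery, pgpool-regclass, and pgpool-walrecrunning from pgpool 3.0.3
>>> are installed on all hosts.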
>>>
>>> ■pgpool 3.0.3
>>> ScientificLinux
>>>
>>> ■pgpool.conf
>>> #
>>> # pgpool-II configuration file sample for Stream replication/Hot standby.
>>> # $Header: /cvsroot/pgpool/pgpool-II/pgpool.conf.sample-stream,v 1.5 2010/09/01 04:58:47 kitagawa Exp $
>>>
>>> # Host name or IP address to listen on: '*' for all, '' for no TCP/IP
>>> # connections
>>> #listen_addresses = 'localhost'
>>> listen_addresses = '*'
>>>
>>> # Port number for pgpool
>>> #port = 9999
>>> port = 5432
>>>
>>> # Port number for pgpool communication manager
>>> pcp_port = 9898
>>>
>>> # Unix domain socket path.  (The Debian package defaults to
>>> # /var/run/postgresql.)
>>> socket_dir = '/tmp'
>>>
>>> # Unix domain socket path for pgpool communication manager.
>>> # (Debian package defaults to /var/run/postgresql)
>>> pcp_socket_dir = '/tmp'
>>>
>>> # Unix domain socket path for the backend. Debian package defaults to
>>> # /var/run/postgresql!
>>> backend_socket_dir = '/tmp'
>>>
>>> # pgpool communication manager timeout. 0 means no timeout. This
>>> # parameter is ignored now.
>>> pcp_timeout = 10
>>>
>>> # number of pre-forked child process
>>> num_init_children = 32
>>>
>>> # Number of connection pools allowed for a child process
>>> max_pool = 4
>>>
>>> # If idle for this many seconds, child exits.  0 means no timeout.
>>> child_life_time = 300
>>>
>>> # If idle for this many seconds, connection to PostgreSQL closes.
>>> # 0 means no timeout.
>>> connection_life_time = 300
>>>
>>> # If child_max_connections connections were received, child exits.
>>> # 0 means no exit.
>>> child_max_connections = 0
>>>
>>> # If client_idle_limit is n (n > 0), the client is forced to be
>>> # disconnected whenever after n seconds idle (even inside an explicit
>>> # transactions!)
>>> # 0 means no disconnect.
>>> client_idle_limit = 0
>>>
>>> # Maximum time in seconds to complete client authentication.
>>> # 0 means no timeout.
>>> authentication_timeout = 60
>>>
>>> # Logging directory
>>> logdir = '/tmp'
>>>
>>> # pid file name
>>> pid_file_name = '/var/run/pgpool/pgpool.pid'
>>>
>>> # Replication mode
>>> replication_mode = false
>>>
>>> # Load balancing mode, i.e., all SELECTs are load balanced.
>>> load_balance_mode = true
>>>
>>> # If there's a disagreement with the packet kind sent from backend,
>>> # then degenerate the node which is most likely "minority".  If false,
>>> # just force to exit this session.
>>> replication_stop_on_mismatch = false
>>>
>>> # If there's a disagreement with the number of affected tuples in
>>> # UPDATE/DELETE, then degenerate the node which is most likely
>>> # "minority".
>>> # If false, just abort the transaction to keep the consistency.
>>> failover_if_affected_tuples_mismatch = false
>>>
>>> # If true, replicate SELECT statement when replication_mode or
>>> # parallel_mode is enabled.
>>> # A priority of replicate_select is higher than load_balance_mode.
>>> replicate_select = false
>>>
>>> # Semicolon separated list of queries to be issued at the end of a
>>> # session
>>> reset_query_list = 'ABORT; DISCARD ALL'
>>> # for 8.2 or older this should be as follows.
>>> #reset_query_list = 'ABORT; RESET ALL; SET SESSION AUTHORIZATION DEFAULT'
>>>
>>> # white_function_list is a comma separated list of function names
>>> # those do not write to database. Any functions not listed here
>>> # are regarded to write to database and SELECTs including such
>>> # writer-functions will be executed on master(primary) in master/slave
>>> # mode, or executed on all DB nodes in replication mode.
>>> #
>>> # black_function_list is a comma separated list of function names
>>> # those write to database. Any functions not listed here
>>> # are regarded not to write to database and SELECTs including such
>>> # read-only-functions will be executed on any DB nodes.
>>> #
>>> # You cannot make full both white_function_list and
>>> # black_function_list at the same time. If you specify something in
>>> # one of them, you should make empty other.
>>> #
>>> # Pre 3.0 pgpool-II recognizes nextval and setval in hard coded
>>> # way. Following setting will do the same as the previous version.
>>> # white_function_list = ''
>>> # black_function_list = 'nextval,setval'
>>> white_function_list = ''
>>> black_function_list = 'nextval,setval,currval,lastval'
>>>
>>> # If true print timestamp on each log line.
>>> print_timestamp = true
>>>
>>> # If true, operate in master/slave mode.
>>> master_slave_mode = true
>>>
>>> # Master/slave sub mode. either 'slony' or 'stream'. Default is 'slony'.
>>> master_slave_sub_mode = 'stream'
>>>
>>> # If the standby server delays more than delay_threshold,
>>> # any query goes to the primary only. The unit is in bytes.
>>> # 0 disables the check. Default is 0.
>>> # Note that health_check_period required to be greater than 0
>>> # to enable the functionality.
>>> delay_threshold = 10000000
>>>
>>> # 'always' logs the standby delay whenever health check runs.
>>> # 'if_over_threshold' logs only if the delay exceeds delay_threshold.
>>> # 'none' disables the delay log.
>>> log_standby_delay = 'if_over_threshold'
>>>
>>> # If true, cache connection pool.
>>> connection_cache = true
>>>
>>> # Health check timeout.  0 means no timeout.
>>> health_check_timeout = 20
>>>
>>> # Health check period.  0 means no health check.
>>> health_check_period = 0
>>>
>>> # Health check user
>>> health_check_user = 'nobody'
>>>
>>> # Execute command by failover.
>>> # special values:  %d = node id
>>> #                  %h = host name
>>> #                  %p = port number
>>> #                  %D = database cluster path
>>> #                  %m = new master node id
>>> #                  %M = old master node id
>>> #                  %% = '%' character
>>> #
>>> failover_command = ''
>>>
>>> # Execute command by failback.
>>> # special values:  %d = node id
>>> #                  %h = host name
>>> #                  %p = port number
>>> #                  %D = database cluster path
>>> #                  %m = new master node id
>>> #                  %M = old master node id
>>> #                  %% = '%' character
>>> #
>>> failback_command = ''
>>>
>>> # If true, trigger fail over when writing to the backend communication
>>> # socket fails. This is the same behavior of pgpool-II 2.2.x or
>>> # earlier. If set to false, pgpool will report an error and disconnect
>>> # the session.
>>> fail_over_on_backend_error = true
>>>
>>> # If true, automatically locks a table with INSERT statements to keep
>>> # SERIAL data consistency.  If the data does not have SERIAL data
>>> # type, no lock will be issued. An /*INSERT LOCK*/ comment has the
>>> # same effect.  A /*NO INSERT LOCK*/ comment disables the effect.
>>> insert_lock = true
>>>
>>> # If true, ignore leading white spaces of each query while pgpool judges
>>> # whether the query is a SELECT so that it can be load balanced.  This
>>> # is useful for certain APIs such as DBI/DBD which are known to add an
>>> # extra leading white space.
>>> ignore_leading_white_space = true
>>>
>>> # If true, print all statements to the log.  Like the log_statement option
>>> # to PostgreSQL, this allows for observing queries without engaging in full
>>> # debugging.
>>> #log_statement = false
>>> log_statement = true
>>>
>>> # If true, print all statements to the log. Similar to log_statement except
>>> # that prints DB node id and backend process id info.
>>> #log_per_node_statement = false
>>> log_per_node_statement = true
>>>
>>> # If true, incoming connections will be printed to the log.
>>> #log_connections = false
>>> log_connections = true
>>>
>>> # If true, hostname will be shown in ps status. Also shown in
>>> # connection log if log_connections = true.
>>> # Be warned that this feature will add overhead to look up hostname.
>>> log_hostname = true
>>>
>>> # if non 0, run in parallel query mode
>>> parallel_mode = false
>>>
>>> # if non 0, use query cache
>>> enable_query_cache = false
>>>
>>> #set pgpool2 hostname
>>> pgpool2_hostname = ''
>>>
>>> # system DB info
>>> system_db_hostname = 'localhost'
>>> system_db_port = 5433
>>> system_db_dbname = 'pgpool'
>>> system_db_schema = 'pgpool_catalog'
>>> system_db_user = 'pgpool'
>>> system_db_password = ''
>>>
>>> # backend_hostname, backend_port, backend_weight
>>> # here are examples
>>> #backend_hostname0 = 'host1'
>>> #backend_port0 = 5432
>>> #backend_weight0 = 1
>>> #backend_data_directory0 = '/data'
>>> #backend_hostname1 = 'host2'
>>> #backend_port1 = 5432
>>> #backend_weight1 = 1
>>> #backend_data_directory1 = '/data1'
>>>
>>> backend_hostname0 = '192.168.3.240'
>>> backend_port0 = 5432
>>> backend_weight0 = 1
>>> backend_data_directory0 = '/home/postgres/9.0.1_5432/data'
>>> backend_hostname1 = '192.168.3.240'
>>> backend_port1 = 5433
>>> backend_weight1 = 1
>>> backend_data_directory1 = '/home/postgres/9.0.1_5433/data'
>>> backend_hostname2 = '192.168.3.241'
>>> backend_port2 = 5432
>>> backend_weight2 = 1
>>> backend_data_directory2 = '/home/postgres/9.0.1/data'
>>>
>>> # - HBA -
>>>
>>> # If true, use pool_hba.conf for client authentication.
>>> enable_pool_hba = false
>>>
>>> # - online recovery -
>>> # online recovery user
>>> recovery_user = 'nobody'
>>>
>>> # online recovery password
>>> recovery_password = ''
>>>
>>> # execute a command in first stage.
>>> #recovery_1st_stage_command = ''
>>> #recovery_1st_stage_command = 'recovery_1st_stage.sh'
>>>
>>> # execute a command in second stage.
>>> recovery_2nd_stage_command = ''
>>>
>>> # maximum time in seconds to wait for the recovering node's postmaster
>>> # start-up. 0 means no wait.
>>> # this is also used as a timer waiting for clients disconnected before
>>> # starting 2nd stage
>>> recovery_timeout = 90
>>>
>>> # If client_idle_limit_in_recovery is n (n > 0), the client is forced
>>> # to be disconnected whenever after n seconds idle (even inside an
>>> # explicit transactions!) in the second stage of online recovery.
>>> # n = -1 forces clients to be disconnected immediately.
>>> # 0 disables this functionality(wait forever).
>>> # This parameter only takes effect in recovery 2nd stage.
>>> client_idle_limit_in_recovery = 0
>>>
>>> # Specify table name to lock. This is used when rewriting lo_creat
>>> # command in replication mode. The table must exist and has writable
>>> # permission to public. If the table name is '', no rewriting occurs.
>>> lobj_lock_table = ''
>>>
>>> # If true, enable SSL support for both frontend and backend connections.
>>> # note that you must also set ssl_key and ssl_cert for SSL to work in
>>> # the frontend connections.
>>> ssl = false
>>> # path to the SSL private key file
>>> #ssl_key = './server.key'
>>> # path to the SSL public certificate file
>>> #ssl_cert = './server.cert'
>>>
>>> # If either ssl_ca_cert or ssl_ca_cert_dir is set, then certificate
>>> # verification will be performed to establish the authenticity of the
>>> # certificate.  If neither is set to a nonempty string then no such
>>> # verification takes place.  ssl_ca_cert should be a path to a single
>>> # PEM format file containing CA root certificate(s), whereas ssl_ca_cert_dir
>>> # should be a directory containing such files.  These are analogous to the
>>> # -CAfile and -CApath options to openssl verify(1), respectively.
>>> #ssl_ca_cert = ''
>>> #ssl_ca_cert_dir = ''
>>>
>>> # Debug message verbosity level. 0: no message, 1 <= : more verbose
>>> debug_level = 0
>>>
>>> Thank you in advance.
>>> _______________________________________________
>>> pgpool-general-jp mailing list
>>> pgpool-general-jp @ sraoss.jp
>>> http://www.sraoss.jp/mailman/listinfo/pgpool-general-jp
>>
> 
> 
> 
> -- 
> --
> _/_/_/ Yamashita Daisuke
> _/_/_/ 〒541-0054
> _/_/_/ 8F Sun Mullion NBF Tower, 2-6-12 Minamihonmachi, Chuo-ku, Osaka
> _/_/_/ Cyberstar Co., Ltd., Information Systems Division
> _/_/_/ TEL:06-6241-7281 FAX:06-6241-7282
> _/_/_/ e-mail yamashita @ cyberstar.co.jp

