[pgpool-general: 7329] Re: Setup Pgpool2 with Postgresql Streaming Replication
Tatsuo Ishii
ishii at sraoss.co.jp
Wed Oct 28 09:26:06 JST 2020
(Cc:ed to pgpool-general)
> Hi Tatsuo,
>
> I have attached the additional files. Yes, pcp_recovery_command hangs after it prompts me for the postgres password and I enter the password. Then the cursor moves to the next line and it just hangs there. It doesn't return the " pcp_recovery_node -- Command Successful" message as it should.
Ok, let's invest this first. Can you share the pgpool log file right
after you hit pcp_recovery_node command, along with the exact command
line you hit?
If normal, you see something like these lines in the log file (note
that the following lines are from Pgpool-II 4.2 beta1. Pgpool-II 4.0
will show a little bit different log)
2020-10-28 09:19:18: main pid 19853: LOG: fork a new worker child process with pid: 19915
2020-10-28 09:19:24: pcp_main pid 19914: LOG: forked new pcp worker, pid=19919 socket=6
2020-10-28 09:19:24: pcp_child pid 19919: LOG: starting recovering node 1
2020-10-28 09:19:24: pcp_child pid 19919: LOG: executing recovery
2020-10-28 09:19:24: pcp_child pid 19919: DETAIL: starting recovery command: "SELECT pgpool_recovery('basebackup.sh', 'localhost', '/home/t-ishii/work/Pgpool-II/current/d/data1', '11002', 1, '11003')"
2020-10-28 09:19:24: pcp_child pid 19919: LOG: executing recovery
2020-10-28 09:19:24: pcp_child pid 19919: DETAIL: disabling statement_timeout
2020-10-28 09:19:26: pcp_child pid 19919: LOG: node recovery, 1st stage is done
2020-10-28 09:19:26: pcp_child pid 19919: LOG: checking if postmaster is started
2020-10-28 09:19:26: pcp_child pid 19919: DETAIL: trying to connect to postmaster on hostname:/tmp database:postgres user:t-ishii (retry 0 times)
2020-10-28 09:19:26: pcp_child pid 19919: LOG: checking if postmaster is started
2020-10-28 09:19:26: pcp_child pid 19919: DETAIL: failed to connect to postmaster on hostname:/tmp database:postgres user:t-ishii
2020-10-28 09:19:29: pcp_child pid 19919: LOG: checking if postmaster is started
2020-10-28 09:19:29: pcp_child pid 19919: DETAIL: trying to connect to postmaster on hostname:/tmp database:postgres user:t-ishii (retry 1 times)
2020-10-28 09:19:29: pcp_child pid 19919: LOG: node recovery, node: 1 restarted
2020-10-28 09:19:29: pcp_child pid 19919: LOG: received failback request for node_id: 1 from pid [19919]
2020-10-28 09:19:29: main pid 19853: LOG: Pgpool-II parent process has received failover request
2020-10-28 09:19:29: main pid 19853: LOG: starting fail back. reconnect host /tmp(11003)
2020-10-28 09:19:29: main pid 19853: LOG: Node 0 is not down (status: 1)
:
:
:
> Thanks,
> Alan
>
> -----Original Message-----
> From: Tatsuo Ishii [mailto:ishii at sraoss.co.jp]
> Sent: Saturday, October 24, 2020 2:21 AM
> To: Nguyen, Tu <tnguyen3 at honolulu.gov>
> Cc: ishii at sraoss.co.jp
> Subject: Re: Setup Pgpool2 with Postgresql Streaming Replication
>
> Hi Alan,
>
> I also need pg_hba.conf and recovery_1st_stage script.
>
> Also please answer my queston.
>
>>> but now I'm getting the new issue where it just hangs and does nothing.
>>
>> You mean pcp_recovery_command hangs?
>
>> The filename starts with "server1" which is actually the primary node "name: ltpgsql11". I was just interchanging the names for documentation purposes. I didn't include the standby node "name: ltpgsql12" configuration files because I think they should be the same except the host names are switched around in the files. I'm wondering if it's my .pgpass file location too which is currently stored in /home/postgres user. I think I saw somewhere in one of the scripts, the default path was set to a different location (/var/lib/...)?
>>
>> Thanks,
>> Alan
>>
>>
>> -----Original Message-----
>> From: Tatsuo Ishii [mailto:ishii at sraoss.co.jp]
>> Sent: Friday, October 23, 2020 11:53 PM
>> To: Nguyen, Tu <tnguyen3 at honolulu.gov>
>> Cc: ishii at sraoss.co.jp; pgpool-general at pgpool.net
>> Subject: Re: Setup Pgpool2 with Postgresql Streaming Replication
>>
>>> but now I'm getting the new issue where it just hangs and does nothing.
>>
>> You mean pcp_recovery_command hangs?
>>
>>> I suspect it's some settings in one of the configuration files (pgpool.conf, postgresql.conf, pg_hba.conf, pool_hba.conf, recovery_1st_stage, pgpool_remote_start, etc.) or system settings (authentication, etc.) that I applied when following the instructions in the documentation. Probably need to correct something.
>>
>> Can you share those files (including recovery.conf) so that I can examine them?
>>
>>> Yeah, I was probably restarting both pgpool and postgresql service repeatedly. Looks like after I used the scripts from your new link, it went away, but now I'm getting the new issue where it just hangs and does nothing. I suspect it's some settings in one of the configuration files (pgpool.conf, postgresql.conf, pg_hba.conf, pool_hba.conf, recovery_1st_stage, pgpool_remote_start, etc.) or system settings (authentication, etc.) that I applied when following the instructions in the documentation. Probably need to correct something.
>>>
>>> -----Original Message-----
>>> From: Tatsuo Ishii [mailto:ishii at sraoss.co.jp]
>>> Sent: Friday, October 23, 2020 11:21 PM
>>> To: Nguyen, Tu <tnguyen3 at honolulu.gov>
>>> Cc: ishii at sraoss.co.jp; pgpool-general at pgpool.net
>>> Subject: Re: Setup Pgpool2 with Postgresql Streaming Replication
>>>
>>> It seems the primary server was restarted again and again without any shutdown message. Pretty strange. Maybe you manually reset PostgreSQL or whole system repeatedly?
>>>
>>>> 2020-10-21 01:25:12 HST [865]: [1-1] 0 00000: LOG: starting PostgreSQL 12.3 (Ubuntu 12.3-1.pgdg16.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609, 64-bit
>>>> 2020-10-21 01:25:12 HST [865]: [2-1] 0 00000: LOG: listening on IPv4 address "0.0.0.0", port 6432
>>>> 2020-10-21 01:25:12 HST [865]: [3-1] 0 00000: LOG: listening on IPv6 address "::", port 6432
>>>> 2020-10-21 01:25:12 HST [865]: [4-1] 0 00000: LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.6432"
>>>> 2020-10-21 01:25:12 HST [865]: [5-1] 0 00000: LOG: redirecting log output to logging collector process
>>>> 2020-10-21 01:25:12 HST [865]: [6-1] 0 00000: HINT: Future log output will appear in directory "log".
>>>> 2020-10-21 01:27:52 HST [1505]: [1-1] 0 00000: LOG: starting PostgreSQL 12.3 (Ubuntu 12.3-1.pgdg16.04+1) on x86_6 4-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609, 64-bit
>>>> 2020-10-21 01:27:52 HST [1505]: [2-1] 0 00000: LOG: listening on IPv4 address "0.0.0.0", port 6432
>>>> 2020-10-21 01:27:52 HST [1505]: [3-1] 0 00000: LOG: listening on IPv6 address "::", port 6432
>>>> 2020-10-21 01:27:52 HST [1505]: [4-1] 0 00000: LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.6432"
>>>> 2020-10-21 01:27:52 HST [1505]: [5-1] 0 00000: LOG: redirecting log output to logging collector process
>>>> 2020-10-21 01:27:52 HST [1505]: [6-1] 0 00000: HINT: Future log output will appear in directory "log".
>>>> 2020-10-21 01:32:25 HST [2726]: [1-1] 0 00000: LOG: starting PostgreSQL 12.3 (Ubuntu 12.3-1.pgdg16.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609, 64-bit
>>>> 2020-10-21 01:32:25 HST [2726]: [2-1] 0 00000: LOG: listening on IPv4 address "0.0.0.0", port 6432
>>>> 2020-10-21 01:32:25 HST [2726]: [3-1] 0 00000: LOG: listening on IPv6 address "::", port 6432
>>>
>>>> Looking back at our emails, looks like I sent it on Wednesday, Oct. 21. Don't remember what time, maybe several hours before you responded.
>>>>
>>>> -----Original Message-----
>>>> From: Tatsuo Ishii [mailto:ishii at sraoss.co.jp]
>>>> Sent: Friday, October 23, 2020 10:58 PM
>>>> To: Nguyen, Tu <tnguyen3 at honolulu.gov>
>>>> Cc: pgpool-general at pgpool.net
>>>> Subject: Re: Setup Pgpool2 with Postgresql Streaming Replication
>>>>
>>>> The log started at 2020-10-18 22:42:46.989 HST and ended at 2020-10-23 06:46:54 HST.
>>>> When did you execute pcp_recovery_node exactly?
>>>>
>>>>> Hi Tatsuo,
>>>>>
>>>>> Here are the logs.
>>>>>
>>>>> Thanks,
>>>>> Alan
>>>>>
>>>>> -----Original Message-----
>>>>> From: Tatsuo Ishii [mailto:ishii at sraoss.co.jp]
>>>>> Sent: Friday, October 23, 2020 10:30 PM
>>>>> To: Nguyen, Tu <tnguyen3 at honolulu.gov>
>>>>> Cc: pgpool-general at pgpool.net
>>>>> Subject: Re: Setup Pgpool2 with Postgresql Streaming Replication
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> Looks like I may have been able to run the pcp_recovery_node command, but after I entered the password, it goes to the next line and just stays there, no errors or anything. Maybe it's stuck?
>>>>>
>>>>> If you provide the log I requested, I can examine what's wrong.
>>>>>
>>>>>> 1) I need the PostgreSQL log while you are getting:
>>>>>>>> Ran this command on the primary server: pcp_recovery_node -h
>>>>>>>> 192.168.80.90 -p 9898 -U postgres -n 1 And received this error:
>>>>>>>> ERROR: recovery is checking if postmaster is started
>>>>>>>> DETAIL: postmaster on hostname:"ltpgsql12" database:"template1"
>>>>>>>> user:"postgres" failed to start in 90 second
>>>>>
>>>>>>I looked on the standby server node I'm trying to recover, with the "top" command and it says CPU usage is around 35% for the pg_basebackup command. It's been almost 5 hours like this. Does anyone know how long it takes? I heard that the standby recovery takes a long time, but does it really take this long? I have around 15 databases that totals about 300-500 MB of data that Pgpool2 is supposedly trying to recover right now.
>>>>>>
>>>>>> Thanks,
>>>>>> Alan
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Tatsuo Ishii [mailto:ishii at sraoss.co.jp]
>>>>>> Sent: Thursday, October 22, 2020 1:43 AM
>>>>>> To: Nguyen, Tu <tnguyen3 at honolulu.gov>
>>>>>> Cc: pgpool-general at pgpool.net
>>>>>> Subject: Re: Setup Pgpool2 with Postgresql Streaming Replication
>>>>>>
>>>>>>> 2020-10-19 10:06:44.941 HST [24277] LOG: starting PostgreSQL
>>>>>>> 12.3 (Ubuntu 12.3-1.pgdg16.04+1) on x86_64-pc-linux-gnu, compiled
>>>>>>> by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609, 64-bit
>>>>>>
>>>>>> The document is for CentOS, while you are using Ubuntu. I am not sure the documentation is useful for you or not. Anyway...
>>>>>>
>>>>>> 1) I need the PostgreSQL log while you are getting:
>>>>>>
>>>>>>>> I'm at step 7.3.8.1 Set up PostgreSQL standby server:
>>>>>>>> Ran this command on the primary server: pcp_recovery_node -h
>>>>>>>> 192.168.80.90 -p 9898 -U postgres -n 1 And received this error:
>>>>>>>> ERROR: recovery is checking if postmaster is started
>>>>>>>> DETAIL: postmaster on hostname:"ltpgsql12" database:"template1"
>>>>>>>> user:"postgres" failed to start in 90 second
>>>>>>
>>>>>> 2) Do you follow the step "7.3.4. Before Starting"? I am asking because you got this error:
>>>>>>
>>>>>> 2020-10-19 13:10:23.521 HST [6757] postgres at postgres DETAIL: User "postgres" has no password assigned.
>>>>>> Connection matched pg_hba.conf line 8: "host all all 0.0.0.0/0 md5"
>>>>>>
>>>>>> Have you executed below in the documentation?
>>>>>>
>>>>>> [server1]# psql -U postgres -p 5432
>>>>>> postgres=# SET password_encryption = 'scram-sha-256';
>>>>>> postgres=# CREATE ROLE pgpool WITH LOGIN;
>>>>>> postgres=# CREATE ROLE repl WITH REPLICATION LOGIN;
>>>>>> postgres=# \password pgpool
>>>>>> postgres=# \password repl
>>>>>> postgres=# \password postgres
>>>>>>
>>>>>>
>>>>>>
>>>>>> From: "Nguyen, Tu" <tnguyen3 at honolulu.gov>
>>>>>> Subject: RE: Setup Pgpool2 with Postgresql Streaming Replication
>>>>>> Date: Thu, 22 Oct 2020 09:35:36 +0000
>>>>>> Message-ID:
>>>>>> <F152F79048C4864E9825A767687750E404E82E97F0 at CchExMail2.cchnl.hnl>
>>>>>>
>>>>>>> Hi Tatsuo,
>>>>>>>
>>>>>>> Yes I'm using Pgpool-II version 4.0.
>>>>>>>
>>>>>>> Ssh doesn't seem to be the issue, I was able to test ssh
>>>>>>> successfully to both servers using this command from the new
>>>>>>> documentation: ssh postgres at serverX -i ~/.ssh/id_rsa_pgpool
>>>>>>>
>>>>>>> I have attached the postgresql-12-main.log file.
>>>>>>>
>>>>>>> After we get the online recovery task fixed, are there recommended step by step instructions on how to best setup Postgresql Streaming Replication for Pgpool-II 4.0 clustering? When I execute the "show pool_nodes" command, the replication_state and replication_sync_state values are blank and data is not being replicated.
>>>>>>>
>>>>>>> On another topic, can the Pgpool2 Connection Pooling feature handle 3,000-4,000 simultaneous connections effectively?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Alan
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Tatsuo Ishii [mailto:ishii at sraoss.co.jp]
>>>>>>> Sent: Wednesday, October 21, 2020 10:21 PM
>>>>>>> To: Nguyen, Tu <tnguyen3 at honolulu.gov>
>>>>>>> Cc: pgsql-general at postgresql.org
>>>>>>> Subject: Re: Setup Pgpool2 with Postgresql Streaming Replication
>>>>>>>
>>>>>>> Hi Alan,
>>>>>>>
>>>>>>> This is not the best forum to discuss Pgpool-II related topics. I advice you to go to the Pgpool-II dedicated forum:
>>>>>>>
>>>>>>> https://www.pgpool.net/mailman/listinfo/pgpool-general
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I'm following the steps from:
>>>>>>>>
>>>>>>>> https://access.crunchydata.com/documentation/pgpool/4.0.0/exampl
>>>>>>>> e
>>>>>>>> -
>>>>>>>> c
>>>>>>>> l
>>>>>>>> u
>>>>>>>> s
>>>>>>>> ter.html
>>>>>>>
>>>>>>> This documentation looks pretty old. The original and the latest documentation for Pgpool-II 4.0 is here (I assume you are using Pgpool-II 4.0):
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> https://www.pgpool.net/docs/40/en/html/example-cluster.html
>>>>>>>
>>>>>>> I strongly suggest to look into this.
>>>>>>>
>>>>>>>> I'm at step 7.3.8.1 Set up PostgreSQL standby server:
>>>>>>>> Ran this command on the primary server: pcp_recovery_node -h
>>>>>>>> 192.168.80.90 -p 9898 -U postgres -n 1 And received this error:
>>>>>>>> ERROR: recovery is checking if postmaster is started
>>>>>>>> DETAIL: postmaster on hostname:"ltpgsql12" database:"template1"
>>>>>>>> user:"postgres" failed to start in 90 second
>>>>>>>>
>>>>>>>> How can I get this command to run successfully?
>>>>>>>
>>>>>>> Probably you have a problem with ssh settings. The newer and original document describes far detailed steps to set up ssh settings. Please take a look at "7.3.2. Requirements" section in the newer document.
>>>>>>>
>>>>>>> In the mean time to confirm that the problem is related to ssh, we need to look into the PostgreSQL log (not Pgpool-II log) on primary PostgreSQL node. Please share it (again, you'd better to post messages to the pgpool-general mailing list).
>>>>>>>
>>>>>>>> Also, when I ran this command: psql -p 5433 -c "show pool_nodes"
>>>>>>>> It shows the following pgpool2 node status, but when I did a
>>>>>>>> listing of databases on the primary and standby servers, I don't
>>>>>>>> see the databases on the primary replicated to the standby. How
>>>>>>>> can I setup the Postgresql Streaming Replication and check if it's working?
>>>>>>>
>>>>>>> Standby status is down because you failed to execute online recovery. You need to fix it.
>>>>>>>
>>>>>>>> node_id | hostname | port | status | lb_weight | role | select_cnt |
>>>>>>>> load_balance_node | replication_delay | replication_state |
>>>>>>>> replication_sync_state | last_status_change
>>>>>>>> ---------+-----------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
>>>>>>>> 0 | ltpgsql11 | 6432 | up | 0.500000 | primary | 0 |
>>>>>>>> true | 0 | |
>>>>>>>> | 2020-10-21 11:56:48
>>>>>>>> 1 | ltpgsql12 | 6432 | down | 0.500000 | standby | 0 |
>>>>>>>> false | 0 | |
>>>>>>>> | 2020-10-21 11:56:48
>>>>>>>> (2 rows)
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Alan
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Sent from:
>>>>>>>> https://www.postgresql-archive.org/PostgreSQL-general-f1843780.h
>>>>>>>> t
>>>>>>>> m
>>>>>>>> l
>>>>>>>>
>>>>>>>>
More information about the pgpool-general
mailing list