[pgpool-general: 1456] Re: Fwd: Pgpool 3.2.2 issue on AIX
Tatsuo Ishii
ishii at postgresql.org
Tue Mar 5 07:52:23 JST 2013
> Hi Tatsuo, as promised I've tested pgpool 3.2.3 on AIX 5.2.
>
> It behaves in a really strange way. At the beginning, when I was starting
> pgpool, it seemed to be working fine, but
> after a few stops and starts it began to behave really strangely.
>
> It seems to be confused by some cached status. Does some cache_status
> file exist? I removed the backend files .s.PGSQL.9898 and .s.PGSQL.9999, but
> it doesn't change anything.
I don't think so. The cached status file is located at
/tmp/pgpool_status in your case. The log says:
2013-03-04 17:44:18 ERROR: pid 72000: Could not read backend status file as /tmp/pgpool_status. reason: No such file or directory
So pgpool did not read the cached status, and this is normal. (BTW, if you
want to be sure the status file is ignored, you can start pgpool with
the -D option.)
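
As a quick way to confirm what the log line above reports, here is a
minimal sketch that checks whether the status file is there. The helper
name describe_status_file is made up for illustration; only the path
/tmp/pgpool_status and the -D option come from this thread.

```python
import os


def describe_status_file(path="/tmp/pgpool_status"):
    """Return a short note on whether a cached status file is present.

    Hypothetical helper: the default path is the one from the log in this
    thread; pass a different one if your configuration puts it elsewhere.
    """
    try:
        st = os.stat(path)
    except FileNotFoundError:
        # Matches the "No such file or directory" case from the log.
        return "absent: pgpool starts with no cached backend status"
    return "present (%d bytes): start pgpool with -D to discard it" % st.st_size


if __name__ == "__main__":
    print(describe_status_file())
```

If it reports "absent", the behaviour you see does not come from a stale
status file.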
From the log file I noticed that the health check function failed to
connect to backend 1. Are you sure that backend 1 (devlam0) allows
connections as user postgres without a password from the host where pgpool is
running?
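
One way to test this from the pgpool host without any client library is to
send a PostgreSQL protocol 3.0 startup packet and look at the first reply.
The following is only a rough sketch, not what pgpool itself does: the
function names are invented for illustration, and port 5432 and database
"postgres" are assumptions, since the thread does not state them.

```python
import socket
import struct


def build_startup_message(user, database):
    """Build a PostgreSQL protocol 3.0 StartupMessage."""
    params = (b"user\x00" + user.encode() + b"\x00"
              + b"database\x00" + database.encode() + b"\x00"
              + b"\x00")  # an empty name terminates the parameter list
    body = struct.pack("!I", 196608) + params  # 196608 = 3 << 16, i.e. version 3.0
    return struct.pack("!I", len(body) + 4) + body  # length counts itself


def classify_reply(reply):
    """Interpret the first bytes of the server's reply to the startup message."""
    if len(reply) >= 9 and reply[:1] == b"R":
        code = struct.unpack("!I", reply[5:9])[0]
        if code == 0:
            return "trusted: no password needed"
        return "authentication required (code %d)" % code  # e.g. 3 = cleartext, 5 = md5
    if reply[:1] == b"E":
        return "server returned an error: check pg_hba.conf and the server log"
    return "unexpected or empty reply"


def check_backend(host, port=5432, user="postgres", database="postgres", timeout=5.0):
    """Connect to host:port (assumed values) and classify the first reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_startup_message(user, database))
        reply = b""
        while len(reply) < 9:  # message type + length + auth code
            chunk = sock.recv(9 - len(reply))
            if not chunk:
                break
            reply += chunk
    return classify_reply(reply)
```

Running print(check_backend("devlam0")) on the machine where pgpool runs
should say "trusted" if the backend really accepts user postgres without a
password from there.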
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
> No changes in pgpool.conf nor in postgresql.conf.
> It went back to losing communication with the main process, as described for
> version 3.2.2: the main process exits abnormally.
>
> I've tried to change some settings in pgpool.conf, but they have no
> effect on the pool's behaviour.
>
> For example, I tried to increase the child life time because I suspected some
> issue with child process destruction:
>
> child_life_time = 100000000000
>                                    # Pool exits after being idle for this many seconds
> child_max_connections = 0
>                                    # Pool exits after receiving that many connections
>                                    # 0 means no exit
> connection_life_time = 0
>                                    # Connection to backend closes after being idle for this many seconds
>                                    # 0 means no close
> client_idle_limit = 0
>
>
> I attach the debug output, the truss output, and pgpool.conf.
>
> Hoping that someone can help me solve this problem on AIX.
>
> Thanks
>
> Great!
>>
>> Yes, I've just installed 3.2.3 and it seems to be working great!
>>
>> Obviously I will test it properly and let you know how this
>> version works on AIX.
>>
>> Thanks again
>>
>>
>>
>> 2013/2/22 Tatsuo Ishii <ishii at postgresql.org>
>>
>>> I'm going to check the data you posted.
>>>
>>> In the meantime, I think it is possible your problem is caused by the
>>> bug fixed in pgpool-II 3.2.3, especially if the problem goes away when
>>> health checking is disabled. Can you try 3.2.3?
>>> --
>>> Tatsuo Ishii
>>> SRA OSS, Inc. Japan
>>> English: http://www.sraoss.co.jp/index_en.php
>>> Japanese: http://www.sraoss.co.jp
>>>
>>> > Thanks for your quick reply Tatsuo,
>>> >
>>> > Before getting your reply I tried to understand where the main process
>>> > stops by adding a print on the line right above every exit call in
>>> > main.c.
>>> > I found out that it never stops by calling exit explicitly.
>>> >
>>> > After your reply I traced its behaviour using truss.
>>> > The truss output is in the attached file.
>>> >
>>> >
>>> > By commenting out the dup2 calls in the daemonize function, I was able to see
>>> > which error happens just before the process "death".
>>> >
>>> > This is the error:
>>> >
>>> > pool_flush_it: write failed to backend (0). reason: Socket is not connected offset: 0 wlen: 41
>>> >
>>> > It happens after the first health_check call.
>>> >
>>> > It seems that the socket is not connected to the local backend (backend 0
>>> > is on the same host where pgpool is running), but strangely
>>> > replication to backend 0 works normally, so I think that it is connected.
>>> >
>>> > The real trouble is that, having lost the main process, pgpool will never
>>> > check for failover or failback.
>>> >
>>> > The pgsql version of every backend is 8.3. I attach the pgpool config file
>>> > too.
>>> >
>>> >
>>> > Thanks again for your help
>>> >
>>> >
>>> > --
>>> >
>>> > Daniele Di vito
>>> >
>>> >
>>> > 2013/2/21 Tatsuo Ishii <ishii at postgresql.org>
>>> >
>>> >> > Hi everybody, I've compiled pgpool 3.2.2 on AIX 5.2.
>>> >> >
>>> >> > I configured the pool to use replication mode. The configuration is
>>> >> > working really well on some Linux virtual machines, but when I try to use
>>> >> > pgpool with the same configuration on AIX I run into a big problem.
>>> >> >
>>> >> > Starting with "pgpool -d", the server seems to start normally. It
>>> >> > creates the pcp process and the pool connections waiting for
>>> >> > connection requests.
>>> >> >
>>> >> > When I run "ps -fu postgres | grep pgpool" I get this output:
>>> >> >
>>> >> >
>>> >> >
>>> >> > postgres  62164 1 0 10:05:48 - 0:00 pgpool: wait for connection request
>>> >> > postgres  75470 1 0 10:05:48 - 0:00 pgpool: PCP: wait for connection request
>>> >> > postgres  84072 1 0 10:05:48 - 0:00 pgpool: wait for connection request
>>> >> > postgres  96828 1 0 10:05:48 - 0:00 pgpool: wait for connection request
>>> >> > postgres 100026 1 0 10:05:48 - 0:00 pgpool: wait for connection request
>>> >> > postgres 106670 1 0 10:05:47 - 0:00 pgpool: wait for connection request
>>> >> > postgres 109864 1 0 10:05:48 - 0:00 pgpool: worker process
>>> >> > postgres 116412 1 0 10:05:48 - 0:00 pgpool: wait for connection request
>>> >> >
>>> >> > But, as you can see from the output above, no pgpool daemon is
>>> >> > running, and every subprocess it created now has ppid 1.
>>> >> >
>>> >> > If I look into pgpool.pid I get a pid that is not running on the AIX
>>> >> > machine.
>>> >> > Obviously, if I try to stop pgpool it says that the process is not
>>> >> > running, so I have to kill every process and remove every temporary file
>>> >> > manually.
>>> >> >
>>> >> > If I run it without daemonizing, using "pgpool -n",
>>> >> >
>>> >> > the pgpool -n process is listed for some minutes in the "ps -fu postgres |
>>> >> > grep pgpool" output and every subprocess has the right ppid.
>>> >> > Some minutes later I get the same output I listed for the "pgpool -d"
>>> >> > start.
>>> >> >
>>> >> > Any idea how to solve this problem?
>>> >> >
>>> >> > I've already tried to find some error in debug mode, but no error
>>> >> > is listed.
>>> >>
>>> >> Does AIX have something like "strace" or "truss"? If so, taking a
>>> >> system call trace with it may provide valuable information. You should
>>> >> trace system calls until the pgpool-II parent process disappears.
>>> >> --
>>> >> Tatsuo Ishii
>>> >> SRA OSS, Inc. Japan
>>> >> English: http://www.sraoss.co.jp/index_en.php
>>> >> Japanese: http://www.sraoss.co.jp
>>> >>
>>>
>>
>>