[pgpool-hackers: 4134] Re: invalid degenerate backend request, node id : 2 status: [2] is not valid for failover
Tatsuo Ishii
ishii at sraoss.co.jp
Mon Feb 21 15:40:56 JST 2022
> Dear Tatsuo
>
> I am not an expert on C programming, but if I understand correctly, applying this debugger patch would mean doing it on a fresh copy of the pgpool code,
> re-compiling it and then running another instance of pgpool from that new code. If that’s the procedure, I am afraid, we might not be able to reproduce the
> bug in that new instance as it is so infrequent.
Ok.
> Is there a way we can debug this or get these variables at runtime? I highly doubt it as I already tried increasing the log level to debug5 but couldn't find
> anything related, but I still want to ask because you might know of better alternatives.
The only way is to attach a debugger to the process once the issue is reproduced.
> If not, I will definitely try to run it in a fresh pgpool instance and leave the current buggy one running as it is. Just hoping for a better alternative.
I have found one case which reproduces the issue. Unfortunately the
version I used is 4.3, not 4.2, and the test case needs a new option of
the pcp_promote_node command which does not exist in 4.2. Anyway, the
steps to reproduce the issue are as follows (using 3 PostgreSQL nodes):
(1) shutdown pgpool (node 0 is primary).
(2) set node 1 status to "down" in the pgpool_status file so that node 1
is regarded as down from pgpool's point of view (see the example below).
(3) start pgpool.
(4) pcp_promote_node -s 2
(5) shutdown node 1 PostgreSQL.
(6) Pgpool complains: invalid degenerate backend request, node id : 1 status: [2] is not valid for failover
(7) infinite health check failure on node 1.
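
For step (2), the pgpool_status file normally lists one status word per
backend, in node id order (assuming the usual format; please check your
own file). For the 3-node setup above, marking node 1 as down would look
like this:

    up
    down
    up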
I confirmed that private_backend_status for node 1 is "down", which
causes the issue, by attaching a debugger to the health check process for
node 1. So I think the correct fix for this is removing the call to
pool_initialize_private_backend_status() in the health check process
(see the sketch below). I will work on finding a way to reproduce the
issue on Pgpool-II 4.2.
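
To make that concrete, here is a minimal sketch (illustrative only, not
the actual patch) of what the health check process would do instead of
calling pool_initialize_private_backend_status(): keep my_backend_status[]
pointing at the shared memory slots, exactly as the pgpool main process
initializes them, so that VALID_BACKEND and BACKEND_INFO(i).backend_status
can never disagree.

    /*
     * Illustrative sketch, not the actual patch: in the health check
     * process, do not take a private copy of the backend status.
     * Point my_backend_status[] at the shared memory status, the same
     * way the pgpool main process does, so that VALID_BACKEND() and
     * BACKEND_INFO(i).backend_status always read the same value.
     */
    int     i;

    for (i = 0; i < MAX_NUM_BACKENDS; i++)
    {
        my_backend_status[i] = &(BACKEND_INFO(i).backend_status);
    }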
> Thank you
>
> Anirudh
> On 21 Feb 2022, 2:20 AM +0100, Tatsuo Ishii <ishii at sraoss.co.jp>, wrote:
>
> Hi Anirudh,
>
> Hello Tatsuo
>
> Thank you so much for the explanation. I understand the problem but, same as you, cannot understand the reason behind it. The same variable
> 'BACKEND_INFO(i).backend_status' cannot have 2 different values. Maybe the VALID_BACKEND macro is false due to other checks that it has in its
> if/else blocks. I am not sure. I will try to have a look.
>
> Maybe. It would be nice if you could attach a debugger to the health check
> process which emits the log:
>
> invalid degenerate backend request, node id : 2 status: [2] is not valid for failover
>
> then examine the variables which may be related to the issue. The ones I am especially interested in are these:
>
> (1)
> static POOL_SESSION_CONTEXT * session_context = NULL;
>
> The variable is expected to be NULL in the health check process. If not, we have trouble.
>
> (2)
> BACKEND_STATUS private_backend_status[MAX_NUM_BACKENDS];
>
> This is the backend status copied from the shared memory area.
> It is expected to be 1 or 2 (CON_CONNECT_WAIT or CON_UP).
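
For reference, those numbers correspond to the BACKEND_STATUS enum. The
following is a sketch based on pgpool's pool.h (verify against your
source tree):

    typedef enum
    {
        CON_UNUSED,         /* 0: unused slot */
        CON_CONNECT_WAIT,   /* 1: waiting for connection to start */
        CON_UP,             /* 2: up and running */
        CON_DOWN            /* 3: down, disconnected */
    } BACKEND_STATUS;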
>
> BTW, I need to correct my explanation.
>
> my_backend_status[backend_id]. The local variable is initialized at
> the pgpool main process (src/main/pgpool_main.c) and inherited via
> fork() to the health check process.
>
>     /*
>      * Initialize backend status area. From now on, VALID_BACKEND macro can be
>      * used. (get_next_main_node() uses VALID_BACKEND)
>      */
>     for (i = 0; i < MAX_NUM_BACKENDS; i++)
>     {
>         my_backend_status[i] = &(BACKEND_INFO(i).backend_status);
>     }
>
> As you can see, my_backend_status actually stores a pointer to
> BACKEND_INFO(i).backend_status. So *(my_backend_status[backend_id])
> should be the same as BACKEND_INFO(backend_id).backend_status.
>
> Actually, my_backend_status is not initialized like this in the health check process; instead it calls this:
> /* Initialize my backend status */
> pool_initialize_private_backend_status();
>
> It copies the backend status in the shared memory area to process-local
> memory. This is necessary for the pgpool child processes which deal
> with client sessions, so that they are not disturbed by status
> changes in the shared memory area in the middle of a session. BUT this
> is not necessary for the health check process. I am going to fix this.
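
Roughly, what pool_initialize_private_backend_status() does is the
following (a paraphrased sketch, not the exact source). After this runs,
VALID_BACKEND in that process reads the private copy and no longer sees
later changes in the shared memory area:

    for (i = 0; i < MAX_NUM_BACKENDS; i++)
    {
        /* copy the shared memory status into a process-local array ... */
        private_backend_status[i] = BACKEND_INFO(i).backend_status;
        /* ... and repoint my_backend_status[] at that private copy */
        my_backend_status[i] = &private_backend_status[i];
    }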
>
> However, I don't think this alone can explain the issue. As long as there's
> no difference between the local copy and the shared memory over the
> process lifetime, the bug does not do anything bad, and I cannot think of
> any scenario which could cause a difference between the two.
>
> As for the reproducibility, you are right, in the past few months of development and testing, I have only seen this issue a couple of times. Earlier, I
> thought it was something that I missed in the config but now that my setup is almost production ready, I opened this ticket for help. It is really
> difficult to reproduce but I currently have a setup running where this problem is persistent. I have tried turning that replica off and on several
> times and it connects back to the cluster when it's up but I see the same error when it's down.
>
> So, if you prefer I can help you investigate more on this by sharing some detailed logs and config options from my setup.
>
> Even though this issue comes up rarely, it is important for me to resolve because my setup uses synchronous replication and if both of my
> replicas are down, I convert the replication mode from sync to async in the failover script. But if this bug shows up, then the failover script is not
> executed when the last replica goes down and my queries hang forever because they do not have a replica committing transactions in synchronous
> mode.
>
> So, if I can help you debug this by providing any logs or config or a direct access to my setup over a Zoom call, that’d be really great for me. Please
> let me know if this works for you.
>
> Thanks. Is it possible for you to apply the attached patches so that
> pgpool produces additional logging? (This will emit a log line whenever failover
> happens.) The log basically emits the information which could otherwise be obtained with the
> debugger explained above.
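
The attached patches are not reproduced in this quote, but to illustrate
the kind of logging meant here (hypothetical code, not the actual patch),
an extra ereport() in degenerate_backend_set_ex() could dump both views
of the status plus the session context pointer:

    /* hypothetical debug logging, for illustration only */
    ereport(LOG,
            (errmsg("failover request for node %d: shared status %d, private status %d, session context %p",
                    node_id_set[i],
                    BACKEND_INFO(node_id_set[i]).backend_status,
                    *(my_backend_status[node_id_set[i]]),
                    (void *) pool_get_session_context(true))));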
>
> Thank you
>
> Anirudh
> On 19 Feb 2022, 2:33 AM +0100, Tatsuo Ishii <ishii at sraoss.co.jp>, wrote:
>
> Hello
>
> I had the following setup:
> 1 pgpool: 4.2.2
> 3 postgres nodes (all 12.5) - 1 primary, 2 replicas
>
> I turned off both replicas one after the other, a couple of seconds apart. Failover was performed for one of the replicas but not for
> the other. For the second replica the health check keeps failing, retrying in an infinite loop. When
> pgpool attempts the failover of the second replica, I see this error in the logs:
>
> invalid degenerate backend request, node id : 2 status: [2] is not valid for failover
>
> As far as I understand, status [2] is up, so the failover should be performed. Here's a snippet of the logs:
>
> You are right. The problem is that two different internal APIs, which both read the status, do not agree.
> The error message comes from here:
> src/main/pool_internal_comms.c: degenerate_backend_set_ex()
>
>     for (i = 0; i < count; i++)
>     {
>         if (node_id_set[i] < 0 || node_id_set[i] >= MAX_NUM_BACKENDS ||
>             (!VALID_BACKEND(node_id_set[i]) && BACKEND_INFO(node_id_set[i]).quarantine == false))
>         {
>             if (node_id_set[i] < 0 || node_id_set[i] >= MAX_NUM_BACKENDS)
>                 ereport(elevel,
>                         (errmsg("invalid degenerate backend request, node id: %d is out of range. node id must be between [0 and %d]"
>                                 ,node_id_set[i], MAX_NUM_BACKENDS)));
>             else
>                 ereport(elevel,
>                         (errmsg("invalid degenerate backend request, node id : %d status: [%d] is not valid for failover"
>                                 ,node_id_set[i], BACKEND_INFO(node_id_set[i]).backend_status)));
>
> I think VALID_BACKEND(node_id_set[i]) returned false
> here. VALID_BACKEND returns false if the backend status in the shared
> memory area is neither 1 (waiting for connection) nor 2 (up and
> running). However, BACKEND_INFO(node_id_set[i]).backend_status says the
> backend_status is actually 2 (up and running). The strange thing is,
> both VALID_BACKEND and BACKEND_INFO(node_id_set[i]).backend_status
> look into the same shared memory area. Let me explain.
>
> VALID_BACKEND is a macro which expands to:
>
> #define VALID_BACKEND(backend_id) \
>     ((RAW_MODE && (backend_id) == REAL_MAIN_NODE_ID) || \
>      (pool_is_node_to_be_sent_in_current_query((backend_id)) && \
>       ((*(my_backend_status[(backend_id)]) == CON_UP) || \
>        (*(my_backend_status[(backend_id)]) == CON_CONNECT_WAIT))))
>
> Since you are running pgpool in a mode other than RAW_MODE, the macro first
> checks whether pool_is_node_to_be_sent_in_current_query(backend_id)
> returns true. Here is the relevant portion of the code (defined in
> src/context/pool_query_context.c).
>
> bool
> pool_is_node_to_be_sent_in_current_query(int node_id)
> {
>     POOL_SESSION_CONTEXT *sc;
>
>     if (RAW_MODE)
>         return node_id == REAL_MAIN_NODE_ID;
>
> As I said earlier, RAW_MODE returns false. So it executes the next lines:
>
>     sc = pool_get_session_context(true);
>     if (!sc)
>         return true;
>
> pool_get_session_context() returns NULL and this function returns
> true, because the health check process does not use a "session context". The
> session context only exists in the pgpool child processes which deal with
> user connections.
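
For completeness, pool_get_session_context() is roughly the following (a
paraphrased sketch; see src/context/pool_session_context.c for the real
code). It just returns the process-local session_context pointer, which
is never set in the health check process:

    POOL_SESSION_CONTEXT *
    pool_get_session_context(bool noerror)
    {
        /*
         * session_context is a process-local static; it stays NULL in
         * processes that never create a session, such as health check.
         */
        if (!session_context && !noerror)
            ereport(FATAL,
                    (errmsg("unable to get session context")));

        return session_context;
    }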
>
> Then the macro checks the local variable
> my_backend_status[backend_id]. The local variable is initialized at
> the pgpool main process (src/main/pgpool_main.c) and inherited via
> fork() to the health check process.
>
>     /*
>      * Initialize backend status area. From now on, VALID_BACKEND macro can be
>      * used. (get_next_main_node() uses VALID_BACKEND)
>      */
>     for (i = 0; i < MAX_NUM_BACKENDS; i++)
>     {
>         my_backend_status[i] = &(BACKEND_INFO(i).backend_status);
>     }
>
> As you can see, my_backend_status actually stores a pointer to
> BACKEND_INFO(i).backend_status. So *(my_backend_status[backend_id])
> should be the same as BACKEND_INFO(backend_id).backend_status.
>
> However, as I said earlier, the two seem to disagree. At this
> point I can't think of any explanation for this.
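
One way to see the disagreement concretely (purely illustrative, not code
from the source tree) is to check the invariant right where the error is
raised, either from a debugger or with a temporary assertion:

    /* illustrative only: both views should match as long as
     * my_backend_status[] still points into shared memory */
    Assert(*(my_backend_status[node_id_set[i]]) ==
           BACKEND_INFO(node_id_set[i]).backend_status);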
>
> How often do you see this problem? Is it reliably reproducible?
> I have not been able to reproduce it so far.
>
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp
>