too many pending timers #40

Closed
dotSlashLu opened this issue Apr 1, 2020 · 6 comments · Fixed by #57

Comments

@dotSlashLu

dotSlashLu commented Apr 1, 2020

Hi, I'm using the master branch and encountered this error:

...
2020/04/01 16:24:46 [error] 6083#0: *417577 [lua] healthcheck.lua:18: add_target(): failed to add target: too many pending timers, context: init_worker_by_lua*
2020/04/01 16:24:46 [error] 6083#0: *417577 [lua] healthcheck.lua:18: add_target(): failed to add target: too many pending timers, context: init_worker_by_lua*
2020/04/01 16:24:46 [error] 6083#0: *417577 [lua] healthcheck.lua:18: add_target(): failed to add target: too many pending timers, context: init_worker_by_lua*
2020/04/01 16:24:46 [error] 6083#0: *417577 [lua] healthcheck.lua:18: add_target(): failed to add target: too many pending timers, context: init_worker_by_lua*
2020/04/01 16:24:46 [error] 6083#0: *417577 [lua] healthcheck.lua:18: add_target(): failed to add target: too many pending timers, context: init_worker_by_lua*
2020/04/01 16:24:46 [error] 6083#0: *417577 [lua] healthcheck.lua:18: add_target(): failed to add target: too many pending timers, context: init_worker_by_lua*
2020/04/01 16:24:46 [error] 6083#0: *417577 [lua] healthcheck.lua:18: add_target(): failed to add target: too many pending timers, context: init_worker_by_lua*
2020/04/01 16:24:46 [error] 6083#0: *417577 [lua] healthcheck.lua:18: add_target(): failed to add target: too many pending timers, context: init_worker_by_lua*
2020/04/01 16:24:46 [error] 6083#0: *417577 [lua] healthcheck.lua:18: add_target(): failed to add target: too many pending timers, context: init_worker_by_lua*
2020/04/01 16:24:48 [alert] 6083#0: 256 lua_max_running_timers are not enough
2020/04/01 16:24:48 [alert] 6083#0: 256 lua_max_running_timers are not enough
2020/04/01 16:24:48 [alert] 6083#0: 256 lua_max_running_timers are not enough
2020/04/01 16:24:48 [alert] 6083#0: 256 lua_max_running_timers are not enough
2020/04/01 16:24:48 [alert] 6083#0: 256 lua_max_running_timers are not enough
2020/04/01 16:24:48 [alert] 6083#0: 256 lua_max_running_timers are not enough
2020/04/01 16:24:48 [alert] 6083#0: 256 lua_max_running_timers are not enough
2020/04/01 16:24:48 [alert] 6083#0: 256 lua_max_running_timers are not enough
...

Is there a limit for how many targets I can add? Is it possible to add more than 2000 upstream servers?

This seems to be related to locking_target_list, which add_target goes through and which creates a timer on every call:

local _, terr = ngx.timer.at(0, run_fn_locked_target_list, self, fn)

@Tieske
Member

Tieske commented Apr 1, 2020

Typically, when those zero-delay timers are created from a single thread, they do not get executed because the thread needs to yield first. In that case, after every 100 servers added, call ngx.sleep(0) (a zero-wait sleep), which allows the timers to run and be freed.

Are you creating the full 2000 entries from a single thread without yielding? Your timer limit is rather low with 256.
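The batching suggested above could look roughly like the sketch below. It assumes `checker` is a lua-resty-healthcheck checker whose `add_target(ip, port)` returns `true` or `nil, err`, and that `targets` is an array of `{ ip = ..., port = ... }` tables; the function name, `BATCH_SIZE`, and the injectable `yield` argument are hypothetical and exist only for illustration.

```lua
-- Sketch only: `checker` is assumed to be a lua-resty-healthcheck checker
-- object, and `targets` a hypothetical array of { ip = ..., port = ... }.
local BATCH_SIZE = 100  -- yield after this many add_target calls

local function add_targets_in_batches(checker, targets, yield)
  -- `yield` defaults to a zero-second ngx.sleep, which hands control back to
  -- the scheduler so that pending zero-delay timers can run and be freed.
  yield = yield or function() ngx.sleep(0) end

  local failed = 0
  for i, target in ipairs(targets) do
    local ok, err = checker:add_target(target.ip, target.port)
    if not ok then
      failed = failed + 1
      if ngx then ngx.log(ngx.ERR, "failed to add target: ", err) end
    end
    if i % BATCH_SIZE == 0 then
      yield()
    end
  end
  return failed
end

-- ngx.sleep is not available in init_worker_by_lua*, so from that phase the
-- loop would have to run inside a timer of its own, for example:
--   ngx.timer.at(0, function(premature)
--     if not premature then add_targets_in_batches(checker, targets) end
--   end)
```

With 2000 targets and a batch size of 100, this never leaves more than about 100 timers pending at once, provided each add_target call creates one timer.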

@Tieske
Member

Tieske commented Apr 1, 2020

Implementing a different async method could prevent it from happening, something like:

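local run_async do
  local list = {}

  -- runs every queued callback from within a single timer
  local function handler(premature)
    local exec_list = list
    list = {}

    for _, tmr in ipairs(exec_list) do
      local ok, err = pcall(tmr[1], premature, unpack(tmr, 2, tmr.n))
      if not ok then
        ngx.log(ngx.ERR, "timer failure: ", err)
      end
    end
  end

  -- queues a callback (first argument) plus its arguments;
  -- only a call that finds the list empty creates a timer
  function run_async(...)
    local l = #list
    list[l + 1] = { n = select("#", ...), ... }
    if l == 0 then
      local ok, err = ngx.timer.at(0, handler)
      if not ok then
        list[l + 1] = nil  -- unqueue, so a later call can retry
        return ok, err
      end
    end
    return true
  end
end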

(top of head, untested)

@dotSlashLu
Author

Typically, when those zero-delay timers are created from a single thread, they do not get executed because the thread needs to yield first. In that case, after every 100 servers added, call ngx.sleep(0) (a zero-wait sleep), which allows the timers to run and be freed.

Are you creating the full 2000 entries from a single thread without yielding? Your timer limit is rather low with 256.

Actually, the original implementation used sleep, but the sleep API is not available in the init phase. The switch to timers was introduced in 14b894a#diff-224173b5a7345f5a7a2eb86000f00616

Thanks for your suggestion to go async, I'll try it in my application. Maybe it's better to change this library to use async?

@Tieske
Member

Tieske commented Apr 2, 2020

@dotSlashLu So you are hitting the "not enough timers" in the init phase? (and for the record; the idea would be to implement that async code in the library 😄 , not in user code)

@hishamhm @locao the above code would probably resolve the error, but I'm not too happy with the overall overhead. Any other ideas for reducing this timer issue?

@hishamhm
Contributor

hishamhm commented Apr 7, 2020

@hishamhm @locao the above code would probably resolve the error, but I'm not too happy with the overall overhead. Any other ideas for reducing this timer issue?

@Tieske It's been forever since I last looked at this but... doesn't the suggestion above launch more timers?

@Tieske
Member

Tieske commented Apr 7, 2020

The idea would be that wherever the code calls timer.at, it would instead call run_async. That function collects all callbacks and runs them in a single ngx timer. Hence, if the lib is called repeatedly without yielding, it will only use 1 timer.
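To illustrate the single-timer claim, here is a minimal sketch that can run outside OpenResty. It is an assumption-laden variant of the snippet earlier in the thread, not the library's code: `new_run_async`, `timer_at`, and `fake_timer_at` are hypothetical names, and the stub only records handlers instead of scheduling them. Inside nginx one would pass `ngx.timer.at` as `timer_at`.

```lua
-- Sketch only: `timer_at` stands in for ngx.timer.at so the batching can be
-- shown outside OpenResty; in nginx you would pass ngx.timer.at itself.
local unpack = table.unpack or unpack  -- Lua 5.2+ vs. Lua 5.1/LuaJIT

local function new_run_async(timer_at)
  local list = {}

  local function handler(premature)
    -- swap the list first, so callbacks queued while running get a new timer
    local exec_list = list
    list = {}
    for _, item in ipairs(exec_list) do
      local ok, err = pcall(item[1], premature, unpack(item, 2, item.n))
      if not ok then
        print("callback failed: " .. tostring(err))
      end
    end
  end

  return function(...)
    local count = #list
    list[count + 1] = { n = select("#", ...), ... }
    if count == 0 then
      -- only a call that finds the queue empty requests a timer
      local ok, err = timer_at(0, handler)
      if not ok then
        list[count + 1] = nil  -- drop the entry so a later call can retry
        return nil, err
      end
    end
    return true
  end
end

-- Stub timer: records the requested handlers instead of running them.
local scheduled = {}
local function fake_timer_at(_, fn)
  scheduled[#scheduled + 1] = fn
  return true
end

local run_async = new_run_async(fake_timer_at)
local results = {}
for i = 1, 2000 do
  run_async(function(premature, n) results[#results + 1] = n end, i)
end

-- All 2000 callbacks are queued behind the one handler in `scheduled`.
scheduled[1](false)  -- simulate nginx firing the timer (not premature)
```

The trade-off is that one slow or yielding callback delays every callback queued behind it, since they all share the same timer.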

3 participants