Update splaylimit during daemon run #9415

Closed
ImpBY opened this issue Jul 16, 2024 · 5 comments
Labels: bug (Something isn't working), triaged (Jira issue has been created for this)

Comments

ImpBY commented Jul 16, 2024

This issue was originally filed due to a regression after merging #9345 and released in 8.7.0/7.31.0. The change was reverted in #9415 and released in 8.8.1 and 7.32.1. Since this issue contains a possible fix for the regression, we're repurposing this ticket for the original issue described in PUP-11728.

Describe the Bug

The splay is recalculated even when the splay_limit has not changed.
https://puppetcommunity.slack.com/archives/C0W298S9G/p1721059192632569

Expected Behavior

Because the splay keeps being recalculated, the probability that the agent starts its first run increases over time.
Once 1/3 of the splay_limit has elapsed, that probability is almost 100%.
As a result, the agent performs its first run within the first third of the splay_limit: within 10 min for the default splay_limit of 30 min.
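
The effect described above can be reproduced with a small simulation. This is only a sketch, not Puppet's scheduler: it assumes the splay is drawn uniformly from 0..limit and is re-drawn every 15 seconds (an interval estimated from the log in this report). The method names `first_run_delay` and `share_within_third` are made up for illustration.

```ruby
# Sketch only: models the reported behaviour, not Puppet's actual scheduler.
# Assumptions: the splay is drawn uniformly from 0..limit, and it is re-drawn
# every `redraw_every` seconds until the first run fires.
def first_run_delay(limit:, redraw_every:, rng:)
  elapsed = 0
  loop do
    splay = rng.rand(limit + 1)
    # The job is ready once the elapsed time reaches the current splay. If that
    # happens before the next re-draw, the first run fires at that moment.
    return [splay, elapsed].max if splay < elapsed + redraw_every

    elapsed += redraw_every
  end
end

# Share of simulated agents whose first run fires within limit / 3 seconds.
def share_within_third(limit:, redraw_every:, trials: 10_000, seed: 42)
  rng = Random.new(seed)
  early = trials.times.count do |_|
    first_run_delay(limit: limit, redraw_every: redraw_every, rng: rng) <= limit / 3
  end
  early.fdiv(trials)
end

LIMIT = 1800 # 30 minutes, in seconds

# Re-draw roughly every 15 s (estimated from the log, not a documented value).
share_with_redraw = share_within_third(limit: LIMIT, redraw_every: 15)
# A re-draw interval longer than the limit means the splay is drawn only once.
share_single_draw = share_within_third(limit: LIMIT, redraw_every: LIMIT + 1)

puts format("re-drawn every 15 s: %.1f%% of first runs within limit/3", share_with_redraw * 100)
puts format("drawn once:          %.1f%% of first runs within limit/3", share_single_draw * 100)
```

With a single draw, about one third of agents should start within the first third of the limit. With repeated re-draws, every check is a fresh chance to land on a small splay, so nearly all agents start early.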

Steps to Reproduce

Steps to reproduce the behavior:

  1. Add logging code to lib/puppet/scheduler/splay_job.rb
     def ready?(time)
+      # Debug only: append the current splay on every readiness check.
+      File.open("/tmp/splay", "a+") do |f|
+        f.write("splay: #{@splay}\n")
+      end
      if last_run
        super
      else
        start_time + splay <= time
      end
    end
  2. Restart the agent and watch the splay value change in real time
# systemctl restart puppet
# tail -f /tmp/splay | awk -e '{ print strftime("%Y-%m-%d_%H:%M:%S",systime()) "\t" $0}'
2024-07-16_10:48:40	splay: 532
2024-07-16_10:48:44	splay: 532
2024-07-16_10:48:44	splay: 532
2024-07-16_10:48:49	splay: 532
2024-07-16_10:48:49	splay: 532
2024-07-16_10:48:53	splay: 1065
2024-07-16_10:48:53	splay: 1065
2024-07-16_10:48:54	splay: 1065
2024-07-16_10:48:54	splay: 1065
2024-07-16_10:48:59	splay: 1065
2024-07-16_10:48:59	splay: 1065
2024-07-16_10:49:04	splay: 1065
2024-07-16_10:49:04	splay: 1065
2024-07-16_10:49:08	splay: 847
2024-07-16_10:49:08	splay: 847
2024-07-16_10:49:09	splay: 847
2024-07-16_10:49:09	splay: 847
^C

Environment

  • Version 7.31.0
  • Platform Oracle Linux Server 9.4 (5.15.0-207.156.6.el9uek.x86_64)

Additional Context

suggested patch:

diff --git lib/puppet/scheduler/splay_job.rb lib/puppet/scheduler/splay_job.rb
index b44e08bad6..d2a5643324 100644
--- lib/puppet/scheduler/splay_job.rb
+++ lib/puppet/scheduler/splay_job.rb
@@ -1,6 +1,7 @@
 module Puppet::Scheduler
   class SplayJob < Job
     attr_reader :splay
+    attr_reader :splay_limit_previous

     def initialize(run_interval, splay_limit, &block)
       @splay = calculate_splay(splay_limit)
@@ -29,7 +34,10 @@ module Puppet::Scheduler
     # @return @splay [Integer] a random integer less than or equal to the splay limit that represents the seconds to
     # delay before next agent run.
     def splay_limit=(splay_limit)
-      @splay = calculate_splay(splay_limit)
+      if @splay_limit_previous != splay_limit
+        @splay_limit_previous = splay_limit
+        @splay = calculate_splay(splay_limit)
+      end
     end
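
The idea behind the suggested patch can be tried outside Puppet with a small stand-in class. This is a sketch under assumptions: `SplayHolder` is a hypothetical name rather than the real `Puppet::Scheduler::SplayJob`, the splay is assumed to be drawn as `rand(limit + 1)`, and recording the limit in the constructor is a choice made for this sketch.

```ruby
# Hypothetical stand-in for the job class; it is not Puppet's implementation.
class SplayHolder
  attr_reader :splay

  def initialize(splay_limit)
    # Remember the limit so that re-assigning the same value is a no-op.
    @splay_limit = splay_limit
    @splay = calculate_splay(splay_limit)
  end

  # Re-draw the splay only when the limit actually changes.
  def splay_limit=(splay_limit)
    return if splay_limit == @splay_limit

    @splay_limit = splay_limit
    @splay = calculate_splay(splay_limit)
  end

  private

  # Assumption: a uniform random integer in 0..limit.
  def calculate_splay(limit)
    rand(limit + 1)
  end
end

job = SplayHolder.new(1800)
before = job.splay
100.times { job.splay_limit = 1800 } # same limit: splay must stay put
puts job.splay == before             # prints "true"

job.splay_limit = 600                # changed limit: splay is re-drawn
puts job.splay <= 600                # prints "true"
```

Comparing against the stored limit keeps the splay stable across repeated assignments of an unchanged setting, while a real change to the limit still takes effect.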

ImpBY added the bug (Something isn't working) label Jul 16, 2024

ImpBY commented Jul 16, 2024

[image: JRuby consumption before and after the patch]

~1200 agents

server params

# (optional) maximum number of JRuby instances to allow
max-active-instances: 32
max-queued-requests: 64
max-retry-delay: 120 ## seconds before retry

agents params

splay = true
splaylimit = 30m
runinterval = 30m

sharewax commented Jul 16, 2024

This issue was introduced by pull request #9345.

@sharewax

[image: how it looks before and after the patch, and after a downgrade]

joshcooper changed the title from "@splay is recalculated even if the splay_limit has not changed" to "(PUP-11728) Update splaylimit during daemon run" Jul 25, 2024
joshcooper changed the title from "(PUP-11728) Update splaylimit during daemon run" to "Update splaylimit during daemon run" Jul 25, 2024
joshcooper added the triaged (Jira issue has been created for this) label Jul 25, 2024
Migrated issue to PUP-12061

@joshcooper
Contributor

Fixed in #9484 and will be released in 8.10.0
