Consul takes forever to reload on Windows #3923

Closed
123BLiN opened this issue Feb 28, 2018 · 5 comments
Labels
theme/windows Anything related to Windows type/bug Feature does not function as expected

Comments

123BLiN commented Feb 28, 2018

Description of the Issue (and unexpected/desired result)

The consul reload command takes 15-20 minutes and uses a lot of CPU time once the agent has been running for 10-20 hours.
The issue is not reproducible right after a consul agent restart.
Number of services: ~500 per agent, plus 1-2 health checks for each.
The issue is not reproducible with consul 1.0.0.

Linux may or may not be affected. We don't have as many services there, but please let me know if I should try to reproduce it.

Reproduction steps

Run the consul reload command from cmd after the agent has been running for 10-20 hours.
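To put numbers on the reproduction step, a small timing harness can help. The sketch below is only an illustration, not something from the original report: `time_command` is a hypothetical helper, and it assumes the `consul` binary is on PATH, a local agent is running, and the `consul reload` command returns only once the reload has finished.

```python
import subprocess
import time


def time_command(cmd, runs=3, pause=0.0):
    """Run cmd `runs` times and return each run's wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=False: a failing run is still timed instead of aborting the loop.
        subprocess.run(
            cmd,
            check=False,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
        time.sleep(pause)
    return durations
```

For example, `time_command(["consul", "reload"], runs=5, pause=1.0)` would return five durations that can be compared right after a restart and again after the agent has been up for many hours.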

consul version for both Client and Server

Client: 1.0.6
Server: 1.0.6

Operating system and Environment details

Windows 2012R2

Nothing strange in the logs, but I will try to take a closer look later.

I'm ready to collect dumps, TRACE logs and so on to help identify the problem, but maybe someone knows which change since version 1.0.0 could cause this?
Unfortunately we have rolled the agents back to version 1.0.0, so I will need to set up a separate test environment to reproduce it.

Side note

consul reload for agent version 1.0.0 and ~500 services takes 2-3 sec
consul reload for agent version 1.0.6 and ~500 services takes 11 sec (5 min after restart)

Thanks,
Roman

123BLiN commented Jun 5, 2018

Some improvements were made in consul 1.0.7: it now lasts longer, about a week, before the issue occurs.
In case this helps:
Freshly restarted consul process:
screenshot_1788
Every reload eats some memory and takes a little more CPU time to complete. At this point a reload takes 2-3 sec.

Then after a week or so (about 1-5 reloads per day to update configuration):
screenshot_1789
Reload takes more than 10 min and we are forced to restart the process.

I'm going to test the latest 1.1.0 version, I have just noticed it is released 😄

123BLiN commented Jun 7, 2018

Unfortunately, the same behaviour occurs on consul 1.1.0.
Reloads:
Consul is running, no configuration has been applied since start, memory - 70 MB.

  1. Time - 15s, memory - 134 MB

  2. Time - 22s, memory - 134 MB
    screenshot_1791

  3. Time - 33s, memory - 134 MB

  4. Time - 42s, memory - 145 MB
    screenshot_1793

  5. Time - 53s, memory - 155 MB

  6. Time - 1m 4s, memory - 158 MB
    Then I gave it 5 min to cool down, and the memory dropped to 100 MB
    screenshot_1795

But then the next try:

  7. Time - 1m 13s, memory - 159 MB
    screenshot_1796
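The seven timings above can be turned into per-reload growth with a few lines of arithmetic. This is only a reading of the reported numbers, not a diagnosis; the variable names are made up for illustration, and 1m 4s and 1m 13s are converted to 64 and 73 seconds.

```python
# Reload durations in seconds, copied from the seven measurements above.
reload_seconds = [15, 22, 33, 42, 53, 64, 73]

# Increase in duration from one reload to the next.
deltas = [b - a for a, b in zip(reload_seconds, reload_seconds[1:])]
average_growth = sum(deltas) / len(deltas)

print(deltas)                    # [7, 11, 9, 11, 11, 9]
print(round(average_growth, 1))  # 9.7
```

Each reload is roughly 10 seconds slower than the one before it, and the seventh reload continues that trend even though memory had dropped to 100 MB during the 5 minute pause.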

We are forced to downgrade the consul agents to 1.0.0 on Windows again.
Please let me know if I can provide more details or logs to track down this issue.
It should be pretty easy to reproduce anywhere else: just create a consul cluster and one Windows node with an agent. The agent should have about 200-300 services registered, but the issue has also been seen with fewer than 100 services.
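One way to get a few hundred services onto a test agent is to generate service definition files and load them from the agent's configuration directory. The sketch below is an assumption about how such a setup could look, not how the services in this report were registered: the helper name, the service name `test-svc`, the port range and the TCP check are all hypothetical.

```python
import json
from pathlib import Path


def write_service_definitions(config_dir, count=300, base_port=20000):
    """Write `count` service definition files, each with one TCP health check."""
    out = Path(config_dir)
    out.mkdir(parents=True, exist_ok=True)
    for i in range(count):
        port = base_port + i  # hypothetical port, one per service
        definition = {
            "service": {
                "id": f"test-svc-{i}",
                "name": "test-svc",
                "port": port,
                "checks": [{"tcp": f"localhost:{port}", "interval": "10s"}],
            }
        }
        (out / f"test-svc-{i}.json").write_text(json.dumps(definition, indent=2))
    return out
```

Pointing the agent at that directory with `-config-dir` and then running consul reload repeatedly should exercise the same path, assuming the slowdown does not depend on how the services were registered.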

123BLiN changed the title from "Consul 1.0.6 takes forever to reload on windows" to "Consul takes forever to reload on windows" on Jun 7, 2018
pearkes added the theme/windows and type/bug labels on Jul 24, 2018

pearkes commented Jul 24, 2018

Thanks for your continued patience and explanations of what is happening here @123BLiN, we appreciate it. To confirm, you never see this at all in 1.0.0? It appears the different versions show slightly different timelines for the issue to materialize. Can you clarify that you have never seen it on 1.0.0 or earlier? Just want to firmly rule that out to help with investigating it.

123BLiN commented Jul 25, 2018

Confirmed - I was never able to reproduce it in 1.0.0.
I'm going to give 1.2.1 a try now. There are some interesting bugs fixed:
#4185

123BLiN commented Aug 28, 2018

Hello, sorry for the delay.
1.2.2 - not able to reproduce.
Thanks a lot!
Closing.

@123BLiN 123BLiN closed this as completed Aug 28, 2018