'Service does not exist' should not be fatal #769

Closed
nicholascapo opened this issue Mar 9, 2015 · 12 comments
Labels
type/bug Feature does not function as expected

Comments

@nicholascapo

Occasionally we have a server that crashes or is rebooted uncleanly. On restart, Consul dies with an error:

==> Starting Consul agent...
==> Error starting agent: ServiceID "foo" does not exist

This requires manual intervention to remove the service from the consul data-dir and restart the services.

This error should not be fatal, and should simply result in the service not being available in the catalog.

An additional note: in this case the services are registered using the HTTP API, not the config files.
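
For readers unfamiliar with the distinction, here is a minimal sketch of what registering a service through the agent's HTTP API (rather than a config file) can look like. It only builds the request for the `/v1/agent/service/register` endpoint and does not send it. The agent address is the default `127.0.0.1:8500`, the service ID `foo` is taken from the error above, and the port `8080` and the helper name `build_register_request` are assumptions made for illustration.

```python
import json
import urllib.request


def build_register_request(service_id, name, port, agent="http://127.0.0.1:8500"):
    """Build (but do not send) a PUT request that registers a service with an agent."""
    payload = {"ID": service_id, "Name": name, "Port": port}
    return urllib.request.Request(
        url=f"{agent}/v1/agent/service/register",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical service: ID/name "foo" on port 8080.
req = build_register_request("foo", "foo", 8080)

# Sending it requires a running agent, so it is left commented out:
# urllib.request.urlopen(req)
```

Services registered this way are persisted by the agent in its data-dir, which is why they matter again at restart.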

@cetex

cetex commented Mar 11, 2015

Same here. It happens often when rebooting the cluster (which I'm doing for testing).
I rebooted the cluster and got this error:
root@s2:~# consul agent -server $CONSUL_BOOTSTRAP -log-level=debug -bind=:: -advertise=10.255.255.$CONSULID -data-dir $CONSUL_STORAGE -ui-dir /usr/local/share/consul-web/dist/ -retry-join=10.255.255.1 -retry-join=10.255.255.2 -retry-join=10.255.255.3 -retry-join=10.255.255.4 -retry-join=10.255.255.5 -client=::
==> WARNING: It is highly recommended to set GOMAXPROCS higher than 1
==> Starting Consul agent...
==> Error starting agent: ServiceID "aurora" does not exist

Current status: can't start any Consul agent, so the datacenter is currently down. The Consul data-dir most likely needs to be erased and the KV store recreated from scratch.

armon added the type/bug (Feature does not function as expected) label Mar 11, 2015
@armon
Member

armon commented Mar 11, 2015

Thanks for the report, tagging as a bug!

@cetex

cetex commented Mar 11, 2015

Great. How tricky is this to fix?
It's currently one of the (very few) things holding us back from deploying the rest of the DC on Consul.

@ryanuber
Member

I don't think this one will be too hard to fix. I'll take a look at it soon, although you'd have to build from master to get the fix since we have no release dates planned yet.

@cetex

cetex commented Mar 11, 2015

Sure, I can build it and distribute it from my own repo, as long as there's something that works. :)

@ryanuber
Member

Fixed by 04a2fae. We now warn and purge the erroneous check. Thanks!
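
As a rough illustration of the "warn and purge" behavior described here, the sketch below models it in Python. This is not the code from commit 04a2fae (Consul itself is written in Go); the function name `restore_checks` and the dictionary field names are assumptions chosen for the example.

```python
import logging

logger = logging.getLogger("agent")


def restore_checks(persisted_checks, known_service_ids):
    """Split persisted checks into (kept, purged).

    A check whose service_id names a service that no longer exists is
    logged as a warning and purged instead of aborting startup.
    """
    kept, purged = [], []
    for check in persisted_checks:
        service_id = check.get("service_id")
        # An empty service_id is treated as a node-level check and kept.
        if service_id and service_id not in known_service_ids:
            logger.warning(
                "purging check %r: ServiceID %r does not exist",
                check["id"],
                service_id,
            )
            purged.append(check)
            continue
        kept.append(check)
    return kept, purged


# Hypothetical persisted state after an unclean reboot: service "foo" is gone.
checks = [
    {"id": "service:foo", "service_id": "foo"},
    {"id": "mem", "service_id": ""},
    {"id": "service:web", "service_id": "web"},
]
kept, purged = restore_checks(checks, {"web"})
```

In this model the agent keeps starting with the `mem` and `service:web` checks, and only the check tied to the missing `foo` service is dropped.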

@nicholascapo
Author

Sweet, thanks Ryan

On Wed, Mar 11, 2015, 18:27 Ryan Uber notifications@github.com wrote:

Closed #769.

@cetex

cetex commented Mar 12, 2015

Awesome, way faster than I thought possible ;)

@tonglil

tonglil commented Mar 2, 2016

Mm, I think this is still happening?

==> Starting Consul agent...
==> Error starting agent: Failed to register check 'CPU Info Check': ServiceID "key-value store" does not exist &{cpu CPU Info Check This check reports the stats of the CPU, and does not check for unhealthy state. key-value store   {/etc/consul.d/monitor cpu   10m0s   0 0  This check reports the stats of the CPU, and does not check for unhealthy state.}}

I defined the service in a check.

{
  "check": {
    "id": "cpu",
    "name": "CPU Info Check",
    "script": "/etc/consul.d/monitor cpu",
    "interval": "10m",
    "service_id": "key-value store",
    "notes": "This check reports the stats of the CPU, and does not check for unhealthy state."
  }
}

Running

Consul v0.6.3
Consul Protocol: 3 (Understands back to: 1)

@ryanuber
Member

ryanuber commented Mar 2, 2016

Hi @tonglil,

Is this check inside of a config file? The fix only purges health checks that were submitted through the REST API and whose services no longer exist when Consul is restarted. If the check is defined in a configuration file, you have to remove the check or the config file that contains it.

Hope that helps.
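
Since checks defined in config files still fail hard when their service is missing, one option is to look for them before starting the agent. The sketch below is a hypothetical pre-flight helper, not part of Consul: it takes the JSON text of each config file and lists checks whose `service_id` matches no service defined in those files. It assumes the `service`/`services` and `check`/`checks` config keys, and that an ID falls back to the name when omitted.

```python
import json


def find_orphan_checks(config_texts):
    """Return (check_id, service_id) pairs for checks that reference
    a service not defined in any of the given config documents."""
    service_ids = set()
    checks = []
    for text in config_texts:
        doc = json.loads(text)
        # Collect both the singular and plural forms of each definition.
        services = list(doc.get("services", []))
        if "service" in doc:
            services.append(doc["service"])
        for svc in services:
            service_ids.add(svc.get("id") or svc["name"])
        checks.extend(doc.get("checks", []))
        if "check" in doc:
            checks.append(doc["check"])
    return [
        (chk.get("id") or chk["name"], chk["service_id"])
        for chk in checks
        if chk.get("service_id") and chk["service_id"] not in service_ids
    ]


# The check from this thread, trimmed to the relevant fields.
check_cfg = '{"check": {"id": "cpu", "name": "CPU Info Check", "service_id": "key-value store"}}'
# A hypothetical matching service definition (the port is made up).
service_cfg = '{"service": {"name": "key-value store", "port": 8500}}'

orphans_without_service = find_orphan_checks([check_cfg])
orphans_with_service = find_orphan_checks([check_cfg, service_cfg])
```

With only the check file, the helper reports `("cpu", "key-value store")`; once a service with that name is also defined, nothing is reported.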

@tonglil

tonglil commented Mar 3, 2016

Ah gotcha, thanks for the clarification @ryanuber.

Do you know what the reasoning is for making it fatal for config files?

@ryanuber
Member

ryanuber commented Mar 3, 2016

No problem. I think it was mainly just the most logical course of action when a check tries to register against an unknown service. Otherwise you would get a running Consul agent without the check or the service you specified, which seems rather unexpected. Let me know if I missed anything!

duckhan pushed a commit to duckhan/consul that referenced this issue Oct 24, 2021
AKS seems to have a bug with 1.19+ and hostPorts: Azure/AKS#2070