Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Service index does not reflect node updates #5450

Closed
banks opened this issue Mar 8, 2019 · 0 comments
Closed

Service index does not reflect node updates #5450

banks opened this issue Mar 8, 2019 · 0 comments
Labels
type/bug Feature does not function as expected
Milestone

Comments

@banks
Copy link
Member

banks commented Mar 8, 2019

In #3899 we added a new per-service index to improve the granularity of health watches.

In #5449 I use this further to improve the scalability of watching to rely on that index.

There is a subtle bug though with the current implementation that will become more apparent if we land #5449. It needs to be fixed either way.

Bug Summary

The per-service index is relied on to supply the X-Consul-Index for a health query, but it is not updated when a node registration changes so doesn't reflect that.

Reproduction

# Run a test agent
$ consul agent -dev

# In another terminal, register a service on a fake node via the catalog endpoint. An external registration isn't necessary to trigger the bug but it's easier to repro - this simulates another regular agent without fuss of setting one up and convincing it to update its config/node registration later.
$ cat node.json
{
  "Node": "other.node",
  "Address": "127.0.0.1",
  "Datacenter": "dc1",
  "NodeMeta": {
    "foo": "bar"
  },
  "Service": {
    "Service": "web",
    "Port": 1234
  }
}

$ curl -i -X PUT -d @node.json ' 'http://localhost:8500/v1/catalog/register'
HTTP/1.1 200 OK
Vary: Accept-Encoding
Date: Fri, 08 Mar 2019 16:46:39 GMT
Content-Length: 0

# Query health for that service, only dump headers for now, note the X-Consul-Index
$ curl -sSL -D - 'http://localhost:8500/v1/health/service/web' -o /dev/null
HTTP/1.1 200 OK
Content-Type: application/json
Vary: Accept-Encoding
X-Consul-Effective-Consistency: leader
X-Consul-Index: 27
X-Consul-Knownleader: true
X-Consul-Lastcontact: 0
Date: Fri, 08 Mar 2019 17:15:56 GMT
Content-Length: 833

# Update the NODE registration - we'll just tweak node meta for now. Service is unchanged.
$ cat node.json | jq '. | .NodeMeta.foo = "qux"' > node-modified.json

$ curl -i -X PUT -d @node-modified.json 'http://localhost:8500/v1/catalog/register'
HTTP/1.1 200 OK
Content-Type: application/json
Vary: Accept-Encoding
X-Consul-Effective-Consistency: leader
X-Consul-Index: 27
X-Consul-Knownleader: true
X-Consul-Lastcontact: 0
Date: Fri, 08 Mar 2019 17:18:21 GMT
Content-Length: 833

[
    {
        "Node": {
            "ID": "",
            "Node": "other.node",
            "Address": "127.0.0.1",
            "Datacenter": "dc1",
            "TaggedAddresses": null,
            "Meta": {
                "foo": "qux"
            },
            "CreateIndex": 27,
            "ModifyIndex": 37
        },
<snip>

Expected result

The X-Consul-Index of the last response should be at least 37 - the Node record is showing the updated Meta value so the response has changed yet the index hasn't.

Blocking clients that are between requests when the node meta update occurred would miss it.

This is because we don't update the service indexes for all services when a node registration changes.

If #5449 is landed then it will be even worse - currently blocked clients will no be unblocked on the node update either.

@banks banks added the type/bug Feature does not function as expected label Mar 8, 2019
@banks banks added this to the 1.4.4 milestone Mar 8, 2019
ShimmerGlass pushed a commit to ShimmerGlass/consul that referenced this issue Mar 9, 2019
Node updates were not updating the service indexes, which are used for
service related queries. This caused the X-Consul-Index to stay the same
after a node update as seen from a service query even though the node
data is returned in heath queries. If that happened in between queries
the client would miss this change.
We now update the indexes of the services on the node when it is
updated.

Fixes: hashicorp#5450
ShimmerGlass pushed a commit to ShimmerGlass/consul that referenced this issue Mar 10, 2019
Node updates were not updating the service indexes, which are used for
service related queries. This caused the X-Consul-Index to stay the same
after a node update as seen from a service query even though the node
data is returned in heath queries. If that happened in between queries
the client would miss this change.
We now update the indexes of the services on the node when it is
updated.

Fixes: hashicorp#5450
banks pushed a commit that referenced this issue Mar 11, 2019
Node updates were not updating the service indexes, which are used for
service related queries. This caused the X-Consul-Index to stay the same
after a node update as seen from a service query even though the node
data is returned in heath queries. If that happened in between queries
the client would miss this change.
We now update the indexes of the services on the node when it is
updated.

Fixes: #5450
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/bug Feature does not function as expected
Projects
None yet
Development

No branches or pull requests

1 participant