Nomad Service Discovery Feedback #12589

Closed
mikenomitch opened this issue Apr 15, 2022 · 21 comments

Comments

@mikenomitch
Contributor

mikenomitch commented Apr 15, 2022

With the launch of Nomad 1.3 Beta, we've added simple native service discovery to Nomad.

We're quite excited about the addition as is, especially for simple architectures and small workloads, but we plan to continue iterating on our current work. We won't replace Consul as a full service mesh, but we do want to make sure native service discovery can work for more users with real services in prod.

We've already identified some smaller improvements, such as advertising arbitrary addresses on Nomad services and adding template stanza helpers for load balancing and selecting single services, some mid-sized improvements like better readiness checks, and some larger-sized improvements like DNS support.

We would love to hear what else is most important to you though!

What would you want from Native Nomad Service Discovery? What would be the simplest UX and workflow that we could enable? What features or tweaks could we add to get you using this for real production workloads?

Please let us know!

@chenjpu

chenjpu commented Apr 16, 2022

Allow modifying a service's status.

That way the service keeps running but its state becomes DOWN, which is suitable for disabling the service.
DOWN: the service must not be used
UP: the service can be used again

@mikenomitch
Contributor Author

@chenjpu thanks for the feedback. If you don't mind sharing, what's your use case for this behavior?

And do you want to put the whole service into a "DOWN" state or just specific instances of the services?

@chenjpu

chenjpu commented Apr 19, 2022

I would want to specify which specific service goes down.

@manhtukhang

Hi @mikenomitch,
In our use case, we use Nomad SD to generate config for Nginx ingress. Each project is deployed to a separate namespace, and the Nginx ingress is in a different namespace. The problem is that Nginx cannot get info about services in other namespaces.
I think Nomad should allow users to define which namespaces/jobs are permitted to discover services in other namespaces.

@Himura2la

Himura2la commented May 22, 2022

Hi! Thanks for the great feature! I'm glad I can avoid raising a Consul cluster for small projects now. There's one inconvenience I found while trying to set up an nginx load balancer.
In Consul-enabled clusters I use Traefik, and I can set Traefik configs for a service using key-value pairs, like this:

job "foo" {
  group "foo" {
    service {
      tags = [
        "traefik.enable=true",
        "traefik.http.routers.foo.rule=Host(`foo.example.org`)",
        "traefik.http.routers.foo.entrypoints=http"
      ]
    }
  }
}

Since tags is an array, everyone invents their own syntax for turning it into a key-value mapping. Fabio has another syntax for this (urlprefix-foo.example.org).

Now that I have to configure nginx manually, it becomes challenging to parse key-value pairs from a plain array in Go templates. I found a workaround: I replaced keys with array indices, but it's not convenient:

service {
  provider = "nomad"
  tags     = [
    "foo.example.org"  # server_name
  ]
}

template {
  data = <<EOF
{{ range nomadServices }}
upstream {{ .Name | toLower }} {
  {{- range nomadService .Name }}
  server {{ .Address }}:{{ .Port }};{{- end }}
}
server {
  listen 80;
  server_name {{ index .Tags 0 }};
  location / {
    proxy_pass http://{{ .Name | toLower }};
  }
}
{{ end -}}
EOF
}

I wish service.tags could be a key-value mapping.
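
One possible interim workaround, sketched below, is to keep a key=value convention inside tags and match the key by prefix rather than by array position. The `server_name=` prefix and the destination path are hypothetical and invented for this example; Nomad itself does not define them. The sketch also assumes the `regexMatch` and `regexReplaceAll` template functions from consul-template are available in the Nomad version in use.

```hcl
service {
  provider = "nomad"
  tags = [
    # Hypothetical "key=value" convention; Nomad treats tags as opaque strings.
    "server_name=foo.example.org",
  ]
}

template {
  # Hypothetical output path for this sketch.
  destination = "local/nginx.conf"
  data = <<EOF
{{ range nomadServices }}
upstream {{ .Name | toLower }} {
  {{- range nomadService .Name }}
  server {{ .Address }}:{{ .Port }};
  {{- end }}
}
server {
  listen 80;
  {{- range .Tags }}
  {{- if . | regexMatch "^server_name=" }}
  server_name {{ . | regexReplaceAll "^server_name=" "" }};
  {{- end }}
  {{- end }}
  location / {
    proxy_pass http://{{ .Name | toLower }};
  }
}
{{ end -}}
EOF
}
```

With this approach the order of the tags no longer matters, and a service without the tag simply gets no server_name line.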

@manhtukhang

manhtukhang commented May 22, 2022

Hi @Himura2la,
The Nginx ingress pack in nomad-pack-community-registry already has a template that can use both meta and tags for the service ingress definition. Please check if this is what you are looking for: https://github.com/hashicorp/nomad-pack-community-registry/blob/e3d1270ecb75719016d702e09086c75afc0d9238/packs/nomad_ingress_nginx/templates/_nginx_conf.tpl#L28-L48

But it still has the issue described in my previous comment.

@kolotaev

It's a very useful feature, especially for simpler solutions that don't require a separate Consul service running. Simplicity and being lightweight (in terms of resource consumption) are the primary reasons for choosing Nomad (at least for me). So thank you!

I'm happy that there's even a plan for improving this feature with new functionality. What I'm missing right now is proper health-check functionality (like in Consul).

@mikenomitch
Contributor Author

mikenomitch commented Jun 15, 2022

@Himura2la thanks for the feedback - noted!

Also since you mentioned using Traefik with Consul, you should check out Traefik 2.8 which just shipped with Nomad support! https://traefik.io/blog/traefik-proxy-fully-integrates-with-hashicorp-nomad/

So you can use the same tags you used with the Consul integration.

@mikenomitch
Contributor Author

mikenomitch commented Jun 15, 2022

@kolotaev glad it's useful! Regarding health checks, stay tuned :). The plan right now is to add native checks in Nomad 1.4.

EDIT: Worth noting that these native health checks will be combined "liveness" and "readiness" checks, at least initially. So the check response will dictate both whether Nomad treats the allocation as alive and whether it is ready to be exposed via service discovery.
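
As a rough sketch of what such a combined check might look like in a job file, assuming native checks reuse the `check` block syntax already used for Consul-registered services. The service name, port label, path, and timings are hypothetical, and the syntax that eventually ships may differ.

```hcl
service {
  name     = "web"   # hypothetical service name
  port     = "http"  # hypothetical port label from the group's network block
  provider = "nomad"

  # Assumed syntax, modeled on the existing Consul check block.
  check {
    type     = "http"
    path     = "/health"  # hypothetical health endpoint
    interval = "10s"
    timeout  = "2s"
  }
}
```

Under the combined model described above, this single check result would feed both the liveness and the readiness decision.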

@waquidvp
Contributor

waquidvp commented Jun 15, 2022

Is it worth, by default, having the Nomad server as a service? This would be useful for putting Traefik in front of the Nomad UI using the native service discovery integration.

It would probably also require a config similar to consul.tags in order to be able to add tags.

@mikenomitch
Contributor Author

@waquidvp good idea - we'll keep that in mind!

@iSchluff

iSchluff commented Jul 7, 2022

I wish service.tags could be a key-value mapping.

Well, we have that already for Consul: it's called service meta. It would also be great to have it for Nomad service discovery, for passing special information to templates. Use cases are, e.g., custom Prometheus and Loki labels per service.

@mr-karan
Contributor

mr-karan commented Jul 8, 2022

As @manhtukhang noted, extending service discovery across namespaces is what I'd like to see as well. It's quite natural to group different components of an application into different namespaces, and often they need to talk to each other. Extending service discovery support to inter-namespace communication would be a useful thing to have.

@legege
Contributor

legege commented Aug 2, 2022

I second the comments from @mr-karan and @manhtukhang: it's quite unnatural to limit the nomadServices and nomadService functions to the current job's namespace.

@mikenomitch
Contributor Author

Noted on cross namespace requests. This is something that we'll explore on our end. Limiting requests within namespaces helps maintain isolation, which is nice, but I recognize that this assumes a certain way of structuring your applications that might not apply to everybody.

Will chat with the team about this and report back with thoughts.

@burdandrei burdandrei unpinned this issue Aug 8, 2022
@Dgotlieb
Contributor

Dgotlieb commented Aug 16, 2022

Hi,

Today we are using the connect stanza in our Nomad jobs with Consul service mesh.
In the past, we ran into Envoy connection limits, and to overcome them we had to change the Envoy code, recompile our sidecar Docker images, and use our own version of connect.sidecar_image.

I'm not familiar with the underlying components of the native service discovery, but I'm wondering what the connection limits are (if tested) and whether they can be configured.

I will just add that when we had our difficulties with the Envoy sidecar, we were able to monitor it via the sidecar logs. Where can the network and error logs or Prometheus metrics (if they exist) be observed when using the native service discovery?

Thanks!

@legege
Contributor

legege commented Aug 16, 2022

@Dgotlieb Nomad Service discovery doesn't support Connect/envoy. You'll need to stay with Consul to keep this more advanced functionality.

@Dgotlieb
Contributor

Thanks @legege
I know that Connect/envoy are not supported, but I'm still interested to know what the networking limitations of the native service discovery are and whether networking logs and metrics can be observed.

Thanks

@benbourner

Hi there, enjoying the new Nomad service discovery. I have it working flawlessly in one environment; aside from the cross-namespace referencing, it works well for all our current needs. However, in another (dev) environment I have set up, I cannot for the life of me get {{ range nomadService "myservice" }} working. It always comes back empty, even though I can see the correct service registration in the GUI, and have managed it before :) FWIW, I have the server running on one machine and the client running on another machine with a Docker setup. Everything deploys and works just fine, except this. Could this be the cause? I can't seem to find any way to diagnose why it could be failing. Can you perhaps point me in the right direction?
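
One thing that may be worth ruling out, given the namespace scoping discussed in this thread, is whether the template sees any registrations at all. Below is a minimal debugging sketch that dumps everything the template functions return. It assumes the job runs in the same namespace as the service it is looking up, and the destination path is hypothetical.

```hcl
template {
  # Hypothetical debug file; inspect it in the allocation's filesystem after rendering.
  destination = "local/services-debug.txt"
  data = <<EOF
{{- range nomadServices }}
service={{ .Name }} tags={{ .Tags }}
{{- range nomadService .Name }}
  instance={{ .Address }}:{{ .Port }}
{{- end }}
{{- end }}
EOF
}
```

If the rendered file is empty, the template is probably querying a namespace that holds no registrations, which would point to a namespace mismatch rather than a problem with the lookup itself.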

@jrasell
Member

jrasell commented Aug 18, 2022

Hi everyone and thanks for all the interesting feedback and discussion.

I have raised a number of linked issues to cover the feature requests I believe have been mentioned in various comments and will now close this issue. If I have missed anything, or you have feature requests or bug reports in the future, please open a new issue so we can track this.

If you have any questions, please use the discuss forum to raise and discuss these. I note that @benbourner and @Dgotlieb have outstanding questions which I believe require a little back and forth. If you could raise these on the discuss forum I will respond straight away so we can work through them and also allow other community members to gain visibility.

Thanks,
jrasell and the Nomad team <3

@jrasell jrasell closed this as completed Aug 18, 2022
@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 17, 2022