Allow secure communication between components #458

jpkrohling · 2017-10-06T10:12:16Z

Update 2019-09-20: replaced by #1718

Document and/or implement secure communication channels[1] between components, like:

Tracer -> Collector
Agent -> Collector: only insecure gRPC connection from agent to collector possible #1310
Collector -> data store (Cassandra, Elasticsearch)
Query -> data store (Cassandra, Elasticsearch)
User -> UI
UI -> Query
Tracer -> Agent

This is related to #404 .

[1] TLS for HTTP, but I'm not sure how it would work with Thrift

jpkrohling · 2017-10-06T10:14:43Z

Current state:

Jaeger uses other libraries to handle the communication with remote components, like storage access (Cassandra / Elasticsearch). Its HTTP endpoints are not encrypted, which might/should be solved by using a reverse proxy in front of the component to be protected. The tracer components are able to send data via HTTPS to a remote collector server. The Agent is not yet able to send data to the collector using a secure communication channel.

Dieterbe · 2017-11-03T15:48:27Z

FYI my use case:

we have various k8s clusters in various locations (datacenters) in the US (though in the future likely in other continents) and using various cloud providers
in each k8s cluster we have various services generating spans and complete traces (currently a single trace always comes from 1 location only, we don't have requests that span across multiple locations, though that may come later)
we want 1 central jaeger deployment in our 'ops' cluster, it has a multi-node cassandra cluster in that cluster, the cluster does not extend into the other locations across the internet. everything (cassandra etc) is contained within that one location
i honestly don't care much whether to run the collector centrally in the ops cluster, or run collectors in each location and then those collectors talk to the central ops cassandra over the internet (though that does sound a bit weird). given that collectors can be scaled linearly (at least that's what it looks like) it seems better to run them centrally, and then have agents in each location talk over the internet to the collectors (that's also what people seem to be recommending)
obviously we don't want people to be able to sniff our jaeger traffic, because it contains confidential information, and we don't want anyone to be able to send crap into our environment, so we need encryption+authentication.
this prompted me to enquire with my infra folks about a vpn/secure tunnel between the locations and the central ops cluster. they informed me of various limitations ("some of our cloud providers do not provide a layer2 network, so it is not possible to add custom routes", "in some locations the nodes/pods in each cluster cant connect out"), so they're asking instead for an application level solution such as https with auth.
i'm not sure if going from tchannel to http(s) is a big performance downgrade, a secure encrypted/authenticated tchannel would also work for me I suppose.
simple solutions are good solutions, maybe it's just a matter of terminating ssl and basic auth via a kubernetes ingress

yurishkuro · 2017-11-03T21:56:10Z

Agent to collector path is using tchannel for legacy reasons. I would much rather use grpc, which will have standard support for https.

rbtcollins · 2018-01-11T15:12:57Z

Hi @Dieterbe we had much the same use case with a bit of variation:

we're deliberately not including confidential info (PII, customer data) in our traces: we want the traces to be accessible to all the teams and not have to try to do masking on a per-span basis!

So what we've done is deploy cassandra and the query centrally, and then put an agent on every node via a daemonset (to avoid the per-pod overheads of sidecars), and a collector ha pair for the whole k8s cluster, then used TLS client certs to secure the collector -> cassandra traffic, and the user -> query traffic.

We had to improve some bits of Jaeger to permit this, but I think they have all been merged now, though I haven't fully verified the dependency job change in prod for us (soon though).

We aren't worried about sniffing of agent -> collector traffic w/in our k8s clusters, and the rest is secured (or localhost only).

yurishkuro · 2018-05-05T17:30:42Z

Cf #773 for gRPC work.

One question I have about using HTTPS is what's the accepted practice for certificates? Are we ok to use some internally generated certificate for the servers in the collectors? If someone has a link to a blog post discussing this it would be appreciated.

sneko · 2018-05-06T15:17:34Z

@yurishkuro I'm using the ES operator (https://github.com/upmc-enterprises/elasticsearch-operator) to manage ES clusters on Kubernetes. The operator can set up Kibana and Cerebro at the same time while enabling secured communication over HTTPS.

They are using an opaque secret to store different files related to certs:

Name:         es-certs-elasticsearch-cluster
Namespace:    logging
Labels:       <none>
Annotations:  <none>

Type:  Opaque

Data
====
kibana.pem:         1631 bytes
cerebro-key.pem:    1675 bytes
kibana-key.pem:     1679 bytes
cerebro.pem:        1631 bytes
node-key.pem:       1679 bytes
node-keystore.jks:  3506 bytes
node.pem:           1631 bytes
truststore.jks:     1032 bytes
ca-key.pem:         1679 bytes
ca.pem:             1367 bytes

Then they are mounting a volume at /elasticsearch/config/certs that Kibana/Cerebro can use.

I'm not sure you were expecting this kind of information, or if it's the best but that's a possible way to secure Jaeger <-> ESCluster 😃

jpkrohling · 2018-05-07T12:37:43Z

One question I have about using HTTPS is what's the accepted practice for certificates?

IMO, it's sufficient for us to just add a couple of configuration options:

what cert to offer to clients
what the key is to decrypt the content the client is sending
which CA cert to use when trusting server certs

Platforms like OpenShift and Kubernetes are able to generate certs on demand via an internal CA, as well as rotate the certs/keys based on certain rules. This is not the kind of knowledge we want within our code.
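
As a rough illustration of the three options listed above, the following Go sketch passes file paths straight to the standard `crypto/tls` package and leaves issuing and rotating certs to the platform. The `TLSOptions` type and the `tls.cert`, `tls.key` and `tls.ca` flag names are assumptions made for this example, not existing Jaeger options.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"flag"
	"fmt"
	"os"
)

// TLSOptions holds the three settings from the list above.
// The type and its field names are hypothetical.
type TLSOptions struct {
	CertFile string // certificate offered to peers
	KeyFile  string // private key matching CertFile
	CAFile   string // CA bundle used when trusting server certs
}

// Config hands the files to crypto/tls without interpreting them further.
func (o TLSOptions) Config() (*tls.Config, error) {
	cfg := &tls.Config{MinVersion: tls.VersionTLS12}
	if (o.CertFile == "") != (o.KeyFile == "") {
		return nil, errors.New("certificate and key must be set together")
	}
	if o.CertFile != "" {
		pair, err := tls.LoadX509KeyPair(o.CertFile, o.KeyFile)
		if err != nil {
			return nil, fmt.Errorf("loading key pair: %w", err)
		}
		cfg.Certificates = []tls.Certificate{pair}
	}
	if o.CAFile != "" {
		pem, err := os.ReadFile(o.CAFile)
		if err != nil {
			return nil, fmt.Errorf("reading CA file: %w", err)
		}
		pool := x509.NewCertPool()
		if !pool.AppendCertsFromPEM(pem) {
			return nil, errors.New("no certificates found in CA file")
		}
		cfg.RootCAs = pool
	}
	return cfg, nil
}

func main() {
	var o TLSOptions
	flag.StringVar(&o.CertFile, "tls.cert", "", "path to TLS certificate file")
	flag.StringVar(&o.KeyFile, "tls.key", "", "path to TLS key file")
	flag.StringVar(&o.CAFile, "tls.ca", "", "path to TLS CA file")
	flag.Parse()

	cfg, err := o.Config()
	if err != nil {
		fmt.Fprintln(os.Stderr, "invalid TLS options:", err)
		os.Exit(1)
	}
	fmt.Printf("certificates: %d, custom CA pool: %t\n",
		len(cfg.Certificates), cfg.RootCAs != nil)
}
```

When the files are rotated on disk by the platform, a sketch like this would need a restart or a reload hook to pick them up; that part is deliberately left out.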

rbtcollins · 2018-05-07T19:57:37Z

@yurishkuro The key behavioural decisions deployers will be making are:

what CA to issue client certs from
- often becomes a custom trust root on the server side
what CA to issue server certs from (they can be different)
- if a private CA will be a trust root on the clients

This translates into the config options that @jpkrohling mentioned, though that list is incomplete.

The full set for a single direction of authentication is:

public side of cert
cert to present (aka private key)
trust chain for verification of certs (sometimes delivered in the public side of the server cert, but logically separate)
trust root (to be installed on the component verifying the creds). Note that for client certs this is typically supplied explicitly, even if it is a sub-CA, because you don't want any cert from the root CA to be considered a valid client.

So there are up to 8 unique config values in the most complex case of having two private CA's.

jpkrohling · 2018-05-08T09:02:08Z

The full set for a single direction of authentication is:

We are talking about different things here. You are probably talking about Mutual TLS authentication, whereas I'm talking about only encrypting the communication channel.

I still believe that auth should be handled at the infra layer. Mutual TLS Auth fits this scenario and can be easily accomplished by tools like Istio. At most, we should allow clients to send auth data (basic HTTP auth, bearer tokens), but that's it.

To allow secure communications, on the other hand, all we need to do is pass the cert data to the underlying handler, so, there's minimal code on our side.

rbtcollins · 2018-06-02T08:53:23Z

@jpkrohling If it's just the channel that needs encrypting, OE can be used without any certificate authority at all: I believe you're really talking about authenticating the well known endpoint and encrypting the channel, otherwise no CA would be involved in the discussion.

There's minimal code on our side for handling client certificates as well: it's really quite straightforward. I think that we should either say 'deploy all our components behind a service-mesh or similar layer, running only on localhost and using an outbound proxy', or support things fully. Doing half-a-TLS support is worse than none IMO because it leads folk into a setup that cannot grow with them.

jpkrohling · 2018-06-11T10:35:10Z

If it's just the channel that needs encrypting, OE can be used without any certificate authority at all: I believe you're really talking about authenticating the well known endpoint and encrypting the channel, otherwise no CA would be involved in the discussion.

The Certificate Authority is to tell the client side of the communication that the cert being offered by the server is to be trusted. Otherwise, there could be a man in the middle intercepting the traffic. It's particularly relevant if the server certificate was generated by an internal CA like Kubernetes' Service CA.
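
To make the role of the CA on the client side concrete, here is a small Go sketch using only the standard library. It starts a throwaway TLS server via `net/http/httptest` (a stand-in, not a Jaeger component) and calls it with two clients: one whose trust pool contains the server's certificate and one whose pool is empty. The helper names are made up for this example.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// clientWithRoots returns an HTTP client that trusts only the given pool.
func clientWithRoots(pool *x509.CertPool) *http.Client {
	return &http.Client{Transport: &http.Transport{
		TLSClientConfig: &tls.Config{RootCAs: pool},
	}}
}

// get performs a request, discards the body and returns any error.
func get(c *http.Client, url string) error {
	resp, err := c.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	_, err = io.Copy(io.Discard, resp.Body)
	return err
}

// demo calls a test TLS server once with a pool that contains the server's
// certificate and once with an empty pool, returning both results.
func demo() (trustedErr, untrustedErr error) {
	srv := httptest.NewTLSServer(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) {
			fmt.Fprintln(w, "ok")
		}))
	defer srv.Close()

	trusted := x509.NewCertPool()
	trusted.AddCert(srv.Certificate())

	trustedErr = get(clientWithRoots(trusted), srv.URL)
	untrustedErr = get(clientWithRoots(x509.NewCertPool()), srv.URL)
	return trustedErr, untrustedErr
}

func main() {
	trustedErr, untrustedErr := demo()
	fmt.Println("client that trusts the issuing cert:", trustedErr)
	fmt.Println("client with no matching CA:", untrustedErr)
}
```

The second client refuses the connection because it cannot tell the real server from an intermediary, which is the situation a configurable CA option is meant to address for certs issued by an internal CA.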

(I think I should know what OE is about, but I'm currently having a blank...)

There's minimal code on our side for handling client certificates as well: it's really quite straightforward

If we are just delegating CLI options to the underlying library, I'm all for it. But it should not be a feature of Jaeger.

Doing half-a-TLS support is worse than none IMO because it leads folk into a setup that cannot grow with them

Client Auth Cert is quite different and significantly more complex than just encrypting a pipe using TLS. I don't think we should mix this issue with auth at all.

jpkrohling · 2018-06-11T10:40:12Z

If we are just delegating CLI options to the underlying library, I'm all for it. But it should not be a feature of Jaeger.

I mean something like what is being requested by #678

justinclift · 2019-01-02T12:44:58Z

Looking at this too, for an initial small deployment on servers in Scaleway.

It seems like Jaeger Query doesn't (at present) have any support for clients wanting to access it via HTTPS/TLS.

That part should be fairly straight forward to implement, as (in the simplest case) it's just a slightly different Go library call. http.ListenAndServeTLS() instead of http.ListenAndServe()

The TLS version of the call just needs a certificate file and key file supplied.

For our use case, they'd be generated by LetsEncrypt. The cert and key files would be passed via command line, or config file argument. Something like:

--query.certificate-file string Path to the TLS certificate file
--query.certificate-key-file string Path to the key file for the TLS certificate

Does that sound reasonable? 😄
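
A minimal sketch of what this proposal could look like in Go, using only `net/http`. The two certificate flags use the names proposed above; the `query.addr` flag, the `tlsEnabled` helper and the placeholder handler are assumptions for illustration and are not taken from the Jaeger code base.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"net/http"
	"os"
)

// tlsEnabled reports whether HTTPS should be used. Setting only one of the
// two files is treated as a configuration error.
func tlsEnabled(certFile, keyFile string) (bool, error) {
	switch {
	case certFile == "" && keyFile == "":
		return false, nil
	case certFile == "" || keyFile == "":
		return false, errors.New("certificate file and key file must be set together")
	}
	return true, nil
}

func main() {
	// query.addr is a hypothetical flag used only by this sketch.
	addr := flag.String("query.addr", "", "listen address, e.g. :16686")
	cert := flag.String("query.certificate-file", "", "Path to the TLS certificate file")
	key := flag.String("query.certificate-key-file", "", "Path to the key file for the TLS certificate")
	flag.Parse()

	if *addr == "" {
		fmt.Println("no -query.addr given, nothing to serve")
		return
	}
	useTLS, err := tlsEnabled(*cert, *key)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "placeholder for the Query UI")
	})

	// The only difference between the two modes is the library call.
	if useTLS {
		err = http.ListenAndServeTLS(*addr, *cert, *key, mux)
	} else {
		err = http.ListenAndServe(*addr, mux)
	}
	fmt.Fprintln(os.Stderr, err)
	os.Exit(1)
}
```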

yurishkuro · 2019-01-02T17:23:43Z

We have precedent for TLS for storage, so should be using consistent flag names, e.g.

      --cassandra.tls                                   Enable TLS
      --cassandra.tls.ca string                         Path to TLS CA file
      --cassandra.tls.cert string                       Path to TLS certificate file
      --cassandra.tls.key string                        Path to TLS key file
      --cassandra.tls.server-name string                Override the TLS server name
      --cassandra.tls.verify-host                       Enable (or disable) host key verification (default true)

      --es.tls                                     Enable TLS
      --es.tls.ca string                           Path to TLS CA file
      --es.tls.cert string                         Path to TLS certificate file
      --es.tls.key string                          Path to TLS key file

justinclift · 2019-01-02T17:47:50Z

Ahhh. So more like this?

--query.tls.cert string   Path to TLS certificate file
--query.tls.key string    Path to TLS key file

justinclift · 2019-01-02T17:51:49Z

Hmmm, it should be possible to provide a query.tls.ca option as well, but I'd have to look into it more. Pretty sure it just means the TLS setup needs to be done a bit differently first, but that's from dodgy memory and it's been ages since I wrote TLS specific handling code. 🤷‍♂️

jpkrohling · 2019-01-15T11:42:42Z

An HTTP server typically sets only a cert (chain) and a key. The cert chain would include the CA that was used to sign the server's own cert and all upstream CAs.

iori-yja · 2019-02-28T14:28:52Z

A TLS option would be good for the collector's HTTP endpoint as well.

Use case:
I am trying to report from AWS Lambda, which usually runs outside of an AWS VPC, so the collector has to accept requests from the internet. To keep the bearer token secret, it would be nice to have a TLS connection for tracer->collector communication.

jpkrohling · 2019-04-02T13:34:05Z

To keep bearer secret, it would be nice to have TLS connection on tracer->collector communication.

On the backend side, a reverse proxy could be used for this purpose. On the client side, the env var JAEGER_ENDPOINT can be used with some clients, where an HTTPS URL would be specified.
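
On the client side, a small guard along these lines could avoid sending a bearer token over plain HTTP. This is only a sketch: `checkEndpoint` is a hypothetical helper, not part of any Jaeger client, and it does nothing beyond inspecting the URL found in `JAEGER_ENDPOINT`.

```go
package main

import (
	"fmt"
	"net/url"
	"os"
)

// checkEndpoint verifies that a collector endpoint is an absolute HTTPS URL.
func checkEndpoint(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("invalid endpoint %q: %w", raw, err)
	}
	if u.Scheme != "https" || u.Host == "" {
		return fmt.Errorf("endpoint %q is not an https:// URL", raw)
	}
	return nil
}

func main() {
	endpoint := os.Getenv("JAEGER_ENDPOINT")
	if endpoint == "" {
		fmt.Println("JAEGER_ENDPOINT is not set")
		return
	}
	if err := checkEndpoint(endpoint); err != nil {
		fmt.Println("refusing to send a bearer token:", err)
		return
	}
	fmt.Println("spans would be sent over TLS to", endpoint)
}
```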

jpkrohling · 2019-04-02T13:36:14Z

With the inclusion of gRPC between the agent and the collector, I think this item is complete, missing only official documentation about securing the UI/Query and about the communication between the client and agent.

tcolgate · 2019-06-07T13:38:34Z

The existing gRPC TLS code doesn't support authenticating the clients. In TLS terms, the normal thing to do is allow the clients to present a key/cert, and have the server verify that against a CA.

I've taken the liberty of putting together a PR, #1591
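
For readers unfamiliar with client certificate verification, here is a standard-library sketch of the idea. It is not the code from the PR and it uses a plain HTTPS test server rather than gRPC. The server is given a `ClientCAs` pool and `tls.RequireAndVerifyClientCert`, and a throwaway self-signed certificate stands in for a cert issued by a real client CA. All helper names are invented for this example.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"net/http"
	"net/http/httptest"
	"time"
)

// newClientCert creates a throwaway self-signed certificate that acts as its
// own CA. A real deployment would issue client certs from a dedicated CA.
func newClientCert() (tls.Certificate, *x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return tls.Certificate{}, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "example-agent"},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(time.Hour),
		KeyUsage:              x509.KeyUsageDigitalSignature | x509.KeyUsageCertSign,
		ExtKeyUsage:           []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
		BasicConstraintsValid: true,
		IsCA:                  true,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return tls.Certificate{}, nil, err
	}
	parsed, err := x509.ParseCertificate(der)
	if err != nil {
		return tls.Certificate{}, nil, err
	}
	return tls.Certificate{Certificate: [][]byte{der}, PrivateKey: key}, parsed, nil
}

// demo starts a server that requires client certificates and returns the
// result of one request with a certificate and one without.
func demo() (withCert, withoutCert error) {
	pair, caCert, err := newClientCert()
	if err != nil {
		return err, err
	}
	clientCAs := x509.NewCertPool()
	clientCAs.AddCert(caCert)

	srv := httptest.NewUnstartedServer(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) {
			fmt.Fprintln(w, "hello", r.TLS.PeerCertificates[0].Subject.CommonName)
		}))
	srv.TLS = &tls.Config{
		ClientCAs:  clientCAs,                      // CA(s) allowed to issue client certs
		ClientAuth: tls.RequireAndVerifyClientCert, // reject clients without a valid cert
	}
	srv.StartTLS()
	defer srv.Close()

	roots := x509.NewCertPool()
	roots.AddCert(srv.Certificate())

	call := func(certs []tls.Certificate) error {
		c := &http.Client{Transport: &http.Transport{
			TLSClientConfig: &tls.Config{RootCAs: roots, Certificates: certs},
		}}
		resp, err := c.Get(srv.URL)
		if err != nil {
			return err
		}
		return resp.Body.Close()
	}
	return call([]tls.Certificate{pair}), call(nil)
}

func main() {
	withCert, withoutCert := demo()
	fmt.Println("request with client cert:", withCert)
	fmt.Println("request without client cert:", withoutCert)
}
```

For a gRPC server, the same `tls.Config` would typically be wrapped with `credentials.NewTLS` from `google.golang.org/grpc/credentials`; that wiring is omitted here to keep the sketch free of external dependencies.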

yurishkuro · 2019-06-07T21:23:14Z

@tcolgate go for it!

We also need to implement some basic auth and/or API key

yurishkuro · 2019-09-20T16:55:59Z

replaced by #1718

karnveerayush · 2020-04-30T13:27:21Z

Hi All,

Is there a way to host the Jaeger UI over HTTPS rather than HTTP?

If it is possible, what are the steps required to achieve that?

Thanks

pavolloffay · 2020-04-30T14:09:16Z

You would have to use a separate component to secure the UI/query service. Here is one blog post that might help you: https://medium.com/jaegertracing/protecting-jaeger-ui-with-an-oauth-sidecar-proxy-34205cca4bb1?source=collection_detail----99735986d50-----37-----------------------

Also, our Kubernetes operator is able to take care of securing the UI.

yurishkuro changed the title ~~Allow security communitication between components~~ Allow secure communication between components Oct 6, 2017

yurishkuro added this to the Core Infra Best Practices milestone Nov 30, 2017

yurishkuro mentioned this issue Jun 29, 2018

Use Protobuf and gRPC for internal communications #773

Open

24 tasks

ghouscht mentioned this issue Feb 1, 2019

only insecure gRPC connection from agent to collector possible #1310

Closed

This was referenced Mar 1, 2019

HTTPS sender jaegertracing/jaeger-client-java#595

Closed

HTTPS Sender jaegertracing/jaeger-client-java#602

Closed

jpkrohling added the security label Apr 2, 2019

jpkrohling self-assigned this Apr 2, 2019

ecourreges-orange mentioned this issue Sep 20, 2019

How sampler.type=remote works #832

Closed

yurishkuro closed this as completed Sep 20, 2019
