Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Need https.send built-in supporting TLS secured connections #1067

Closed
gnealhou opened this issue Nov 14, 2018 · 17 comments
Closed

Need https.send built-in supporting TLS secured connections #1067

gnealhou opened this issue Nov 14, 2018 · 17 comments

Comments

@gnealhou
Copy link

Expected Behavior

Open to suggestions, but some goals to keep in mind:

  • Goal is to get data from (typically large) remote services that require TLS security.
  • For security reasons, we probably don't want to store certificates and certificate keys as policy objects
  • For performance reasons, we probably don't want to read certificate/keys from file every time we call https.send built-in

So I have two alternatives. First:

https.config(certificate, key, CAArray, output), where:

  • certificate is the path to the certificate file
  • key is the path to the key for the certificate file
  • CA is a list/array of additional certificate authorities to add to the CA pool via .AppendCertsFromPEM(in addition to the system CA's loaded via x509.SystemCertPool)
  • output stores the results, but either (1) does not make key data visible if viewing output, or (2) is encrypted using a key generated at startup and stored in memory (not 100% secure, but better than storing it plaintext).

https.send(config, request, output), where:

  • config is the output of https.loadCerts, above
  • request is the same as the request parameter in http.send
  • output is the same as the output parameter in http.send

Second Option:

  • during OPA startup, specify a config file that has all the information necessary to create the multiple https configurations -- the name of the configuration, path to certificate, path to key file, additional CA's to load. For example:
    {['service1', "/etc/certs/service1-cert.pem", "/etc/certs/service1-key.pem", []],
    ['service2', "/etc/certs/service2-cert.pem", "/etc/certs/service2-key.pem", ["/etc/certs/internal-CA.pem"]}
  • https.send would take one of the configurations, above instead of the 'config' paramater. Example:
    https.send("service1",{"method": "get", "url": "http://www.openpolicyagent.org/", "headers": {"X-Foo":"bar", "X-Opa": "rules"}}, output)

Actual Behavior

Existing http.send built-in does not appear to support loading certificates, keys, or certificate authorities necessary to establish a TLS secured HTTPS connection.

Additional Info

A documented example would be nice, too.

@tsandall
Copy link
Member

I think we can satisfy the requirements without introducing new built-in functions.

We can extend the http.send built-in to provide TLS certificates as input. Something like...

# Grab runtime information for OPA that contains environment vars
runtime := opa.runtime()

# Execute HTTP request with certificate from environment var
response := http.send({
  "url": "https://example.com",
  "method": "get",
  "cert": runtime.env["MY_CLIENT_CERT"], 
})

The http.send implementation can lazily update the certificate pool as necessary. Similarly, we could support a cert_file parameter in the input to http.send that lazily loads certs from disk.

WDYT?

@BenderScript
Copy link
Contributor

Extending http.send would be more straightforward. And with the new runtime command we could get keys from environment variables.

@gnealhou
Copy link
Author

gnealhou commented Dec 4, 2018

We can extend the http.send built-in to provide TLS certificates as input. Something like...

The http.send implementation can lazily update the certificate pool as necessary. Similarly, we could support a cert_file parameter in the input to http.send that lazily loads certs from disk.

WDYT?

That's great. It covers the critical points -- supports TLS encrypted connections and reasonably efficient file usage. I'm OK with restarting OPA to reload new certs -- it shouldn't happen very often, and restarting OPA should be painless in most environments.

@tsandall
Copy link
Member

tsandall commented Dec 4, 2018

Cool! I'm going to mark this issue as "help wanted" since I think the work to implement it is fairly localized. We can certainly work on this in the next month or two if no one picks it up.

@BenderScript
Copy link
Contributor

I will take a stab at this one given I worked on http.send before

@BenderScript
Copy link
Contributor

A few design questions:

When a new CA root (self-signed or not) is given should we:

1 - patch the host's CA collection? This most likely requires root and is OS specific.
2 - patch only an in memory copy such as SystemCertPool()?
3 - Create a new cert pool such as x509.NewCertPool()?

@srenatus Seems to be using (3) but I have seen both (2) used in many client examples.

@gnealhou
Copy link
Author

gnealhou commented Jan 23, 2019 via email

@tsandall
Copy link
Member

I think 2 sounds good as long as the "patching" does not impact other calls. I'd imagine we could implement this by maintaining a cert pool for each unique http.send call. When constructing the cert pool, the implementation copies the system cert pool. WDYT?

@BenderScript
Copy link
Contributor

I certainly see value in both. For simplicity reason maybe I want to load new certs (or periodically) and be done and not to bother with getting every https call correct. Maybe I want to reload for specific calls (testing or walled garden comes to mind).

Loading once during start-up and patching is certainly easy. And certainly we can give the option to reload fresh for certain calls. During HTTPS testing I load/reload certs many times. But after I get all infra running I do not touch it anymore unless rotation is needed or need to on-board something new.

@tsandall
Copy link
Member

My thinking was to have a cache of cert pools on the built-in context. The cache would be keyed by the method, URL, certificate properties of the http.send call. When the built-in is evaluated, the cache would be populated if needed.

You bring up a good point that sometimes the system pool might not be wanted. We could make this behaviour optional. I'm not sure what the right default behaviour is.

Does this sound correct?

@BenderScript
Copy link
Contributor

Yes, but some details we still need to iron out.

The "certificate properties" part of the cache is not clear to me, what do you mean by that? Usually https objects are cached like any other based header directives.

@tsandall
Copy link
Member

If you have two http.send calls with the same method and URL but different certificate properties then the built-in needs to be able to disambiguate them. E.g.,

http.send({"url": "https://example.com/foo", "method": "get", "cert": "AAAA"}, response) and http.send({"url": "https://example.com/foo", "method": "get", "cert": "BBBB"}, response)

If you don't use the cert properties in the cache key, you'll get different answers depending on order of execution.

@BenderScript
Copy link
Contributor

I agree with that, but which cert properties should we deem unique? Certificates have lots of info and the file name or cert name is not enough to disambiguate.

@tsandall
Copy link
Member

I don't think we have to reach into the certificate itself. Only the properties that are provided as input to the built-in.

@BenderScript
Copy link
Contributor

I am ok but we need to document that things like below will results in false caches:

  • http.send("my-cert-ca.cert"....)

I change the contents of my-cert in the meantime (run script again or what have you) and the cached response for the second built in (below) would not be valid.

http.send("my-cert-ca.cert"...)

And certain things would result in no cache hits: same cert across two different file names.

@tsandall
Copy link
Member

tsandall commented Jan 24, 2019

The cache is only used within the context of a single policy query, so we'll get the behaviour we want.

The main thing in my mind is whether we should support reading certs by filename explicitly. If we do this, then people will end up hardcoding paths into their polices which will lead to pain down the road. E.g., in order to evaluate the policy you'll have to have your filesystem laid out correctly. EDIT: this is mitigated to some extent by mocking, so it's not a show-stopper.

@BenderScript
Copy link
Contributor

We will give the option to read certs from files and env variables. Certs are usually created as files to start with, so it is something people must be used to. I notice there are other built in functions that can take an inline cert. One example is crypto.x509.parse_certificates(string, array[object])

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants