Dependency Management #3371

Open
jaspervdj-luminal opened this issue Apr 12, 2021 · 6 comments

@jaspervdj-luminal
Contributor

I'd like to kick off some discussion around adding explicit dependencies to
Rego.

The primary use case I have in mind is sharing code. At @fugue we have a bunch
of places in our product where either internal or external users can edit and
add Rego code. The interface is usually just a text editor, so we're limited
to single files. This works really well, but it becomes harder when you want
to share code between policies. I have a strong suspicion we're not the
only ones in this position.

So what can we do to solve this?

One idea is of course building a good editor where users can edit a directory
tree or multiple files. This is complicated and leads to other issues, where
parts of the library of Rego code can be broken at any point, and so on.

Another alternative is allowing people to upload bundles, but this makes it
less accessible for less technical users, since bundles need to be created
using the CLI.

Since both of these options have significant downsides, I think there's a better
way: allowing OPA to retrieve policies from external sources. Here's a strawman
proposal.

We'll add a new require keyword that introduces explicit dependencies in
Rego. require tells the agent that a policy requires other Rego files or
bundles to be loaded.

I can imagine this could support a number of different protocols:

require "/absolute/path.rego"
require "relative/path.rego"
require "https://github.com/user/repo/v0.1.0/library.rego"
require "ipfs://..."
require "git://..."

HTTPS is probably the most useful one and would make for a good MVP. A nice
addition would be to optionally specify a cryptographic hash of the dependency
so we have certainty about the code we're running.
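To make the hash idea concrete, here is a minimal sketch of what checking a pinned dependency could look like. It is not how OPA does anything today; the function name `verify_dependency` and the choice of SHA-256 are assumptions made for illustration, and the dependency body is passed in as bytes so the sketch runs without network access.

```python
import hashlib
import hmac


def verify_dependency(content: bytes, expected_sha256: str) -> bytes:
    """Return content unchanged if its SHA-256 digest matches the pin.

    Hypothetical helper: the name and the use of SHA-256 are assumptions.
    """
    actual = hashlib.sha256(content).hexdigest()
    # compare_digest does a constant-time comparison of the two hex strings.
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise ValueError(f"hash mismatch: expected {expected_sha256}, got {actual}")
    return content


# A stand-in for a fetched library file.
library = b"package lib\n\np := 7\n"
# In practice the pin would be written next to the require statement.
pin = hashlib.sha256(library).hexdigest()

verify_dependency(library, pin)  # passes silently
```

If the fetched bytes differ from what was pinned, the call raises instead of handing the code to the compiler.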

I first tried to make this part of import. However, surprisingly, it is
completely orthogonal to the import keyword. I think of import more as
aliasing long data.foo.bar.qux paths to just qux. You can require a bundle
and import different subpackages. Or you can require some files and not
import them at all. In either case, I think it's nice that we don't need to
further complicate import.

Of course, OPA needs to have an option to either turn this off fully or
make the cryptographic hash a requirement.
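One way to picture that option is a three-way setting that gates each remote dependency. The sketch below is only an illustration; the `RemoteDeps` modes and the `may_load` function are hypothetical names, not existing OPA configuration.

```python
from enum import Enum
from typing import Optional


class RemoteDeps(Enum):
    """Hypothetical setting; these mode names are assumptions."""
    DISABLED = "disabled"          # never load remote dependencies
    REQUIRE_HASH = "require-hash"  # load only when a hash is pinned
    ALLOWED = "allowed"            # load with or without a pinned hash


def may_load(mode: RemoteDeps, pinned_hash: Optional[str]) -> bool:
    """Decide whether a single remote dependency may be loaded."""
    if mode is RemoteDeps.DISABLED:
        return False
    if mode is RemoteDeps.REQUIRE_HASH:
        return bool(pinned_hash)
    return True


print(may_load(RemoteDeps.REQUIRE_HASH, None))  # False
```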

@tsandall
Member

tsandall commented Jul 6, 2021

I took a stab at this last week and the results are quite promising. Here's an example w/ opa build:

[asciicast recording]

A few comments:

  • The implementation uses the github.com/open-policy-agent/opa/refactor package to namespace dependencies. Dependencies are namespaced under the current package. For example:
package x

import "https://openpolicyagent.org/x/lib.rego" as foo

p := foo.lib.p

Assuming lib.rego is defined as:

package lib

p := 7

The namespace for lib would be data.x.foo.lib.

  • The implementation uses import instead of a new keyword. I went with import because it felt like most of the ast package changes to introduce require were duplicating what we do for import. The only thing I don't like about using import is that the effect for URLs is different from the effect for paths (e.g., when you import "foo" as x, the statement defines a virtual document at <package>.x, unlike when you import data.foo as x, which only creates an alias inside the current file). This inconsistency could cause confusion for new users. However, at the same time, I could imagine them being confused by two separate keywords that at first glance mean the same thing.

  • This is just a prototype. There are a bunch of features that could be added: caching (dependencies are currently fetched on every compile), deduplication (if the same dependency is imported in multiple places, it will be duplicated currently), hash pinning, authentication (no authentication is supported currently), parallel fetching (dependencies are fetched in serial currently), additional protocols (http only right now), etc.
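Two of the points in this list can be sketched in a few lines: the namespacing rule (a dependency lands under the importing package and its alias) and the caching, deduplication, and parallel fetching mentioned as possible follow-ups. This is not the prototype's code; `namespaced_ref`, `fetch_all`, and `fake_fetch` are hypothetical names, and the fetch function is injected so the sketch runs offline.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Optional


def namespaced_ref(importing_package: str, alias: str, dependency_package: str) -> str:
    """Build the data path for a dependency namespaced under its importer."""
    return ".".join(["data", importing_package, alias, dependency_package])


def fetch_all(
    urls: Iterable[str],
    fetch: Callable[[str], bytes],
    cache: Optional[Dict[str, bytes]] = None,
    max_workers: int = 4,
) -> Dict[str, bytes]:
    """Fetch each distinct URL at most once, in parallel, reusing cached bodies."""
    cache = {} if cache is None else cache
    wanted = list(dict.fromkeys(urls))  # dedupe while keeping first-seen order
    missing = [url for url in wanted if url not in cache]
    if missing:
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            for url, body in zip(missing, pool.map(fetch, missing)):
                cache[url] = body
    return {url: cache[url] for url in wanted}


calls = []


def fake_fetch(url: str) -> bytes:
    # Stand-in for an HTTP GET so the sketch runs without a network.
    calls.append(url)
    return b"package lib\n\np := 7\n"


cache: Dict[str, bytes] = {}
lib = "https://openpolicyagent.org/x/lib.rego"
fetch_all([lib, lib], fake_fetch, cache)  # duplicate import, one fetch
fetch_all([lib], fake_fetch, cache)       # second compile, served from cache
print(len(calls))                         # 1
print(namespaced_ref("x", "foo", "lib"))  # data.x.foo.lib
```

Passing the same `cache` dictionary across compiles is what avoids the refetch; a real implementation would also need to decide when cached entries expire.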

@srenatus
Contributor

I think implementing something like this: https://deno.land/manual@v1.13.1/linking_to_external_code/integrity_checking should remediate many concerns brought forward in the GK discussion: availability is done (vendored copy), security too (integrity checks, committed vendor dir), latency/networking.

@lcarva
Contributor

lcarva commented Nov 23, 2022

Adding an external dependency directly into the policy file via import statements has its shortcomings. Taking inspiration from package managers for other languages, having a dedicated file where the dependencies are defined and, more importantly, pinned to a particular version/digest goes a long way in making the process manageable in the long run.
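As a rough illustration of such a dedicated file, the sketch below checks that every entry in a dependency manifest is pinned to a digest. The JSON layout, the `dependencies`, `url`, and `sha256` keys, and the `unpinned_dependencies` function are all assumptions made for this example; no such format is defined for OPA in this thread.

```python
import json
import re
from typing import List

# A pin is accepted only if it is a full lowercase SHA-256 hex digest.
SHA256_HEX = re.compile(r"[0-9a-f]{64}")


def unpinned_dependencies(manifest_text: str) -> List[str]:
    """Return the names of dependencies that lack a valid sha256 pin."""
    manifest = json.loads(manifest_text)
    return [
        name
        for name, spec in manifest.get("dependencies", {}).items()
        if not SHA256_HEX.fullmatch(spec.get("sha256", ""))
    ]


# Hypothetical manifest: one pinned entry, one unpinned entry.
MANIFEST = json.dumps({
    "dependencies": {
        "lib": {"url": "https://openpolicyagent.org/x/lib.rego", "sha256": "ab" * 32},
        "other": {"url": "https://example.com/other.rego"},
    }
})

print(unpinned_dependencies(MANIFEST))  # ['other']
```

A build step could refuse to proceed whenever this list is non-empty, which keeps the pinning decision out of the individual policy files.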

@anderseknert
Member

ODM is a really cool side project by @johanfylling exploring this space. Early stage still, but definitely looks promising.

@jeffchao

Ancient bump!

I was talking to @johanfylling about this offline. Putting notes here.

ODM seems like a good start and something we're starting to play with. That said, for developer experience, it would be nice to see:

  1. A central registry, something like npm, pkg.go.dev, etc., that is viewable from a browser and that collaborators can contribute to
  2. Registry packages that can be used in Rego files, specifically with the import reserved word
  3. Lockfiles with versioning, either semver or tied to a SHA to start
  4. A CLI tool (e.g., ODM) to manage versions and dependencies
  5. Option to vendor dependencies (or not)

There are existing projects like Open Policy Containers [0], which approaches this with OCI-compatible bundles and so gets all the versioning goodness. It's already integrated with OPA, but at the agent layer, I believe. This means I can't pull down external/public policies and evaluate them directly in memory, in Go, for example; I would need to deploy a sidecar.

Ultimately, it would be nice to create the ability to view, share, and collaborate on policies. There are compliance standards and foundational infra practices (RBAC, ABAC, etc.) which could benefit from reusable policies. HashiCorp has something of an attempt at this with Sentinel [1], and there's prior art in the osquery space as well [2].

[0] https://openpolicycontainers.com/
[1] https://registry.terraform.io/browse/policies
[2] https://fleetdm.com/queries

@anderseknert
Member

> Ultimately, it would be nice to create the ability to view, share, and collaborate on policies.

Agreed, although I think GitHub (or GitLab, or whatever) is the better place for that than some central registry. I like the way pkg.go.dev does this (as you mentioned) where they don't really manage a registry per se, but mirror content from GitHub / Git, and present it along with docs (which for Rego could be rendered from metadata annotations) and other metadata.

Projects
Status: Backlog
Development

No branches or pull requests

7 participants