Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor: make sources more pluggable #4035

Merged
merged 1 commit into from
Aug 16, 2023
Merged

Conversation

vito
Copy link
Contributor

@vito vito commented Jul 19, 2023

Sources are a pretty neat extension point, except there are a few code paths that hard-code against each type. This moves code around and adjusts interfaces so that Source implementations are self-contained and merely need to be registered with the *source.Manager.

Code change cliff's notes:

  • source.Source now exposes Schemes() []string instead of ID() string
    • It is an array to handle http:// and https://.
    • This could maybe be used for docker-image:// and oci-layout://, but I left them as-is for now (one implementation with two Sources registered with different configs).
  • *source.Manager now stores a mapping from schemes to Sources
  • *source.FooIdentifier types now live in each respective source/foo/ package
  • source.Source now creates source.Identifier
  • source.Identifier now implements Capture(cap *provenance.Capture, pin string)
  • source.Identifier now implements Scheme() instead of ID() (feels a bit clearer)

If this is merged I won't need #3960 anymore because I can just implement my own custom sources. Nevermind, these are orthogonal; I can't use a custom source to inject network config into ExecOp, and I can already override sources without this change. Merging #3960 would still be the cleanest solution for me, but I think this refactor is worthwhile anyway.

@vito vito force-pushed the pluggable-sources branch 4 times, most recently from 85f35d5 to e6ddffc Compare July 19, 2023 19:41
@vito vito marked this pull request as ready for review July 19, 2023 19:41
@vito
Copy link
Contributor Author

vito commented Jul 20, 2023

This is ready for a look; I've just been able to use it to implement custom sources out-of-band (assuming direct access to the *session.Manager). 🎉

Copy link
Member

@jedevc jedevc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🎉

source/http/httpsource.go Outdated Show resolved Hide resolved
@jedevc jedevc requested a review from tonistiigi July 21, 2023 09:20
Copy link
Member

@jedevc jedevc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, PTAL @tonistiigi when you have the chance 🎉

@tonistiigi
Copy link
Member

Could you explain more how you plan to use this? I'm afraid that this can be used to create new custom source schemas that are incompatible with buildkitd, client and frontends. I'd like to avoid that there is a versioned LLB or some LLB with extensions that only works on some systems.

@vito
Copy link
Contributor Author

vito commented Aug 2, 2023

@tonistiigi As an alternative to #3960 we have overridden the HTTP and Git sources with implementations that support our container-to-container DNS scheme.

Right now we have to smuggle values through the existing Identifier fields, for example by setting the URL to a JSON encoded value containing the URL + other values. This refactor would allow us to just pass those values along as regular Identifier fields, and let us decide whether we want to replace the original sources or just add ours alongside it.

So tl;dr is we plan to use this as an escape hatch. There seems to be some interest in figuring how c2c/c2h/h2c networking could work in Buildkit, which makes sense to carefully design, I just don't want to be blocked until we figure it out. This way we can treat Dagger like a testing ground for ideas that can graduate into Buildkit, possibly after a total redesign or two.

@tonistiigi
Copy link
Member

I'm trying to avoid the case where there are source schemas that the LLB format does not actually support. Could we reorganize this so there is still only a fixed list of source types that SourceManager is initialized with instead of the Register pattern, but you can still (temporarily) switch the implementation for a specific schema.

@tonistiigi
Copy link
Member

Maybe something like

source.NewManager(source.Backend{
  Local: source.Source{},
  Git: source.Source{},
  HTTP: source.Source{},
})

@jedevc
Copy link
Member

jedevc commented Aug 15, 2023

@vito does @tonistiigi's suggestion sound good to you? I'm happy to pick up the last bits of refactoring to take this over the finish line if you're a bit busy atm.

I've been doing some work around git sources in #4142, and have been wishing for the improved structure from this PR 🎉

@vito
Copy link
Contributor Author

vito commented Aug 15, 2023

It kind of defeats the purpose of this PR, because we can already replace sources, we'd just like to not have to hack values into the existing Identifier types. It seems strange to me to have an ostensibly pluggable interface with a generic wire format (name + attrs) but then restrict it to a few blessed types. I get that it could lead to LLB formats that not every system understands, but that problem already exists today, it's just formalized with API caps, which we could continue to use, and try to avoid passing our "fancy" LLB into things that wouldn't understand it (e.g. other frontends). To me, the pros outweigh the cons; it doesn't seem like refactoring an internal API in this way would lead to a sprawl of incompatible formats.

In any case, we're not blocked on this anymore, so feel free to close it. I figured a regular old refactor wouldn't be controversial but maybe there are other downsides I'm not seeing. 🤷‍♂️

@jedevc
Copy link
Member

jedevc commented Aug 15, 2023

@vito I think I agree with you here, given that this PR doesn't really enable anything that wasn't possible before, I think it's fine as is.

To me, having the ability to extend the HTTPIdentifier and then handle it in a custom HTTPSource is much better than having to try and cram desired properties into other fields - that breaks the compatibility guarantee even more.


I also agree with @tonistiigi though, I think it's important that we try and steer away from splitting LLB by ending up with entirely custom sources, e.g. my-custom-scheme:// (though maybe that's something we shouldn't do in code). You could still fake that today! (by just completely messing with an identifier, and changing all the fields, while keeping the same scheme, or even just defining your own Worker.ResolveOp to define your own SourceOp) - arguably, having a proper way to add this scheme, keeps LLB compat better.

I'm happy to merge this, I think having a discussion around adding completely custom schemes and sources to match them falls outside of the scope of this PR, so we should continue that separately if we need to.

source/git/identifier.go Outdated Show resolved Hide resolved
source/http/identifier.go Outdated Show resolved Hide resolved
Sources are a pretty neat extension point, except there are a few code
paths that hard-code against each type. This moves code around and
adjusts interfaces so that Source implementations are self-contained and
merely need to be registered with the source.Manager.

Signed-off-by: Alex Suraci <suraci.alex@gmail.com>
@jedevc jedevc force-pushed the pluggable-sources branch from 9696f2e to 6b27487 Compare August 16, 2023 09:02
@jedevc
Copy link
Member

jedevc commented Aug 16, 2023

Rebased and squashed (and fixed up above nits), to prep for merging.

Copy link
Member

@jedevc jedevc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get this in - this refactor will make my life a lot more pleasant ❤️

Happy to write a follow-up PR if we're still concerned about custom schemes, but given this is all internal, people messing with these APIs should know what they're doing 👀

@jedevc jedevc merged commit 03ccf0f into moby:master Aug 16, 2023
@vito
Copy link
Contributor Author

vito commented Aug 16, 2023

@jedevc Thanks for tidying up those nits! 🙏

@vito vito deleted the pluggable-sources branch August 16, 2023 18:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants