Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Remove magic strings from Source types #57

Closed

Conversation

pmengelbert
Copy link
Contributor

@pmengelbert pmengelbert commented Dec 11, 2023

This is a draft PR. It won't pass the CI since I have not yet modified the tests.

@cpuguy83 , can you take a look and let me know if this is roughly what you had in mind for the data structures?

Reference #50

spec.go Outdated Show resolved Hide resolved
spec.go Outdated Show resolved Hide resolved
spec.go Outdated Show resolved Hide resolved
load.go Outdated
@@ -44,11 +97,15 @@ func LoadSpec(dt []byte, env map[string]string) (*Spec, error) {
}

for name, src := range spec.Sources {
updated, err := lex.ProcessWordWithMap(src.Ref, args)
ref, ok := src.GetRef()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add something like source.processArgs(args)?

Copy link
Contributor Author

@pmengelbert pmengelbert Dec 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Add source.processArgs(...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done.

load.go Outdated Show resolved Hide resolved
spec.go Outdated Show resolved Hide resolved
spec.go Outdated Show resolved Hide resolved
@pmengelbert
Copy link
Contributor Author

pmengelbert commented Dec 15, 2023

One other question for when you come back -- is Cmd a source variant, or is an option for sources that have some kind of shell environment available?

I'm thinking it should be a subfield on, for example, SourceDockerImage and SourceContext. Thoughts?

@cpuguy83
Copy link
Member

cpuguy83 commented Dec 15, 2023 via email

@pmengelbert pmengelbert force-pushed the pmengelbert/remove_magic_strings/4 branch from 509d6e0 to e224a4e Compare December 15, 2023 15:33
@pmengelbert
Copy link
Contributor Author

Yes, sub field for SourceDockerImage.

Only that field? Theoretically someone could provide an oci-layout as a build context, right? In that case, you could probably run a command also. Should I bother with that? If so, are there any other variants that should have the Cmd subfield?

@cpuguy83
Copy link
Member

Lovely how me email client is so verbose :)

Only that field? Theoretically someone could provide an oci-layout as a build context, right?

They could but we don't really account for that today.
I wouldn't worry about it right now as that's a pretty advanced case.

@pmengelbert
Copy link
Contributor Author

pmengelbert commented Dec 15, 2023

  • Fix autogenerated content (changing field names)
  • Troubleshoot e2e tests
  • Add unit tests for new private source methods

docs/spec.schema.json Outdated Show resolved Hide resolved
// Only one should be non-nil at any given time. It is considered an error
// condition if more than one is non-nil, or if all are nil.
//
// === Begin Source Variants ===
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make a SourceKind interface that all of these implement? It seems like that could be useful

Copy link
Contributor Author

@pmengelbert pmengelbert Dec 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make a SourceKind interface that all of these implement? It seems like that could be useful

I thought the same thing, but after discussion with Brian, we decided against it. That discussion is worth reading. It really made us both think hard about when to use an interface vs when not to. The tldr is:

  • No interface keeps the spec closer to simple data - makes things easier to reason about, especially when you're first perusing the code base.
  • The spec is a configuration file with a set of instructions, and should be "dumb" and not "smart"
  • Unmarshaling the config is much easier when it's just simple data
  • We still have to do switch statements to convert the source into the interface any time we need to reference or set the underlying data (ex when we do the string substitution for the build args).
  • In general, it's better not to fight go's lack of enums and unions. I personally wish the language had them, but the designers of the language made the decision not to implement them. Go's designers had a lot of opinions about how to keep the language simple (for example, error handling is manually handled everywhere; or the only possible loop is a for loop). Fighting against the grain can make things pretty awkward (even more awkward than the switch/case statements).
  • An interface is generally better suited to situations where
    • The call-site is not concerned with potential side-effects within the interface's implementation(s).
    • The behavior belongs to a separate package or an unrelated type
    • A failure to implement an interface on [a type that's added later] causes a compile-time error. That wouldn't happen in this case: say we have a SourceKind field on Source, and we add a new SourceKind, but forget to make it satisfy the interface. We only catch the error if we remember to test it, which is never a given, because humans are humans.

With all that said, I'm curious how you would see the interface functioning here. Can you think of an implementation that would actually buy us something for the trouble? It's quite possible I just had the wrong shape in mind when I was thinking of how to do it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, thank you for your detailed response here! I went back and read the original thread, and it was an interesting read. I agree with many of the points.

For the sake of discussion, I will explain what I was thinking, even though it perhaps isn't a perfect fit.


Instead of adding a SourceKind field to Source, we can keep the fields separate (as they are now) and add a helper method to upcast them into an interface when we select which one is set (I think this is fairly similar to what you were suggesting in one of your comments in the thread).

However, the upcasting would be done in such a way that we could get checks at compile time to ensure that each new source kind implements the SourceKind interface.

Here is an example of what I was thinking:

type SourceKind interface {
    Present() bool // this is perhaps one of the hackier parts, basically returns whether or not we have a nil interface,
                           // and has to be implemented for each source kind
    LLB() ...
    SetRef()
    ...
}

Then GetSource() looks like:

func (s *Source) GetSource() (SourceKind, error) {
	found := false
	var use SourceKind = nil
       // if a variant does not implement our interface, we will not be able to put it in the array here
      // without a compile-time error
	for _, s := range []SourceKind{src.DockerImage, src.HTTPS, ... /*add any new variants here*/ } {
		if s.Present() && found {
			return nil, errors.New("multiple variants set")
		} else if s.Present() {
			found = true
			use = s
		}
	}

	if use == nil {
		return nil, errors.New("all nil")
	}

	return use, nil
}

Then, anywhere we would use a switch statement, we use GetSource to select the proper variant. For example,

func (s *Source) NeedsToSelectSourceKind() {
      sourceToUse, err := s.GetSource()
      if err != nil {
          // handle error, these are good sanity checks
      }
      
       // do something with source to use


    }

This eliminates the switch statements and adds built-in checking to ensure that the set of source fields are behaving like a union (so this simplifies the duplicates/unset checking in my view). Also, anytime we add a new source, we only need to add it in one place to check that it implements the right functionality: we simply update the GetSource method to include that source in its array of options, and we will get compile-time errors if it doesn't satisfy the SourceKind interface.


That being said, I am kind of torn since the methods that would be added to SourceKind are sort of inappropriate. I.e., if SourceKind is part of the spec, as Brian was saying, it makes more sense to have a function that takes in the spec and creates llb state based on it, rather than having multiple smaller implementations delegated as methods defined ON a SourceKind. It is weird to have a spec that generates the llb state, it's a spec, not a state creator.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing we could do is on each of the pointer types implement something like

// naming is hard
type AsLLBState interface {
  AsState(p string, includes, excludes []string) (llb.State)
  IsDir() bool
}

Then the switch effectively becomes:

func GetSource(s *Spec, src Source, name string) (st llb.State, isDir bool, err error) {
  // validate all the things

  // load the source
  switch {
  case src.Build != nil:
    return src.Build.AsState(), src.Build.IsDir(), nil
  // other cases
  default:
    panic("something is wrong")
  }
}

I'd be fine with this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After working on this a bit, we found that this works better.

// naming is hard
type AsLLBState interface {
  AsState(p string, includes, excludes []string) LLBGetter
  IsDir() bool
}

@adamperlin
Copy link
Contributor

My biggest high-level feedback so far is to ask if we can encapsulate the different sources under an interface, say SourceKind. In my opinion, that would make some of the code around handling the different source cases cleaner, as the cases would be split up into separate implementations of interface methods.

load.go Show resolved Hide resolved
load.go Outdated Show resolved Hide resolved
// Only one should be non-nil at any given time. It is considered an error
// condition if more than one is non-nil, or if all are nil.
//
// === Begin Source Variants ===
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, thank you for your detailed response here! I went back and read the original thread, and it was an interesting read. I agree with many of the points.

For the sake of discussion, I will explain what I was thinking, even though it perhaps isn't a perfect fit.


Instead of adding a SourceKind field to Source, we can keep the fields separate (as they are now) and add a helper method to upcast them into an interface when we select which one is set (I think this is fairly similar to what you were suggesting in one of your comments in the thread).

However, the upcasting would be done in such a way that we could get checks at compile time to ensure that each new source kind implements the SourceKind interface.

Here is an example of what I was thinking:

type SourceKind interface {
    Present() bool // this is perhaps one of the hackier parts, basically returns whether or not we have a nil interface,
                           // and has to be implemented for each source kind
    LLB() ...
    SetRef()
    ...
}

Then GetSource() looks like:

func (s *Source) GetSource() (SourceKind, error) {
	found := false
	var use SourceKind = nil
       // if a variant does not implement our interface, we will not be able to put it in the array here
      // without a compile-time error
	for _, s := range []SourceKind{src.DockerImage, src.HTTPS, ... /*add any new variants here*/ } {
		if s.Present() && found {
			return nil, errors.New("multiple variants set")
		} else if s.Present() {
			found = true
			use = s
		}
	}

	if use == nil {
		return nil, errors.New("all nil")
	}

	return use, nil
}

Then, anywhere we would use a switch statement, we use GetSource to select the proper variant. For example,

func (s *Source) NeedsToSelectSourceKind() {
      sourceToUse, err := s.GetSource()
      if err != nil {
          // handle error, these are good sanity checks
      }
      
       // do something with source to use


    }

This eliminates the switch statements and adds built-in checking to ensure that the set of source fields are behaving like a union (so this simplifies the duplicates/unset checking in my view). Also, anytime we add a new source, we only need to add it in one place to check that it implements the right functionality: we simply update the GetSource method to include that source in its array of options, and we will get compile-time errors if it doesn't satisfy the SourceKind interface.


That being said, I am kind of torn since the methods that would be added to SourceKind are sort of inappropriate. I.e., if SourceKind is part of the spec, as Brian was saying, it makes more sense to have a function that takes in the spec and creates llb state based on it, rather than having multiple smaller implementations delegated as methods defined ON a SourceKind. It is weird to have a spec that generates the llb state, it's a spec, not a state creator.

@pmengelbert pmengelbert force-pushed the pmengelbert/remove_magic_strings/4 branch 2 times, most recently from 499bad4 to af12b8c Compare January 9, 2024 15:58
@pmengelbert
Copy link
Contributor Author

pmengelbert commented Jan 9, 2024

Looks like mounts aren't implemented correctly yet.

  • Troubleshoot Sources.[].DockerImage.Cmd.Mounts implementation.

@pmengelbert pmengelbert force-pushed the pmengelbert/remove_magic_strings/4 branch 3 times, most recently from 33ab5fd to 083b264 Compare January 9, 2024 17:19
@pmengelbert
Copy link
Contributor Author

pmengelbert commented Jan 9, 2024

  • Fix docs/examples specs.

@pmengelbert pmengelbert marked this pull request as ready for review January 9, 2024 21:48
@pmengelbert
Copy link
Contributor Author

@cpuguy83 @adamperlin This is ready for a real review. The spec changes are fully passing the tests and the examples are up to date. I will make a pass through tomorrow to add any unit tests that may be necessary.

I will also take a look at the code organization, as there may be some things commented out or out of place.

df.yml Outdated Show resolved Hide resolved
load.go Outdated Show resolved Hide resolved
source.go Outdated Show resolved Hide resolved
spec.go Outdated Show resolved Hide resolved
spec.go Show resolved Hide resolved
spec.go Show resolved Hide resolved
spec.go Outdated Show resolved Hide resolved
spec.go Outdated Show resolved Hide resolved
Signed-off-by: Peter Engelbert <pmengelbert@gmail.com>
Signed-off-by: Peter Engelbert <pmengelbert@gmail.com>
To pass the linter

Signed-off-by: Peter Engelbert <pmengelbert@gmail.com>
Signed-off-by: Peter Engelbert <pmengelbert@gmail.com>
Because of the patching PR.

Signed-off-by: Peter Engelbert <pmengelbert@gmail.com>
We can no longer reference other sources. Instead, we can specify a
`local` source with the current directory in order to build the
frontend.

Signed-off-by: Peter Engelbert <pmengelbert@gmail.com>
Signed-off-by: Peter Engelbert <pmengelbert@gmail.com>
Signed-off-by: Peter Engelbert <pmengelbert@gmail.com>
Signed-off-by: Peter Engelbert <pmengelbert@gmail.com>
Signed-off-by: Peter Engelbert <pmengelbert@gmail.com>
Signed-off-by: Peter Engelbert <pmengelbert@gmail.com>
Additionally, ensure that the defaults are filled correctly for the
`context` source type (which made the `local` type redundant).

Signed-off-by: Peter Engelbert <pmengelbert@gmail.com>
We're no longer using the schemes

Signed-off-by: Peter Engelbert <pmengelbert@gmail.com>
Signed-off-by: Peter Engelbert <pmengelbert@gmail.com>
Use them instead of pointers. Also, validate them up-front in the
`validate` function, so that they do not need to be checked later.

Signed-off-by: Peter Engelbert <pmengelbert@gmail.com>
Signed-off-by: Peter Engelbert <pmengelbert@gmail.com>
Signed-off-by: Peter Engelbert <pmengelbert@gmail.com>
Also, validate sources recursively.

Signed-off-by: Peter Engelbert <pmengelbert@gmail.com>
@pmengelbert pmengelbert force-pushed the pmengelbert/remove_magic_strings/4 branch 2 times, most recently from 8372b4c to d43ff4b Compare January 12, 2024 16:43
Also:
- update `processArgs` to do substitutions inline.
- update `fillDefaults` to set the default Source.Path to `"."` when the
  Source type is Context.

Signed-off-by: Peter Engelbert <pmengelbert@gmail.com>
@pmengelbert pmengelbert force-pushed the pmengelbert/remove_magic_strings/4 branch from d43ff4b to d76f447 Compare January 12, 2024 16:47
Signed-off-by: Peter Engelbert <pmengelbert@gmail.com>
Signed-off-by: Peter Engelbert <pmengelbert@gmail.com>
}
case s.Git != nil:
fields := []*string{
&s.Git.URL,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sub function usage seems like a bit of a run-around with pointers.
In all cases we can do something like:

s.Git.URL, err = lex.ProcessWordWIthMap(s.Git.URL, args)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines +144 to +153
switch count {
case 0:
errs = goerrors.Join(errs, fmt.Errorf("no non-nil source variant"))
case 1:
// success condition
default:
errs = goerrors.Join(errs, fmt.Errorf("more than one source variant defined"))
}

return errs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
switch count {
case 0:
errs = goerrors.Join(errs, fmt.Errorf("no non-nil source variant"))
case 1:
// success condition
default:
errs = goerrors.Join(errs, fmt.Errorf("more than one source variant defined"))
}
return errs
switch {
case 0:
return goerrors.Join(errs, fmt.Errorf("no non-nil source variant"))
case 1:
return nil
default:
return goerrors.Join(errs, fmt.Errorf("more than one source variant defined"))
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, the reason I didn't do this is because I check the mounts and the source types. If there are multiple errors I thought it might be useful to return all of them. Instead, I've changed it to report errors in the mounts immediately, and then added the above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(done)

count++
}
if s.Build != nil {
if s.Build.Source.Build != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe consider placing in a separate method, since this case is more involved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -22,6 +24,135 @@ func knownArg(key string) bool {

const DefaultPatchStrip int = 1

func (s *Source) processArgs(args map[string]string) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we perhaps name this function something a little more specific? Maybe substituteArgs or similar?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants