Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

livecheck: add Options class #19293

Merged
merged 7 commits into from
Feb 25, 2025
Merged

livecheck: add Options class #19293

merged 7 commits into from
Feb 25, 2025

Conversation

samford
Copy link
Member

@samford samford commented Feb 11, 2025

  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same change?
  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your changes? Here's an example.
  • Have you successfully run brew style with your changes locally?
  • Have you successfully run brew typecheck with your changes locally?
  • Have you successfully run brew tests with your changes locally?

This adds a Livecheck::Options class, which is intended to house various configuration options that are set in livecheck blocks, conditionally set by livecheck at runtime, etc. The general idea is that when we add features involving configurations options (e.g., for livecheck, strategies, curl, etc.), we can make changes to Options without needing to modify parameters for strategy find_versions methods, Strategy methods like page_headers and page_content, etc. This is something that I've been trying to improve over the years and Options should help to reduce maintenance overhead in this area while also strengthening type signatures.

Options replaces the existing homebrew_curl option (which related strategies pass to Strategy methods and on to curl_args) and the new url_options (which contains post_form or post_json values that are used to make POST requests). I recently added url_options as a temporary way of enabling POST support without Options but this restores the original Options-based implementation.

Along the way, I added a homebrew_curl parameter to the url DSL method, allowing us to set an explicit value in livecheck blocks. This is something that we've needed in some cases but I also intend to replace implicit/inferred homebrew_curl usage with explicit values in livecheck blocks once this is available for use. My intention is to eventually remove the implicit behavior and only rely on explicit values. That will align with how homebrew_curl options work for other URLs and makes the behavior clear just from looking at the formula/cask file.

Lastly, this removes the unused rest parameter from find_versions methods. I originally added unused as a way of handling parameters that some find_versions methods have but others don't (e.g., cask in ExtractPlist), as this allowed us to pass various arguments to find_versions methods without worrying about whether a particular parameter is available. This isn't an ideal solution and I originally wanted to handle this situation by only passing expected arguments to find_versions methods but there was a technical issue standing in the way. I recently found an answer to the issue, so this also replaces the existing ExtractPlist special case with generic logic that checks the parameters for a strategy's find_versions method and only passes expected arguments.

Replacing the aforementioned find_versions parameters with Options ensures that the remaining parameters are consistent across strategies and any differences are handled by this new logic. Outside of ExtractPlist, the only other difference is that some find_versions methods have a provided_content parameter but that's currently only used by tests (though it's intended for caching support in the future). I will be renaming that parameter to content in an upcoming PR and expanding it to the other strategies, which should make them all consistent outside of ExtractPlist.


If you'd like to see what it looks like to implement a new configuration option, here's a commit from my upcoming PR to add a user_agent option: a7598a1 (the branch is livecheck/add-user_agent-option). The initial implementation of Options required a number of changes when adding a new option but the current T::Struct setup makes things fairly reasonable.

Copy link
Member

@p-linnane p-linnane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So happy to see this PR! Thanks @samford!

Copy link
Member

@nandahkrishna nandahkrishna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, thank you Sam! 🎉

Copy link
Member

@dduugg dduugg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing work, thanks especially for replacing all the unused args

@samford
Copy link
Member Author

samford commented Feb 13, 2025

Thanks for the review! I did some isolated benchmarking and memory profiling of the alternative approaches mentioned above and the results reminded me that I was optimizing for performance and minimizing memory allocation. The general idea is that we won't be modifying Options often (after upcoming PRs) but the code will be executed frequently, so I tend to favor implementations here with the least overhead and resource usage (even if it requires more repetition in the code). For what it's worth, I borrowed/adapted some of the method implementations from existing code in brew, so there's precedent for some of this.

Using the existing methods as the baseline, it shakes out like this (hopefully this is comprehensible):

to_h

Time Memory
Hash literal 1x 1x
instance_variables.to_h 2.34x 2x
instance_variables.each_with_object 3.11x 1.25x
T::Struct no props: serialize 1.9x 2x
T::Struct with props: serialize 2.89x 2x

to_hash

Time Memory
Hash literal 1x 1x
instance_variables.to_h 3.32x 2.75x
instance_variables.each_with_object 4.08x 2x
T::Struct no props: serialize 1.48x 1x?
T::Struct with props: serialize 2.58x 1x?

empty?

I've omitted memory here because the existing X.nil? && ... approach doesn't allocate memory. However, if we use instance_variables.all? as the baseline for memory, the T::Struct approaches involve 4x more memory allocations (same amount as the to_hash example, since they both use serialize).

Time
x.nil? && ... 1x
instance_variables.all? 1.91x
T::Struct no props: serialize.empty? 1.25x
T::Struct with props: serialize.empty? 1.89x

present?

Memory allocation for present? is the same as empty?, so the previous comment applies here as well.

Time
!x.nil? || ... 1x
instance_variables.any? 1.7x
T::Struct no props: !empty? 1.06x
T::Struct with props: !empty? 1.55x

That said, I can understand if there's a general consensus that something like code concision and required maintenance is more important than relatively small differences in performance/memory usage. Otherwise, I'm leaning towards sticking with these implementations even though it will mean that it takes maybe an extra minute to add a new option.

I'll respond to a few of the comments above with some contextual explanations.

Copy link
Member

@nandahkrishna nandahkrishna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the benchmarking here, I agree that there's value in ensuring these method calls run in as little time and use as little additional memory as possible – and indeed, these are going to be run perhaps an order of magnitude more often than they are updated. (Although, there is a case to be made for livecheck being slower – that way we'll be rate-limited less often – I'm mostly joking 😆.)

I'm happy to approve this PR as-is and see it merged, but I guess there's no harm in waiting to see if others have comments or disagree with the reasoning here.

Copy link
Member

@dduugg dduugg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I won't dismiss the prior approval, but I feel this code could be a lot more amenable to future maintainers. It should use actual livecheck benchmarks with solid (> 1%) improvements before introducing optimizations, rather than trying to shave an allocation here and there. (Just my 2¢, I'm still very happy with your work here.)

@samford
Copy link
Member Author

samford commented Feb 13, 2025

I won't dismiss the prior approval, but I feel this code could be a lot more amenable to future maintainers. It should use actual livecheck benchmarks with solid (> 1%) improvements before introducing optimizations, rather than trying to shave an allocation here and there. (Just my 2¢, I'm still very happy with your work here.)

I may not have made it clear in my previous comments but I think your position's entirely reasonable and I can get behind it. From the beginning, I was unsure whether the maintenance burden was going to be worth it for some small optimizations, so I appreciate that you've been willing to advocate for what you see as a better solution.

Though there's a part of me that always wants to minimize resource usage (when possible/reasonable), I understand that the technical gains are less significant than the added maintenance burden. I've acknowledged in the past that the latency involved with even one network request completely dwarfs optimizations like this (as you mentioned) but I always appreciate a reminder to put things in perspective. [Regarding maintenance burden, I initially overlooked one or two areas while working on the user_agent option, so those aren't just idle concerns.]

I've had similar conversations in other PRs over the years, so I definitely wanted to make sure that there was plenty of time/space for healthy discussion here (i.e., I have no intention of merging this until we're all on the same page and feel good about it). Based on past discussions, I imagine that Mike would share your viewpoint (favoring something more maintainable like the T::Struct approach), so I'm glad we discussed this.

I'll push some changes later tonight to have Options subclass T::Struct and incorporate some of the related suggestions, as I agree that those changes make it easier to maintain. Thanks again for reviewing/discussing ❤️

@samford
Copy link
Member Author

samford commented Feb 14, 2025

The latest push reworks Options to subclass T::Struct and makes some tweaks to address typed: strong errors (as I know that's something we'll be working towards eventually). I was having trouble addressing the type errors in #== yesterday but I was able to come up with a solution today, so this now uses typed: strong.

I think this is all that I wanted to take care of in this push, so it should be ready for another round of review.

@samford samford force-pushed the livecheck/options branch 3 times, most recently from 82e2c6d to 01d6c68 Compare February 14, 2025 19:06
@samford
Copy link
Member Author

samford commented Feb 17, 2025

This push:

  • Updates Options#merge to use other.empty? (instead of other.blank?)
  • Adds Options#merge! and updates the @options.merge call in the livecheck url DSL method
  • Adds language to the main Options comment to explain how nil values are used to indicate that a value hasn't been set

Copy link
Member

@MikeMcQuaid MikeMcQuaid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks @samford! A few comments, none blocking.

@samford samford force-pushed the livecheck/options branch 2 times, most recently from af1bf88 to b01ef70 Compare February 24, 2025 18:03
samford and others added 5 commits February 25, 2025 10:56
This adds a `Livecheck::Options` class, which is intended to house
various configuration options that are set in `livecheck` blocks,
conditionally set by livecheck at runtime, etc. The general idea is
that when we add features involving configurations options (e.g., for
livecheck, strategies, curl, etc.), we can make changes to `Options`
without needing to modify parameters for strategy `find_versions`
methods, `Strategy` methods like `page_headers` and `page_content`,
etc. This is something that I've been trying to improve over the years
and `Options` should help to reduce maintenance overhead in this area
while also strengthening type signatures.

`Options` replaces the existing `homebrew_curl` option (which related
strategies pass to `Strategy` methods and on to `curl_args`) and the
new `url_options` (which contains `post_form` or `post_json` values
that are used to make `POST` requests). I recently added `url_options`
as a temporary way of enabling `POST` support without `Options` but
this restores the original `Options`-based implementation.

Along the way, I added a `homebrew_curl` parameter to the `url` DSL
method, allowing us to set an explicit value in `livecheck` blocks.
This is something that we've needed in some cases but I also intend
to replace implicit/inferred `homebrew_curl` usage with explicit
values in `livecheck` blocks once this is available for use. My
intention is to eventually remove the implicit behavior and only rely
on explicit values. That will align with how `homebrew_curl` options
work for other URLs and makes the behavior clear just from looking at
the `livecheck` block.

Lastly, this removes the `unused` rest parameter from `find_versions`
methods. I originally added `unused` as a way of handling parameters
that some `find_versions` methods have but others don't (e.g., `cask`
in `ExtractPlist`), as this allowed us to pass various arguments to
`find_versions` methods without worrying about whether a particular
parameter is available. This isn't an ideal solution and I originally
wanted to handle this situation by only passing expected arguments to
`find_versions` methods but there was a technical issue standing in
the way. I recently found an answer to the issue, so this also
replaces the existing `ExtractPlist` special case with generic logic
that checks the parameters for a strategy's `find_versions` method
and only passes expected arguments.

Replacing the aforementioned `find_versions` parameters with `Options`
ensures that the remaining parameters are fairly consistent across
strategies and any differences are handled by the aforementioned
logic. Outside of `ExtractPlist`, the only other difference is that
some `find_versions` methods have a `provided_content` parameter but
that's currently only used by tests (though it's intended for caching
support in the future). I will be renaming that parameter to `content`
in an upcoming PR and expanding it to the other strategies, which
should make them all consistent outside of `ExtractPlist`.
As suggested, this reworks `Options` to subclass `T::Struct`, which
simplifies the implementation and makes it easier to maintain.

One noteworthy difference in switching to `T::Struct` is that
`#serialize` omits `nil` values but I don't _think_ this should be a
problem for us. In terms of changes, I modified `#merge` to remove a
now-unnecessary `compact` call and updated related tests.

Co-authored-by: Douglas Eichelberger <697964+dduugg@users.noreply.github.com>
This enables `typed: strong` in `Options` and resolves the related
errors.

I annotated the return value of `serialize` in `#to_hash` to resolve
a type error and then updated other methods to use `to_hash` instead.
This approach resolves similar type errors without duplicating the
same `serialize` annotation.

For `#==`, I switched `instance_of?(other.class)` to
`other.is_a?(Options)`, as Sorbet understands that `is_a?` ensures
`other` is an `Options` object but doesn't seem to understand
that `instance_of?` was doing the same thing. The tests continue to
pass with these changes, so hopefully this is fine.
@samford
Copy link
Member Author

samford commented Feb 25, 2025

I rebased this and resolved the conflicts from the recent #demodulize changes. I'm going to go ahead and merge this, so I can create PRs for additional work that's based on this branch. Thanks everyone for all the review and discussion!

@samford samford added this pull request to the merge queue Feb 25, 2025
Merged via the queue into master with commit 6291c24 Feb 25, 2025
39 checks passed
@samford samford deleted the livecheck/options branch February 25, 2025 17:04
@EricFromCanada
Copy link
Member

Nice work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants