Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add clarifying comments around TUF caching options #173

Merged
merged 1 commit into from
May 22, 2024

Conversation

haydentherapper
Copy link
Contributor

@haydentherapper haydentherapper commented May 15, 2024

This changes the behavior of the TUF client to use the cached metadata if it's unexpired on first initialization. Previously, if no config which included the last update timestamp was found, it would force a reinitialization. Now, the first init with cached metadata will be offline.

Fixes #162

Summary

Release Note

Documentation

@haydentherapper
Copy link
Contributor Author

cc @ramonpetgrave64

@haydentherapper
Copy link
Contributor Author

@kommendorkapten, lemme know your thoughts on this. I agree with your comment that a good solution for offline verification is to pass a trust root file out of band, which Ramon and I discussed, but I think the behavior in this PR is a nice default so that users do not need to be aware of the timestamp config.

// Config will not exist during the first initialization
// of unexpired metadata
cfg = &Config{
LastTimestamp: time.Now(),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't tried yet, but does this mean that the client will assume that my local cache is already fresh, without first checking ~/.sigstore/root/tuf-repo-cdn.sigstore.dev.json?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the cache is already fresh, meaning c.up.Refresh() succeeded, and no configuration file exists, then no additional refresh will occur. This addition persists the current time so that on the next metadata load, the config will be checked.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By fresh, I meant that my local copy up-to-date before creating the client.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

c.up.Refresh when UnsafeLocalRefresh is set does not make any network calls, it simply loads the metadata from disk. It will err out if the timestamp is out of date, that we cannot and should not bypass.

Does that answer your question?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those options seem like the right thing I could use. Would you make those publicly accessible?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should expose UnsafeLocalRefresh, but instead make sure we have caching options that match expected use cases. Which I think we do have currently, besides what's in this PR.

The more I look at this, I think that ForceCache is what most users should be using, and that CacheValidity doesn't have a clear use case.

@kommendorkapten
Copy link
Member

I don't see any real issues with this PR (but I would prefer to only write the timestamp file upon sigstore client initiated tuf refresh), but I do think that the cache is being misused if there is a concern about one egress TUF update on the first invocation.

The cache is not about prohibiting network calls to be made, as the timestamp can expire at any time. It's only about avoiding excessive tuf updates if a program is called within a "tight loop".

Consider the scenario where the tuf cache is populated by some process at time X_1, timestamp expires at X_2 and the the sigstore client is started at X_3. This will force a full TUF update as timetsamp is expired, so trying to use the cache to avoid network calls is doomed to fail.

@ramonpetgrave64
Copy link

but I do think that the cache is being misused if there is a concern about one egress TUF update on the first invocation.

That's understandable. For the use-cases I'm imagining, I would be using a separate process to provide fresh caches.

@haydentherapper
Copy link
Contributor Author

The cache is not about prohibiting network calls to be made, as the timestamp can expire at any time. It's only about avoiding excessive tuf updates if a program is called within a "tight loop".

If I set cache validity, to me, I want to continue to use TUF metadata up until the cache is expired as per the validity timestamp. ForceCache is the extreme of that, which is "trust the metadata until it's expired". Currently, if I set that flag without a locally saved timestamp, then it works as intended. If I set CacheValidity, it doesn't, as it will fetch TUF metadata on first client initialization.

So either:

  • We should go through with this PR to reconcile the above, or
  • Update documentation on what CacheValidity's purpose is

@@ -105,9 +105,14 @@ func (c *Client) loadMetadata() error {
} else if c.opts.CacheValidity > 0 {
cfg, err := LoadConfig(c.configPath())

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this with a fresh local cache (in a previous invocation, I let the tuf client do and update, like normally). But in the next invocation, LoadConfig does not error, like I think your comment is suggesting. Printed, It actually has the value &{0001-01-01 00:00:00 +0000 UTC}.

@kommendorkapten
Copy link
Member

If I set cache validity, to me, I want to continue to use TUF metadata up until the cache is expired as per the validity timestamp

I agree, and I argue it does except for the first time as we have no idea on how old the cache actually is. The concern I have here, is that if there is an existing TUF cache, we don't know how old it is. We read it, and if it's valid sets the lastTimestamp as if it was updated at this point in time, which may not be the case. That's why it current performs a TUF refresh before writing this timestamp. Looking at the extreme scenario here (metadata cache is written a long time before the sigstore tuf client is invoked), if an existing cache is present, we may use the cached metadata that is expired given the configured cache validity (assuming metadata is not expired) with the proposed change.

Clarifying this in the documentation is what I would prefer.

@steiza
Copy link
Member

steiza commented May 20, 2024

There is a lot going on in this thread! I agree it isn't exactly clear who should own what, and how to do things like offline verification.

What we decided for the gh CLI is that if you specify a trusted_root.json file we won't instantiate TUF clients and thus won't do network requests to support offline verification.

That pull request also sets the TUF cache to 1 day to reduce (but not eliminate) TUF client network calls.

@haydentherapper
Copy link
Contributor Author

haydentherapper commented May 21, 2024

Thanks all for the comments. I think the behavior of the initial network call might be a little unexpected still, but I recognize there isn't really a perfect way to do this - either we just implicitly cache like in this PR or we fetch the latest state.

Tomorrow, I'll update this PR to be comment updates that specify a) the purpose of CacheValidity, b) when to use it vs ForceCache, c) the preferred way to do offline verification by providing a trusted_root file.

@kommendorkapten
Copy link
Member

thanks @haydentherapper, especially around using an trusted_root file for true offline mode, as that's something that has been discussed in many places/repos and we always come to the conclusion that using TUF in those scenarios is not the best choice. Having a single place with documentation to refer to would be great!
And clarify that caching is only "best effort" and that network calls are inevitable, and if those fails (due to air-gapped environment) the program will crash.

Clarifies these are not sufficient for airgapped environments. Also
leaves in a test to demonstrate the behavior of CacheValidity.

Signed-off-by: Hayden Blauzvern <hblauzvern@google.com>
@haydentherapper
Copy link
Contributor Author

@kommendorkapten Updated the comments, and I left in the test I had added since I think it's useful to demonstrate that CacheValidity will still update a local cache without a timestamp.

@haydentherapper haydentherapper changed the title Use cached metadata when unexpired on first initialization Add clarifying comments purpose of TUF caching options May 21, 2024
@haydentherapper haydentherapper changed the title Add clarifying comments purpose of TUF caching options Add clarifying comments around TUF caching options May 21, 2024
Copy link
Member

@kommendorkapten kommendorkapten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@haydentherapper haydentherapper merged commit 19d35d4 into sigstore:main May 22, 2024
10 checks passed
@haydentherapper haydentherapper deleted the fix-ts branch May 22, 2024 05:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug: fix WithCacheValidity
4 participants