-
Notifications
You must be signed in to change notification settings - Fork 547
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: fix fetching updated targets from TUF root #1921
Conversation
Codecov Report
@@ Coverage Diff @@
## main #1921 +/- ##
==========================================
- Coverage 34.01% 34.00% -0.02%
==========================================
Files 153 153
Lines 9977 9981 +4
==========================================
Hits 3394 3394
- Misses 6202 6208 +6
+ Partials 381 379 -2
Continue to review full report at Codecov.
|
447e794
to
39cf2af
Compare
Signed-off-by: Asra Ali <asraa@google.com> add comment Signed-off-by: Asra Ali <asraa@google.com> update Signed-off-by: Asra Ali <asraa@google.com> update Signed-off-by: Asra Ali <asraa@google.com> possible fix windows Signed-off-by: Asra Ali <asraa@google.com> lint Signed-off-by: Asra Ali <asraa@google.com> fix windows maybe Signed-off-by: Asra Ali <asraa@google.com> fix close Signed-off-by: Asra Ali <asraa@google.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sure you saw this, but test failure/lint failure looks related.
Also, your PR description is great---can you copy a bunch of it inline?
Release note is a little inappropriate for end users: I think "TUF client' is an implementation detail.
pkg/cosign/tuf/client.go
Outdated
Set(string, []byte) error | ||
type localInitFunc func() (client.LocalStore, error) | ||
|
||
type embeddedLocalStore struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming nitpick: embeddedLocalStore
makes me think that it's only an embedded local store.
Above all applies to embeddedWrapper
too.
} | ||
|
||
type memoryCache struct { | ||
targets map[string][]byte | ||
var GetEmbedded = func() fs.FS { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we call GetEmbedded each time, rather than locking it in at init time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And then if we're going to do that, we may as well make it a parameter of the init method rather than something that needs to be mocked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmmmm I don't want to lock it in because tests need to modify that on an individual basis
Otherwise I would need to pre-hook test runs, I think. But correct me if I'm wrong about that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Err, doesn't each test instantiate its own TUF client (via NewFromEnv
)? You should be able to move the GetEmbedded
call into initializeTUF
or wrapEmbedded
/wrapEmbeddedLocal
just the first time. I also don't think it's that bad to have a separate constructor that doesn't access any environment variables that you could explicitly pass the fake FS into for tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh! I misunderstood what you meant by "init time" -- I thought that meant Golang "init" which happens once globally.
I like the initialize
idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd rather pass this in to the constructor explicitly
pkg/cosign/tuf/client.go
Outdated
return nil, errors.New("fs.ReadFileFS unimplemented for embedded repo") | ||
} | ||
for _, entry := range entries { | ||
if entry.Type().IsRegular() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does IsRegular
matter? Is this just skipping
I might prefer
for _, entry := range entries {
if !entry.Type().IsRegular() {
continue // skip bad files <-- but be more descriptive
}
// ...
}
as it emphasizes the skipping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because this is the metadata store, I need to avoid copying over the target directory -- that would have ModeDir
set. Also just in case there's executables or something funky. I like your skipping paradigm!
Thanks @znewman01 for the really helpful comments! Addressed most of them I think, closing out trivial ones -- still trying to figure out the failures with the policy... |
6794e51
to
dccbdeb
Compare
Signed-off-by: Asra Ali <asraa@google.com> update fix Signed-off-by: Asra Ali <asraa@google.com> update and add some debug Signed-off-by: Asra Ali <asraa@google.com> add debug Signed-off-by: Asra Ali <asraa@google.com> no cache Signed-off-by: Asra Ali <asraa@google.com> remove debug Signed-off-by: Asra Ali <asraa@google.com>
b16f38d
to
9b337a5
Compare
@@ -44,6 +46,10 @@ const ( | |||
SigstoreNoCache = "SIGSTORE_NO_CACHE" | |||
) | |||
|
|||
var GetRemoteRoot = func() string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be publicly exported, or can it be a private variable in the package?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kind of want to make this a public global flag eventually! That would make it easy to point to preprod/staging environment. I can make it a follow up issue if you think so
As long as it's a func for now I'm fine, because I need to configure it in the tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adding this to the tracking issue explicitly #1935
pkg/cosign/tuf/client.go
Outdated
func (m *memoryCache) Set(p string, b []byte) error { | ||
if m.targets == nil { | ||
m.targets = map[string][]byte{} | ||
func (e embeddedLocalStore) GetMeta() (map[string]json.RawMessage, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: How do we use the embedded metadata (as in, what's shipped with Cosign)? Isn't it only to start the chain of TUF roots?
Would it be reasonable to only store root.json, and remove the rest? Or is go-tuf expecting a complete repository? (I thought there was support now for initializing only from a root)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just wondering if we can simplify the logic around embedded wrapping. We have this tri-state - Either it's in in the embedded, or it's in either memory or on disk. If we only access a file or two from embedded, just for initialization, we could potentially separate embedded from LocalStore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
go-tuf is not expecting a complete repository, but we don't want to touch the remote unless we need an update so we want the complete repository.
it's to act as a starting local state, i guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The timestamp will almost always be expired, so won't we always need to hit the remote?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
almost always yeah, fwiw i'm willing to reconsider this in general and simplify our embedded repo state to only contain the root.json because that would align a lot more closely with normal go-tuf flows and avoid all this complex overlayed logic on whether or not we need to hit the remote...
but yeah the original idea was that if the state of the embedded repo was good we wouldn't fetch. at that point the timestamp had a fairly long expiration -- 3 weeks or so
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we're likely going to be moving to more frequent updates, I'm not sure the embedded non-root metadata adds much value. Do you think it would simply the logic here? Would it remove the embeddedWrappers completely?
If you want to save a bit of remote fetching, is it possible to initialize just the LocalStore containing the targets? Then what's embedded what just be the root and targets. When we fetch the new metadata and compare target files hashes, go-tuf would find that all are locally downloaded. But this is only saving a single initial fetch, so that might not be worth it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried this solution! Only the root is in the embedded, but we also embed targets and use them ONLY IF their hashes are equal to what we see in the targets.json
.
Signed-off-by: Asra Ali <asraa@google.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think once we backport the test fixes from #1932 this is pretty close
} | ||
|
||
type memoryCache struct { | ||
targets map[string][]byte | ||
var GetEmbedded = func() fs.FS { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd rather pass this in to the constructor explicitly
// Unfortunately go:embed appears to somehow replace our line endings on windows, we need to switch them back. | ||
// It should theoretically be safe to do this everywhere - but the files only seem to get mutated on Windows so | ||
// let's only change them back there. | ||
if runtime.GOOS == "windows" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What hapepned to this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added!
Signed-off-by: Ville Aikas <vaikas@chainguard.dev>
Signed-off-by: Ville Aikas <vaikas@chainguard.dev>
Signed-off-by: Ville Aikas <vaikas@chainguard.dev>
Signed-off-by: Ville Aikas <vaikas@chainguard.dev>
Signed-off-by: Ville Aikas <vaikas@chainguard.dev>
Signed-off-by: Ville Aikas <vaikas@chainguard.dev>
Signed-off-by: Asra Ali <asraa@google.com>
Signed-off-by: Asra Ali <asraa@google.com>
done, but also added this to the tracking issue for cleanup |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great!
@@ -44,12 +46,21 @@ const ( | |||
SigstoreNoCache = "SIGSTORE_NO_CACHE" | |||
) | |||
|
|||
// Global in-memory targets to avoid re-downloading when there is no local cache. | |||
// TODO: Consider using this map even when local caching to avoid reading from disk |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i like this idea as a future improvement! on initialize, just load in all targets into memory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Explicitly adding in #1935
return nil, err | ||
} | ||
t.targets = newFileImpl() | ||
t.local, err = newLocalStore() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify - On the first initialization, newLocalStore() will set up a local TUF DB, but without any metadata or targets in it, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, the first place where the local is populated is with the InitLocal
method, and that sets the trusted root metadata.
cosign/pkg/cosign/tuf/client.go
Lines 264 to 275 in e7bcb69
if root == nil { | |
root, err = getRoot(trustedMeta, t.embedded) | |
if err != nil { | |
t.Close() | |
return nil, fmt.Errorf("getting trusted root: %w", err) | |
} | |
} | |
if err := t.client.InitLocal(root); err != nil { | |
t.Close() | |
return nil, fmt.Errorf("unable to initialize client, local cache may be corrupt: %w", err) | |
} |
go-tuf client calls populate the timestamp/snapshot/targets on Update()
} | ||
if err := os.WriteFile(cachedRemote(rootCacheDir()), b, 0600); err != nil { | ||
return fmt.Errorf("storing remote: %w", err) | ||
if err := util.TargetFileMetaEqual(localMeta, validMeta); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To also double check - This would catch if the embedded targets are modified for some reason, correct? So to verify the e2e flow:
- Cosign requests the target metadata (targets.json) from the remote
- The metadata's sig is verified using the target role from the root (which has already been updated and verified)
- For each target in targets.json, we check if the target needs to be downloaded or is present in the embedded/local cache
- For each target, we compare its hash to the hash from the metadata (this step)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! Mostly correct, except slight details in Target retrieval/download from
cosign/pkg/cosign/tuf/client.go
Line 464 in e7bcb69
func maybeDownloadRemoteTarget(name string, meta data.TargetFileMeta, t *TUF) error { |
- Cosign requests the target metadata (targets.json) from the remote
- The metadata's sig is verified using the target role from the root (which has already been updated and verified)
- For each target in targets.json, we check (1) if we already have the target in our local cache, if it's hash is valid, do nothing. (2) if the target is present in the embedded/local cache, we compare its hash from the metadata and copy it to the cache for ease of retrieval or (3) it must need to be downloaded (the hash is verified in the go-tuf call) and we copy it to the cache
- On get, we retrieve the target compare its hash to the hash from the metadata (this step)
dest := targetDestination{buf: &w} | ||
if err := t.client.Download(name, &dest); err != nil { | ||
return fmt.Errorf("downloading target: %w", err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to also check isValidTarget
, or will Download
handle checking the target hash? (If Download doesn't, it should!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, it does check hash handling!
https://github.com/theupdateframework/go-tuf/blob/3b26aedfe985198bc88a9dda7525938c575ca046/client/client.go#L928-L934
I also put hash checking on the Get
/retrieval side, just for robustness:
cosign/pkg/cosign/tuf/client.go
Lines 343 to 354 in e7bcb69
validMeta, err := t.client.Target(name) | |
if err != nil { | |
return nil, fmt.Errorf("error verifying local metadata; local cache may be corrupt: %w", err) | |
} | |
targetBytes, err := t.targets.Get(name) | |
if err != nil { | |
return nil, err | |
} | |
if !isValidTarget(targetBytes, validMeta) { | |
return nil, fmt.Errorf("cache contains invalid target; local cache may be corrupt") | |
} |
Yay! Thanks everyone! |
@dlorenc please can I have a write access review :) |
Signed-off-by: Asra Ali asraa@google.com
Summary
This change refactors the cosign TUF client and hopefully aims to simplify the logic behind embedded TUF repository and targets, and the writeable on-disk/in-memory repository and targets. Roughly, I structured this so that the cosign TUF client contains (1) A
client.LocalStore
to hold TUF repository metadata updates and (2) AtargetImpl
to hold downloaded and cached target files.Previously, the two were "out of sync" -- when starting with an embedded workflow, we would create an embedded root repository and an embedded targetImpl. However, the embedded targetImpl ONLY retrieved embedded targets, not the updated ones!
cosign/pkg/cosign/tuf/client.go
Line 548 in 5f09c42
Embedded workflows never used their updated targets, despite writing them to the underlying storage, and failed to retrieve new targets referenced by updated metadata.
Conceptually now, there are 4 main objects in the TUF client:
memoryCache
targetImpl: A map that stores target files by name.diskCache
targetImpl: Stores targets on disk.embeddedWrapper
targetImpl: This wraps the underlying (memory, disk) cache. By default itGet
s embedded targets. If any targets were downloaded andSet
, thenGet
transfers to retrieving from the underlying cache.embeddedLocalStore
that wraps either aMemoryLocalStore
orFileLocalStore
. Similar to above, by default it gets embedded repo metadata, until any new metadata needed to be downloaded and set. Then anyGetMeta
operations get the cached metadata.Other:
brokenv3
's GCS bucket and it successfully found the new targetfulcio_interemediate_v1.crt.pem
If this design looks good, I will continue to add testing and clean-up test code. It's a pain to manually write out TUF updates, so I'd like to unify those functions and also add testing for consistent snapshotting (note brokenv3 enabled that so I'm 90% sure that works with this fix).
Ticket Link
Fixes #1899
Release Note
cc @haydentherapper @dlorenc @znewman01