Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ngclient: experiment with separating local target caching from updater #1412

Closed
jku opened this issue May 21, 2021 · 4 comments
Closed

ngclient: experiment with separating local target caching from updater #1412

jku opened this issue May 21, 2021 · 4 comments
Labels
backlog Issues to address with priority for current development goals ngclient

Comments

@jku
Copy link
Member

jku commented May 21, 2021

This comes from #1317

I'm seriously considering if Updater on its own should not support local target caching at all -- it's not part of the spec, it has these sharp corners WRT urls being used as file paths, and I think it could be handled as well by an external component like a derived class (that we may or may not want to provide).

People, me included, keep getting confused about what targetpaths are: it should be clear that they are just url-fragments that serve two purposes:

  • work as identifiers for the target that can also be used to resolve delegations
  • form the final download URL for the target

but currently the client implementation uses targetpath also as a local filesystem path. Spec does not demand using the path (it does sort of say files should be saved with the same filename that was used in URL however -- this is likely spec overreach). Separating the caching so that a spec compliant updater can be implemented without all the filesystem path issues would make it clearer what targetpaths are.


Non-caching updater could look like

class Updater:
    def __init__(...)
    def refresh()

    def lookup(name:str ) -> TargetFile
    # here caller can use TargetFile.hashes_match(file_obj) if they want to check a local file
    def download(target: TargetFile, local_path: str) # write the file to given path

The CachingUpdater would add:

class CacheUpdater(Updater)
    def __init__(..., cache_dir:str)
    def get_cached(target: TargetFile) -> Optional[str] # returns local path if it was cached
    def download(target: TargetFile) -> str # writes the file to cache (if needed), returns local path
    
    # possibly should still offer a way to write to given path like Updater.download() (and still cache?)
@jku jku added the ngclient label May 21, 2021
@sechkova sechkova added the backlog Issues to address with priority for current development goals label May 26, 2021
@jku
Copy link
Member Author

jku commented Aug 26, 2021

I'll just document this other idea wrt client cache here:

Using unrestricted external input as filenames is a bad idea (even if the input is from signed metadata), but the updater does this with targetpaths as local filepaths: Luckily there is nothing requiring client target cache to use targetpaths as local filesystem paths:

  • target files could be saved in a flat directory with hash(targetpath) as the file name
  • files would be slightly harder for humans to find but I'm not sure that is important: the cache is an implementation detail of the caching component
  • a potential downside is that some applications might not be able to use the file directly from CacheUpdater cache (because it's not named as they expect) -- but then question is, should the file be copied to somewhere else first anyway?

@jku
Copy link
Member Author

jku commented Aug 26, 2021

One feature we have not carried over from old updater is remove_obsolete_targets(). I think to provide similar functionality we'd need some very different implementation... I'm not convinced we should implement it but there's a few potential solutions I can think of:

  • Cache could keep note of the targetpaths of the targetfiles in the cache: so it could, when needed, check if all targetpaths in cache are up-to-date or not. This is where not using targetpaths as local paths is not great -- but as said, I don't think it's safe to do so
  • Alternatively, CacheUpdater could have a clean() method to load all of the delegated targets in local metadata (if they are still in delegation tree), updating them to current versions, and then checking each TargetFile vs files in cache
  • Alernatively, CacheUpdater could, while it's loading any delegated targets for other reasons, do the cache-obsolete check on all Targetfiles in that delegated Targets

All that said, the functionality sounds very application specific and I doubt we should spend time on this for ngclient itself

@jku
Copy link
Member Author

jku commented Sep 13, 2021

Ok, I've played with this quite a bit and have changed my mind a little:

  • Simplifying the Updater API and the targets local file path logic (ngclient: downloading targets fails if targetpath includes subdirectories #1571) leads to a situation where the "cache" code is really simple: there's no reason to move it out of the Updater
  • Since we are changing API wrt legacy updater, I think we should maybe fix the small annoyances as well: porting old code is typically easy as Updater isn't used in many places in client code, and our code is still structured in a similar way.

Possibly it makes sense to close this issue and continue a in a new one as the title and description are not descriptive but I will lay out my current experiment here:

  • Don't use target path directories as local path: this is unsafe and not very useful
  • shorten the too long get_one_valid_targetinfo() name
  • make the local and remote target lookups have (almost) identical signature (this replaces old updated_targets() and download_target())
  • make refresh() call optional and implicit

This leads to minimal complete example:

    updater = Updater(local_metadata, metadata_url, targets_url)
    info = updater.get_targetinfo("file.txt")
    path = updater.get_local_target(info, "~/downloads/")
    if path is None:
        path = updater.download_target(info, "~/downloads/")

I'm not sure if the method names are the best but I think the arguments and return values make a lot of sense, and make the API easier to understand. One change I might still make is the "negative result" in get_local_target(): if it raised instead of returning None, the local and remote target getters would have practically identical signatures...

@sechkova
Copy link
Contributor

Closing this as the work continues in #1580.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backlog Issues to address with priority for current development goals ngclient
Projects
None yet
Development

No branches or pull requests

2 participants