Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect prefixes when indexing checkpoints that have no tags #39

Closed
bencottier opened this issue Nov 18, 2021 · 5 comments · Fixed by #40
Closed

Incorrect prefixes when indexing checkpoints that have no tags #39

bencottier opened this issue Nov 18, 2021 · 5 comments · Fixed by #40
Assignees
Labels
bug Something isn't working

Comments

@bencottier
Copy link
Contributor

If I save a checkpoint called "Forecasters.predicted" in example/path without tags, the filepath is example/path/Forecasters/predicted.jlso.

This IndexEntry constructor assumes that prefixes are those paths segments which do not contain "=", i.e. are not tags. However, when there are no tags in the first place, all path segments will be included in this operation, because first_tag_ind defaults to 1:

first_tag_ind = something(findfirst(contains("="), filepath.segments), 1)
segments = filepath.segments[first_tag_ind:end-1]
prefixes = filter(!contains("="), segments)

MWE:

julia> Checkpoints.register("Forecasters", ["predicted"])

julia> Checkpoints.config("Forecasters.predicted", "example/path")

julia> checkpoint("Forecasters.predicted", [1])
13361

julia> IndexEntry("example/path/Forecasters/predicted.jlso").prefixes
("example", "path", "Forecasters")
@bencottier
Copy link
Contributor Author

bencottier commented Nov 18, 2021

Possible solutions:

  • Document the assumption of at least 1 tag
  • Add base_path as an argument to IndexEntry so it doesn't have to guess where the base path ends
    • And maybe: assume that the path given to index_checkpoint_files is the longest base path (i.e. checkpoint stuff starts in the first level of subfolders)
  • Determine the base path as the longest common prefix of all checkpoint paths.

@bencottier bencottier added the bug Something isn't working label Nov 18, 2021
@oxinabox
Copy link
Member

Is this only a problem when IndexEntry is used directly?
If using index_checkpoint_files(dir) (or index_files) then I think it works fine?
you pass in the "base_dir" already.
You would call for example: index_checkpoint_files(p"example/path/")
and then any directories under that is either for a tag or for the prefix.

If it does work correctly for index_checkpoint_files then I think we should document that including directories that are not either part of tags or prefix in the call to the IndexEntry constructor is not defined.
I don't think we should provide a base_path argument, since it would be exactly as verbose as the user using relpath themselves

@bencottier
Copy link
Contributor Author

It is still incorrect for index_checkpoint_files and index_files, because each filepath that it passes to IndexEntry is an absolute path.

@bencottier
Copy link
Contributor Author

But maybe I see where you are going - a solution is index_checkpoint_files computes a relpath and passes that to IndexEntry. In that case I think we should document this:

assume that the path given to index_checkpoint_files is the longest base path (i.e. checkpoint stuff starts in the first level of subfolders)

@oxinabox
Copy link
Member

Ah i see, not a abolute path, but a path that does include the directory that walk was done from

julia> walkpath(p"test/util/") |> collect
3-element Vector{PosixPath}:
 p"test/util/Manifest.toml"
 p"test/util/Project.toml"
 p"test/util/generate_test_data.jl"

So we could strip that out ourself:

julia> [relpath(x, p"test/util") for x in walkpath(p"test/util/")]

3-element Vector{PosixPath}:
 p"Manifest.toml"
 p"Project.toml"
 p"generate_test_data.jl"

But doing that would mean we wouldn't have the path for purposes of actually getting the file.

So maybe we do need a base_dir argument for IndexEntry afterall

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants