-
Notifications
You must be signed in to change notification settings - Fork 104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Fixing extra download for filesystem's localization #254
base: main
Are you sure you want to change the base?
Conversation
src/filesystem/api.h
Outdated
/// Create an object representing a local copy of a path. | ||
/// \param path The path of the directory or file. | ||
/// \param fetch_subdir If specified, will only download provided | ||
/// sub directory, otherwise all subdirectories will be downloaded. | ||
/// Does not affect files individual files, located under `path`. | ||
/// \param localized Returns the LocalizedPath object | ||
/// representing the local copy of the path. | ||
/// \return Error status | ||
Status LocalizePath( | ||
const std::string& path, std::shared_ptr<LocalizedPath>* localized); | ||
const std::string& path, const std::string& fetch_subdir, | ||
std::shared_ptr<LocalizedPath>* localized); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the interface is insufficiently generic for functions at the level of LocalizePath()
, because LocalizePath()
should be available to any callers who want to localize any path on the cloud, but fetch_subdir
imposes a limitation on the caller to either fetch all sub-directories within the path or only the sub-directory specified. The caller may not "choose" which sub-directories to fetch or not fetch. let alone choosing sub-directories within a sub-directory.
I think there are many different generic design choices. For example:
- Use
bool recursive
instead ofstring fetch_subdir
. The caller can choose not fetching any sub-directories withrecursive = false
or all sub-directories withrecursive = true
. Then, the problem can be accomplished by first fetching the given model path withrecursive = false
(i.e./repo/model
) and then the model path with version withrecursive = true
(i.e./repo/model/1
). However, this will require theLocalizedPath
object to be able to join the model path and model version path localization objects into one localization object. - Allow the caller to specify a list of sub-paths to localize (or not localize). The "complicated" structure of sub-paths allows for fetching different combinations of sub-directories in a single call, which eliminates the need for joining model path and model version path localization objects, but the "complicated" structure is less intuitive for the caller to use and may not be easier to implement than the first design choice.
However, the interface accomplishes exactly what we want to solve at this time. I will defer to @GuanLuo and @nnshah1 to decide.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, this will require the LocalizedPath object to be able to join the model path and model version path localization objects into one localization object.
Do we? If I understand correctly, versioned path is set here:
Lines 113 to 114 in a23786c
const auto version_path = | |
JoinPath({localized_model_path, std::to_string(version)}); |
so instead of
const auto version_path =
JoinPath({localized_model_path, std::to_string(version)});
we would do something like:
std::shared_ptr<LocalizedPath> version_path;
RETURN_IF_ERROR(LocalizePath(path/to/version, recursive=false, &version_path));
or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am thinking something like:
std::string model_path = "/repo/model_dir";
std::string model_version_path = "/repo/model_dir/1";
std::shared_ptr<LocalizedPath> model_localized;
std::shared_ptr<LocalizedPath> model_version_localized;
LocalizePath(model_path, recursive=false, &model_localized);
LocalizePath(model_version_path, recursive=true, &model_version_localized);
model_localized->Include(model_version_localized);
The idea behind LocalizedPath::Include()
is to join
model_tmp
|
`--- config.pbtxt
and
model_version_tmp
|
`--- model.py
to
model_tmp
|
`--- config.pbtxt
|
`--- 1
|
`--- model.py
on the local disk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually am missing something. Even though I like option 1 suggestion, localization in this case will happen to another directory. Meaning, at current stage every call to LocalizePath
creates a new temporary directory. Thus if we separately will try to load model and its version, they will be copied in 2 different directories. So unless we want to change that, option 1 is not the best option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be ok to only implement the base case above, and throw an error if the caller is trying to join/include anything more complicated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on offline discussion, with @kthui, we can add one more parameter location
to control where to localize provided path, i.e. something like
LocalizePath(model_version_path, recursive-true/false, location=<string>, &model_version_localized);
Then we can control where version directory will be localized , and if location
is empty, we create a new, temporary directory.
This will also help with the follow up ticket.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meaning, at current stage every call to LocalizePath creates a new temporary directory. Thus if we separately will try to load model and its version, they will be copied in 2 different directories. So unless we want to change that, option 1 is not the best option.
I am thinking on the same direction as LocalizedPath::Include()
that if we want to have a generic way to localizing directories (with customization on fetching subset of contents / recurrsiveness), we want to make LocalizedPath
a "filler" class: as on calling fs::LocalizePath
, it returns an fully localized object (current) or empty object (with local root path assigned). Then if user want to customize the localization, they can call:
LocalizedPath::Include("sub/path/content", recursive);
Although I think it then the process is convoluted that there is a iterative calls of GetDirectorySubdirs/Files
and LocalizedPath::Include
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this is really an internal API (unless we want to expose file system API in the future) so we are free to modify the interface as frequent as we want and don't need to worry about being generic. If localizing model directory is somewhat specific, we may just introduce a LocalizeModelDirectory
method to fit the specific need.
Because when you look at what to be localized for a model version, it is actually "everything in the directory except for the unwanted version subdirectories (integer-named directories)", that is tricky to express it with some generic language.
@kthui, @GuanLuo and @nnshah1. It is ready for re-review. I didn't implement
properly considered. As of now, I think specifying Change in Regarding environmental variable. Current name is Finally, current requirements for mount directory:
Current look: bucket is mounted to
|
{ | ||
std::shared_ptr<FileSystem> fs; | ||
RETURN_IF_ERROR(fsm_.GetFileSystem(path, fs)); | ||
return fs->LocalizePath(path, localized); | ||
return fs->LocalizePath(path, recursive, "", localized); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this (L565-L567) can potentially be shortened to
retrurn LocalizePath(path, recursive, "", localized)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comments
@@ -363,7 +364,8 @@ GCSFileSystem::ReadTextFile(const std::string& path, std::string* contents) | |||
|
|||
Status | |||
GCSFileSystem::LocalizePath( | |||
const std::string& path, std::shared_ptr<LocalizedPath>* localized) | |||
const std::string& path, const bool recursive, const std::string& mount_dir, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you going to implement the new API in other cloud file system in this PR as well? If not, should return error if !mount_dir.empty()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I wanted to first get an agreement on the implementation for s3. I'll populate it to GCS and AS in the next commit for easier review
// Only allow the error due to parent directory does not exist | ||
// if 'recursive' is requested | ||
// Return success if directory already exists | ||
if (errno == EEXIST) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it be a function argument on whether file exist
is allowed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, I'll add it
src/filesystem/api.h
Outdated
/// \param path The path of the directory or file. | ||
/// \param recursive If true, will fetch all sub-directories in | ||
/// the provided path. | ||
/// \param mount_dir If specified, will use provided local directory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean "If not empty" rather than "If specified"? I do not think the caller can skip this parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll clarify, thank you for suggestion
LocalizePath
LocalizePath
src/filesystem/implementations/as.h
Outdated
if (recursive) { | ||
for (const auto& directory_item : blob_prefixes) { | ||
const auto& local_path = JoinPath({dest, BaseName(directory_item)}); | ||
int status = mkdir( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't the FileSystem
wrapped API be used here as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right, for Windows. good catch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
tmp_folder = mount_dir.empty() ? std::string(env_mount_dir) : mount_dir; | ||
tmp_folder = | ||
JoinPath({tmp_folder, path.substr(path.find_last_of('/') + 1)}); | ||
RETURN_IF_ERROR(triton::core::MakeDirectory( | ||
tmp_folder, true /*recursive*/, true /*allow_dir_exist*/)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@oandreeva-nv just checking my understanding, I see your comment saying:
Finally, current requirements for mount directory:
- Should exist before
But in the code I'm seeing calls to MakeDirectory()
, which I'm inferring would create the directory if it doesn't exist.
So to clarify, is the mount directory required to exist before server startup? Or will Triton create it on demand, similar to tmp folder workflow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will clarify this in a follow-up commit. This PR would need more de-bugging and testing
const char* env_mount_dir = std::getenv("TRITON_AZURE_MOUNT_DIRECTORY"); | ||
std::string tmp_folder; | ||
if (mount_dir.empty() && env_mount_dir == nullptr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there's already a follow-up ticket, I'd like to see use of some helper function like this one that safely checks if the env var is set or is nullptr, and just returns a std::string
. This helper could probably just take an env var name as an argument to condense the usage.
Currently this if/else is OK, but if the conditions were to be tweaked in the future, there is some risk for std::string(env_mount_dir)
when it is a nullptr
.
LocalizePath
To avoid unnecessary copy during localization, I added
fetch_subdir
parameter toLocalizePath
: If specified, will only download provided sub directory, otherwise all subdirectories will be downloaded. Does not affect files individual files, located underpath
.[WIP] tag due to the following. When we agree on the proposed solution, I will extend it to GCS and AS.
What was seen before:
With this fix: