[WIP] Fixing extra download for filesystem's localization #254

Draft
wants to merge 8 commits into
base: main
7 changes: 4 additions & 3 deletions src/backend_model.cc
@@ -75,10 +75,11 @@ TritonModel::Create(
   }

   // Localize the content of the model repository corresponding to
-  // 'model_path'. This model holds a handle to the localized content
-  // so that it persists as long as the model is loaded.
+  // 'model_path' and model's version. This model holds a handle to
+  // the localized content so that it persists as long as the model is loaded.
   std::shared_ptr<LocalizedPath> localized_model_dir;
-  RETURN_IF_ERROR(LocalizePath(model_path, &localized_model_dir));
+  RETURN_IF_ERROR(
+      LocalizePath(model_path, std::to_string(version), &localized_model_dir));

   // Localize paths in backend model config
   // [FIXME] Remove once a more permanent solution is implemented (DLIS-4211)
6 changes: 4 additions & 2 deletions src/filesystem/api.cc
@@ -558,11 +558,13 @@ ReadTextProto(const std::string& path, google::protobuf::Message* msg)
 }

 Status
-LocalizePath(const std::string& path, std::shared_ptr<LocalizedPath>* localized)
+LocalizePath(
+    const std::string& path, const std::string& fetch_subdir,
+    std::shared_ptr<LocalizedPath>* localized)
 {
   std::shared_ptr<FileSystem> fs;
   RETURN_IF_ERROR(fsm_.GetFileSystem(path, fs));
-  return fs->LocalizePath(path, localized);
+  return fs->LocalizePath(path, fetch_subdir, localized);
 }

 Status
6 changes: 5 additions & 1 deletion src/filesystem/api.h
@@ -145,11 +145,15 @@ Status ReadTextFile(const std::string& path, std::string* contents);

 /// Create an object representing a local copy of a path.
 /// \param path The path of the directory or file.
+/// \param fetch_subdir If specified, only the provided sub-directory
+/// is downloaded; otherwise all sub-directories are downloaded.
+/// Does not affect individual files located under `path`.
 /// \param localized Returns the LocalizedPath object
 /// representing the local copy of the path.
 /// \return Error status
 Status LocalizePath(
-    const std::string& path, std::shared_ptr<LocalizedPath>* localized);
+    const std::string& path, const std::string& fetch_subdir,
+    std::shared_ptr<LocalizedPath>* localized);

Contributor

I think the interface is not generic enough for a function at the level of LocalizePath(). LocalizePath() should be usable by any caller that wants to localize any path on the cloud, but fetch_subdir limits the caller to fetching either all sub-directories within the path or only the one sub-directory specified. The caller cannot choose which sub-directories to fetch or skip, let alone choose sub-directories within a sub-directory.

There are several more generic designs. For example:

  1. Use bool recursive instead of string fetch_subdir. The caller fetches no sub-directories with recursive = false, or all of them with recursive = true. The problem is then solved by first fetching the model path with recursive = false (e.g. /repo/model) and then the model version path with recursive = true (e.g. /repo/model/1). However, this will require the LocalizedPath object to be able to join the model path and model version path localization objects into one localization object.
  2. Allow the caller to specify a list of sub-paths to localize (or not localize). A list of sub-paths allows fetching different combinations of sub-directories in a single call, which eliminates the need to join the model path and model version path localization objects. The more complicated structure is less intuitive for the caller, though, and may not be easier to implement than the first design.

However, the interface accomplishes exactly what we need at this time. I will defer to @GuanLuo and @nnshah1 to decide.
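
To make option 2 a little more concrete, here is a minimal sketch of the selection rule it implies. `IsSelected` and the meaning given to an empty list are assumptions made for illustration only; nothing like this exists in the PR or in the file system API.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of the Triton code base. Returns true when
// 'rel_path' (relative to the root being localized) equals one of
// 'sub_paths' or lies underneath one of them. An empty list is treated as
// "fetch everything", which is an assumption of this sketch.
bool
IsSelected(
    const std::string& rel_path, const std::vector<std::string>& sub_paths)
{
  if (sub_paths.empty()) {
    return true;
  }
  for (const auto& sub : sub_paths) {
    if (rel_path == sub) {
      return true;
    }
    // Match only on a path-component boundary, so that the sub-path "1"
    // does not accidentally select "10/model.py".
    if (rel_path.size() > sub.size() &&
        rel_path.compare(0, sub.size(), sub) == 0 &&
        rel_path[sub.size()] == '/') {
      return true;
    }
  }
  return false;
}
```

With this rule, a caller would pass something like {"config.pbtxt", "1"} to get the model configuration plus version 1 in a single call, at the cost of having to know the top-level file names in advance.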

Contributor Author

@oandreeva-nv oandreeva-nv Sep 7, 2023

> However, this will require the LocalizedPath object to be able to join the model path and model version path localization objects into one localization object.

Does it? If I understand correctly, the versioned path is set here:

core/src/backend_model.cc

Lines 113 to 114 in a23786c

const auto version_path =
JoinPath({localized_model_path, std::to_string(version)});

so instead of

const auto version_path =
      JoinPath({localized_model_path, std::to_string(version)});

we would do something like:

std::shared_ptr<LocalizedPath> version_path;
RETURN_IF_ERROR(LocalizePath(
    "path/to/version", false /*recursive*/, &version_path));

or am I missing something?

Contributor

@kthui kthui Sep 8, 2023

I am thinking something like:

std::string model_path = "/repo/model_dir";
std::string model_version_path = "/repo/model_dir/1";

std::shared_ptr<LocalizedPath> model_localized;
std::shared_ptr<LocalizedPath> model_version_localized;
LocalizePath(model_path, false /*recursive*/, &model_localized);
LocalizePath(model_version_path, true /*recursive*/, &model_version_localized);

model_localized->Include(model_version_localized);

The idea behind LocalizedPath::Include() is to join

model_tmp
|
`--- config.pbtxt

and

model_version_tmp
|
`--- model.py

to

model_tmp
|
`--- config.pbtxt
|
`--- 1
     |
     `--- model.py

on the local disk.

Contributor Author

I actually am missing something. Although I like the option 1 suggestion, localization in that case would happen in another directory. At the current stage, every call to LocalizePath creates a new temporary directory. So if we localize the model and its version separately, they will be copied into two different directories. Unless we want to change that, option 1 is not the best option.

Contributor

I think it would be ok to only implement the base case above, and throw an error if the caller is trying to join/include anything more complicated.

Contributor Author

Based on an offline discussion with @kthui, we can add one more parameter, location, to control where the provided path is localized, i.e. something like

LocalizePath(model_version_path, recursive=true/false, location=<string>, &model_version_localized);

Then we can control where the version directory is localized, and if location is empty, we create a new temporary directory.

This will also help with the follow-up ticket.
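
The rule for the location parameter could be sketched as below. `ResolveLocalRoot` and the "localized_" directory prefix are made up for this illustration, and the sketch uses C++17 std::filesystem rather than whatever mechanism the file system implementations currently use to create temporary directories.

```cpp
#include <filesystem>
#include <random>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: picks the local root for one localization request.
// A non-empty 'location' is used as-is (and created if missing); an empty
// one yields a fresh directory under the system temporary directory.
bool
ResolveLocalRoot(const std::string& location, fs::path* local_root)
{
  std::error_code ec;
  if (!location.empty()) {
    fs::create_directories(location, ec);
    if (ec) {
      return false;
    }
    *local_root = location;
    return true;
  }
  const fs::path tmp = fs::temp_directory_path(ec);
  if (ec) {
    return false;
  }
  std::random_device rd;
  for (int attempt = 0; attempt < 16; ++attempt) {
    const fs::path candidate = tmp / ("localized_" + std::to_string(rd()));
    // create_directory() returns false if the directory already existed,
    // so a name collision simply triggers another attempt.
    if (fs::create_directory(candidate, ec) && !ec) {
      *local_root = candidate;
      return true;
    }
  }
  return false;
}
```

In the model case, the first call would pass an empty location to obtain the model's temporary directory, and the second call would pass that directory joined with the version number.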

Contributor

> At the current stage, every call to LocalizePath creates a new temporary directory. So if we localize the model and its version separately, they will be copied into two different directories. Unless we want to change that, option 1 is not the best option.

I am thinking in the same direction as LocalizedPath::Include(). If we want a generic way to localize directories (with customization of which subset of contents to fetch and of recursiveness), we can make LocalizedPath a "filler" class: calling fs::LocalizePath returns either a fully localized object (the current behavior) or an empty object with only the local root path assigned. Then, if the user wants to customize the localization, they can call:

LocalizedPath::Include("sub/path/content", recursive);

Although I think the process then becomes convoluted, because it requires iterative calls to GetDirectorySubdirs/Files and LocalizedPath::Include.
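
A toy, in-memory model of the "filler" idea is sketched below. `RemoteTree` and `LocalizedPathSketch` are invented for illustration: the map stands in for a cloud listing, and "localizing" a file only records its path instead of downloading anything.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for a remote listing: relative file path -> contents.
using RemoteTree = std::map<std::string, std::string>;

class LocalizedPathSketch {
 public:
  explicit LocalizedPathSketch(const RemoteTree& remote) : remote_(remote) {}

  // Marks files under 'sub_path' as localized. With recursive == false only
  // files that sit directly in 'sub_path' are taken. An empty 'sub_path'
  // refers to the root of the tree.
  void Include(const std::string& sub_path, bool recursive)
  {
    const std::string prefix = sub_path.empty() ? "" : sub_path + "/";
    for (const auto& entry : remote_) {
      const std::string& path = entry.first;
      if (path.compare(0, prefix.size(), prefix) != 0) {
        continue;  // not under 'sub_path'
      }
      const bool nested = path.find('/', prefix.size()) != std::string::npos;
      if (nested && !recursive) {
        continue;
      }
      local_.insert(path);
    }
  }

  const std::set<std::string>& LocalFiles() const { return local_; }

 private:
  RemoteTree remote_;
  std::set<std::string> local_;
};
```

For the model case this needs two calls, Include("", false) for the top-level files and Include("1", true) for the version directory, which is the kind of iteration the comment above describes as convoluted.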

Contributor

@GuanLuo GuanLuo Sep 8, 2023

Note that this is really an internal API (unless we want to expose the file system API in the future), so we are free to modify the interface as frequently as we want and don't need to worry about being generic. If localizing a model directory is somewhat specific, we may just introduce a LocalizeModelDirectory method to fit the specific need.

When you look at what has to be localized for a model version, it is actually "everything in the directory except for the unwanted version subdirectories (integer-named directories)", which is tricky to express in generic terms.
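
The rule quoted above is easy to state as a model-specific predicate, which may be an argument for a dedicated LocalizeModelDirectory. The sketch below is only an illustration: `IsVersionName` and `ShouldLocalizeEntry` are hypothetical names, and treating any all-digit name as a version directory is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// True when 'name' consists only of decimal digits, which is how this
// sketch recognizes a version directory.
bool
IsVersionName(const std::string& name)
{
  return !name.empty() &&
         std::all_of(name.begin(), name.end(), [](unsigned char c) {
           return std::isdigit(c) != 0;
         });
}

// Hypothetical filter for a top-level entry of a model directory: keep all
// files and all non-version directories, and among the version directories
// keep only the one named 'wanted_version'.
bool
ShouldLocalizeEntry(
    const std::string& name, bool is_dir, const std::string& wanted_version)
{
  if (!is_dir || !IsVersionName(name)) {
    return true;
  }
  return name == wanted_version;
}
```

The predicate applies to top-level entries only; everything below an accepted directory would be fetched recursively.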


/// Write a string to a file.
/// \param path The path of the file.
5 changes: 3 additions & 2 deletions src/filesystem/implementations/as.h
@@ -86,7 +86,7 @@ class ASFileSystem : public FileSystem {
       const std::string& path, std::set<std::string>* files) override;
   Status ReadTextFile(const std::string& path, std::string* contents) override;
   Status LocalizePath(
-      const std::string& path,
+      const std::string& path, const std::string& fetch_subdir,
       std::shared_ptr<LocalizedPath>* localized) override;
   Status WriteTextFile(
       const std::string& path, const std::string& contents) override;
@@ -424,7 +424,8 @@ ASFileSystem::DownloadFolder(

 Status
 ASFileSystem::LocalizePath(
-    const std::string& path, std::shared_ptr<LocalizedPath>* localized)
+    const std::string& path, const std::string& fetch_subdir,
+    std::shared_ptr<LocalizedPath>* localized)
 {
   bool exists;
   RETURN_IF_ERROR(FileExists(path, &exists));
3 changes: 2 additions & 1 deletion src/filesystem/implementations/common.h
@@ -83,7 +83,8 @@ class FileSystem {
   virtual Status ReadTextFile(
       const std::string& path, std::string* contents) = 0;
   virtual Status LocalizePath(
-      const std::string& path, std::shared_ptr<LocalizedPath>* localized) = 0;
+      const std::string& path, const std::string& fetch_subdir,
+      std::shared_ptr<LocalizedPath>* localized) = 0;
   virtual Status WriteTextFile(
       const std::string& path, const std::string& contents) = 0;
   virtual Status WriteBinaryFile(
5 changes: 3 additions & 2 deletions src/filesystem/implementations/gcs.h
@@ -75,7 +75,7 @@ class GCSFileSystem : public FileSystem {
       const std::string& path, std::set<std::string>* files) override;
   Status ReadTextFile(const std::string& path, std::string* contents) override;
   Status LocalizePath(
-      const std::string& path,
+      const std::string& path, const std::string& fetch_subdir,
       std::shared_ptr<LocalizedPath>* localized) override;
   Status WriteTextFile(
       const std::string& path, const std::string& contents) override;
@@ -363,7 +363,8 @@ GCSFileSystem::ReadTextFile(const std::string& path, std::string* contents)

 Status
 GCSFileSystem::LocalizePath(
-    const std::string& path, std::shared_ptr<LocalizedPath>* localized)
+    const std::string& path, const std::string& fetch_subdir,
+    std::shared_ptr<LocalizedPath>* localized)
 {
   bool exists;
   RETURN_IF_ERROR(FileExists(path, &exists));
5 changes: 3 additions & 2 deletions src/filesystem/implementations/local.h
@@ -49,7 +49,7 @@ class LocalFileSystem : public FileSystem {
       const std::string& path, std::set<std::string>* files) override;
   Status ReadTextFile(const std::string& path, std::string* contents) override;
   Status LocalizePath(
-      const std::string& path,
+      const std::string& path, const std::string& fetch_subdir,
       std::shared_ptr<LocalizedPath>* localized) override;
   Status WriteTextFile(
       const std::string& path, const std::string& contents) override;
@@ -204,7 +204,8 @@ LocalFileSystem::ReadTextFile(const std::string& path, std::string* contents)

 Status
 LocalFileSystem::LocalizePath(
-    const std::string& path, std::shared_ptr<LocalizedPath>* localized)
+    const std::string& path, const std::string& fetch_subdir,
+    std::shared_ptr<LocalizedPath>* localized)
 {
   // For local file system we don't actually need to download the
   // directory or file. We use it in place.
47 changes: 27 additions & 20 deletions src/filesystem/implementations/s3.h
@@ -151,7 +151,7 @@ class S3FileSystem : public FileSystem {
       const std::string& path, std::set<std::string>* files) override;
   Status ReadTextFile(const std::string& path, std::string* contents) override;
   Status LocalizePath(
-      const std::string& path,
+      const std::string& path, const std::string& fetch_subdir,
       std::shared_ptr<LocalizedPath>* localized) override;
   Status WriteTextFile(
       const std::string& path, const std::string& contents) override;
@@ -628,7 +628,8 @@ S3FileSystem::ReadTextFile(const std::string& path, std::string* contents)

 Status
 S3FileSystem::LocalizePath(
-    const std::string& path, std::shared_ptr<LocalizedPath>* localized)
+    const std::string& path, const std::string& fetch_subdir,
+    std::shared_ptr<LocalizedPath>* localized)
 {
   // Check if the directory or file exists
   bool exists;
@@ -693,28 +694,34 @@ S3FileSystem::LocalizePath(
                 : JoinPath({(*localized)->Path(), s3_removed_path});
       bool is_subdir;
       RETURN_IF_ERROR(IsDirectory(s3_fpath, &is_subdir));
+      bool copy_subdir =
+          !fetch_subdir.empty()
+              ? s3_fpath == JoinPath({effective_path, fetch_subdir})
+              : true;
       if (is_subdir) {
-        // Create local mirror of sub-directories
+        if (copy_subdir) {
+          // Create local mirror of sub-directories
 #ifdef _WIN32
-        int status = mkdir(const_cast<char*>(local_fpath.c_str()));
+          int status = mkdir(const_cast<char*>(local_fpath.c_str()));
 #else
-        int status = mkdir(
-            const_cast<char*>(local_fpath.c_str()),
-            S_IRUSR | S_IWUSR | S_IXUSR);
+          int status = mkdir(
+              const_cast<char*>(local_fpath.c_str()),
+              S_IRUSR | S_IWUSR | S_IXUSR);
 #endif
-        if (status == -1) {
-          return Status(
-              Status::Code::INTERNAL,
-              "Failed to create local folder: " + local_fpath +
-                  ", errno:" + strerror(errno));
-        }
-
-        // Add sub-directories and deeper files to contents
-        std::set<std::string> subdir_contents;
-        RETURN_IF_ERROR(GetDirectoryContents(s3_fpath, &subdir_contents));
-        for (auto itr = subdir_contents.begin(); itr != subdir_contents.end();
-             ++itr) {
-          contents.insert(JoinPath({s3_fpath, *itr}));
+          if (status == -1) {
+            return Status(
+                Status::Code::INTERNAL,
+                "Failed to create local folder: " + local_fpath +
+                    ", errno:" + strerror(errno));
+          }
+
+          // Add sub-directories and deeper files to contents
+          std::set<std::string> subdir_contents;
+          RETURN_IF_ERROR(GetDirectoryContents(s3_fpath, &subdir_contents));
+          for (auto itr = subdir_contents.begin(); itr != subdir_contents.end();
+               ++itr) {
+            contents.insert(JoinPath({s3_fpath, *itr}));
+          }
         }
       } else {
         // Create local copy of file
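
Read in isolation, the copy_subdir expression in the last s3.h hunk is an exact comparison against the joined path. The sketch below isolates that predicate next to a prefix-based variant so the two can be compared on a few sample paths. `Join` is a simplified stand-in for JoinPath() that assumes no trailing slash, the function names and bucket paths are made up, and how nested directories are actually handled depends on code outside the hunk.

```cpp
#include <string>

// Simplified stand-in for JoinPath(); assumes 'base' has no trailing slash.
std::string
Join(const std::string& base, const std::string& child)
{
  return base + "/" + child;
}

// Mirrors the 'copy_subdir' expression from the hunk: exact match only.
bool
CopySubdirExact(
    const std::string& s3_fpath, const std::string& effective_path,
    const std::string& fetch_subdir)
{
  return fetch_subdir.empty()
             ? true
             : s3_fpath == Join(effective_path, fetch_subdir);
}

// Hypothetical variant that also accepts anything below the requested
// sub-directory, matching on a path-component boundary.
bool
CopySubdirPrefix(
    const std::string& s3_fpath, const std::string& effective_path,
    const std::string& fetch_subdir)
{
  if (fetch_subdir.empty()) {
    return true;
  }
  const std::string wanted = Join(effective_path, fetch_subdir);
  return s3_fpath == wanted ||
         (s3_fpath.size() > wanted.size() &&
          s3_fpath.compare(0, wanted.size(), wanted) == 0 &&
          s3_fpath[wanted.size()] == '/');
}
```

Under the exact comparison, a directory such as <model>/1/assets does not match fetch_subdir "1", while the prefix variant accepts it. If version directories are expected to contain nested directories, that difference may be worth a test case.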
4 changes: 2 additions & 2 deletions src/model_config_utils.cc
@@ -918,8 +918,8 @@ LocalizePythonBackendExecutionEnvironmentPath(
         model_path_slash) {
       // Localize the file
       std::shared_ptr<LocalizedPath> localized_exec_env_path;
-      RETURN_IF_ERROR(
-          LocalizePath(abs_exec_env_path, &localized_exec_env_path));
+      RETURN_IF_ERROR(LocalizePath(
+          abs_exec_env_path, "" /*fetch_subdir*/, &localized_exec_env_path));
       // Persist the localized temporary path
       (*localized_model_dir)
           ->other_localized_path.push_back(localized_exec_env_path);