Memory-mapped Utility for PECOS XLinear Model #166

Closed

Conversation

weiliw-amz (Contributor) commented Aug 13, 2022

Issue #, if available:
N/A

Description of changes:

This pull request consists of the following 3 commits:

  • Refactor the chunked matrix for PECOS XLinear model inference
    • Concatenated the original chunked matrix's fragmented memory allocations
      • This accommodates the subsequent memory-mapped utility module.
      • This change increases the time cost of building the chunked matrix by 10%–15% for large models (>50 GB), but it is necessary and cannot be avoided.
    • Reduced the memory footprint of building the chunked matrix
  • Memory-mapped utility module
    • An easy-to-use, well-encapsulated tool for dumping/loading arbitrary PECOS models
  • Memory-mapped PECOS XLinear model
    • Greatly reduces loading time.
    • Ideal when a user wants to quickly try a few inferences with a large model without waiting for the full model to load into memory.
    • Also enables inference with large models that cannot fit in memory.

Usage:
The user needs an XLinear model saved on disk (in the original .npz format) and must manually compile it into mmap format by calling compile_mmap_model:

from pecos.xmc.xlinear.model import XLinearModel

# Paths to the existing .npz model and the mmap output location
npz_model_path = "/path/to/xlinear/pecos-models/"
mmap_model_path = "/path/to/xlinear/mmap-models/"

print(f"Compiling mmap model from: {npz_model_path}, will save to: {mmap_model_path}...")
XLinearModel.compile_mmap_model(npz_model_path, mmap_model_path)
print("mmap model saved.")

Then the user can load the memory-mapped model and run inference:

import sys

from pecos.utils import smat_util
from pecos.xmc.xlinear.model import XLinearModel


mmap_model_path = "/path/to/xlinear/mmap-models/"

# Load model
if sys.argv[2] == "--cmmap":
    print("Loading C/C++ mem map model...")
    xlm = XLinearModel.load(mmap_model_path, is_predict_only=True, is_mmap=True)
elif sys.argv[2] == "--cmmap-preload":
    print("Loading C/C++ mem map model pre-loaded...")
    xlm = XLinearModel.load(mmap_model_path, is_predict_only=True, is_mmap=True, pre_load=True)
else:
    # Exit early so xlm is never used uninitialized
    sys.exit(f"Wrong option: {sys.argv[2]}")

# Load test data
Xt = XLinearModel.load_feature_matrix("/test/data/validation/X.npz")
Yt = XLinearModel.load_label_matrix("/test/data/validation/Y.npz")

# Predict and evaluate top-10 metrics
Yt_pred = xlm.predict(Xt)
Yt_pred = Yt_pred.tocsr()
metric = smat_util.Metrics.generate(Yt, Yt_pred, topk=10)
print(metric)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

weiliw-amz requested a review from rofuyu on August 13, 2022 03:24
weiliw-amz changed the title from "Memory-mapped Utility and PECOS XLinear Model" to "Memory-mapped Utility for PECOS XLinear Model" on Aug 16, 2022
bool b_has_explicit_bias; // Whether or not this chunk has an explicit bias term
index_type nnz_rows; // The number of non-zero rows in this chunk
// Using index_type for struct padding
index_type b_has_explicit_bias; // Whether or not this chunk has an explicit bias term, 0=false
Contributor
Given that this variable is no longer a boolean, we might want to consider removing the "b_" prefix.


shutil.copyfile(path.join(npz_folder, "param.json"), path.join(mmap_folder, "param.json"))

HierarchicalMLModel.compile_mmap_model(
Contributor
Is compile_mmap_model implemented in HierarchicalMLModel? Also, the name npz_folder is kind of misleading.

def xlinear_load_predict_only(
    self,
    folder,
    weight_matrix_type="BINARY_SEARCH_CHUNKED",
    is_mmap=False,
Contributor
Add docstrings for the new kwargs here and in HierarchicalMLModel.load and XLinearModel.load.
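One possible shape for such a docstring (a sketch only; the pre_load parameter and the descriptions are inferred from this PR's usage example above, not taken from the final implementation):

def xlinear_load_predict_only(
    self,
    folder,
    weight_matrix_type="BINARY_SEARCH_CHUNKED",
    is_mmap=False,
    pre_load=False,
):
    """Load a predict-only XLinear model from `folder`.

    Args:
        folder (str): Path to the saved model.
        weight_matrix_type (str): Chunked weight-matrix layout used for inference.
        is_mmap (bool): If True, treat `folder` as a model compiled by
            compile_mmap_model and memory-map its weights instead of reading
            the original .npz files fully into memory. (Assumed semantics,
            based on this PR's description.)
        pre_load (bool): If True, load the memory-mapped data up front rather
            than on demand during prediction. (Assumed semantics, based on the
            "--cmmap-preload" option in the usage example.)
    """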

Contributor
Also, can this be inferred from the saved configs rather than given by the user?

"""
import shutil

shutil.copyfile(path.join(npz_folder, "param.json"), path.join(mmap_folder, "param.json"))
Contributor
Should we add "compiled_format": "memory_map" to param.json so that users need not know the format before calling load?
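A rough sketch of how that could work, which would also address the earlier question about inferring the format from the saved configs (hypothetical helper functions; the key name "compiled_format" is the reviewer's suggestion, not part of this PR):

import json
from os import path

def mark_compiled_format(mmap_folder):
    # Hypothetical: record the compiled format in the copied param.json so
    # that load() can detect it without a user-supplied is_mmap flag.
    param_path = path.join(mmap_folder, "param.json")
    with open(param_path, "r") as fin:
        param = json.load(fin)
    param["compiled_format"] = "memory_map"
    with open(param_path, "w") as fout:
        json.dump(param, fout, indent=2)

def detect_compiled_format(model_folder):
    # Hypothetical: read the marker back at load time; fall back to the
    # original .npz format for models saved before this change.
    with open(path.join(model_folder, "param.json"), "r") as fin:
        param = json.load(fin)
    return param.get("compiled_format", "npz")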

weiliw-amz closed this on Dec 9, 2022
weiliw-amz (Contributor, Author) commented Dec 9, 2022

Will break this PR into a series of PRs to merge:
#192
#189

weiliw-amz deleted the refactor-inference-chunked-matrix branch on September 21, 2023 21:33