
Prevent data copy in CreateSessionFromArray #8328

Closed
slevental opened this issue Jul 8, 2021 · 5 comments
Labels: core runtime, feature request, platform:mobile

Comments


slevental commented Jul 8, 2021

Feature Request

Is your feature request related to a problem? Please describe.
We are using onnxruntime in a mobile environment. For some mobile apps (extensions on iOS) there is a memory limit of ~60 MB; a process that uses more gets killed by the OS.

In this environment using less RAM is critical, so we had to investigate onnxruntime's memory usage patterns. We found out that it's impossible to use mmapped models with onnxruntime: even if the model is mmapped on the client side and passed to CreateSessionFromArray, onnxruntime copies the memory into an internal data structure.

System information

  • ONNX Runtime version (you are using): 1.8.0 (C api)

Describe the solution you'd like
To tackle this, it should be possible to use the OS file cache and mmap the model file into memory (TensorFlow Lite supports that, for instance); see the sketch at the end of this comment.

Describe alternatives you've considered
Unfortunately, there is no alternative; we can only migrate to another framework.
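
For illustration, here is a minimal sketch of the pattern we have in mind, assuming POSIX mmap and the standard ORT C API; load_mmapped_model is an illustrative name and error handling is abbreviated. As of 1.8.0 this does not actually save memory, because CreateSessionFromArray copies the buffer internally, which is the subject of this issue:

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <onnxruntime_c_api.h>

static OrtSession* load_mmapped_model(const OrtApi* ort, OrtEnv* env,
                                      OrtSessionOptions* opts,
                                      const char* path) {
  int fd = open(path, O_RDONLY);
  if (fd < 0) return NULL;

  struct stat st;
  if (fstat(fd, &st) != 0) { close(fd); return NULL; }

  /* Map the file read-only: the pages stay backed by the OS file cache
   * instead of becoming anonymous (dirty) process memory. */
  void* data = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
  close(fd);
  if (data == MAP_FAILED) return NULL;

  OrtSession* session = NULL;
  OrtStatus* status = ort->CreateSessionFromArray(
      env, data, (size_t)st.st_size, opts, &session);
  if (status != NULL) {
    ort->ReleaseStatus(status);
    munmap(data, (size_t)st.st_size);
    return NULL;
  }
  /* The mapping must remain valid for as long as the session may read
   * from the buffer. */
  return session;
}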

kit1980 added the core runtime and feature request labels on Jul 8, 2021
guoyu-wang added the platform:mobile label on Jul 15, 2021

guoyu-wang commented Jul 16, 2021

We are actively looking into this issue.

@slevental, to better understand your issue, would you please give us a bit more context? For example, the ~60 MB memory limit for your app: is this an Apple policy? Does the app get killed when its peak working set exceeds the limit?

Also, if possible, could you share the model so that we can measure memory consumption based on your real-world scenario?

slevental commented

@gwang-msft hey, thanks for getting back.

This is an Apple limitation for extensions: when an app uses more than 60 MB of memory, it gets killed by the OS, and this limit includes everything, graphics/UI as well as internal data structures. One possible solution is to mmap the model (TensorFlow Lite supports this); I think it would be useful for onnxruntime to support this to reduce memory consumption. We tried to mmap the model and use CreateSessionFromArray, since it takes an address and loads the model from there, but in the code we found that onnxruntime copies this memory into the resident (RES) memory of the process, which defeats the purpose.

Unfortunately, we cannot share the model, but the problem is just in the semantics of the API.

slevental commented

Here is the code that does the copy:

// InferenceSession copies the caller-provided model buffer into the
// session-owned ort_format_model_bytes_ vector:
std::copy_n(reinterpret_cast<const uint8_t*>(model_data), model_data_len, ort_format_model_bytes_.data());

slevental commented

I think it would also be a possible solution if onnxruntime provided mmap-based loading of models (since the ORT format is based on flatbuffers, this should be straightforward).


guoyu-wang commented Jul 27, 2021

With the latest master branch, you may specify the session config option by calling this API:

ORT_API2_STATUS(AddSessionConfigEntry, _Inout_ OrtSessionOptions* options,
_In_z_ const char* config_key, _In_z_ const char* config_value);

with this session config key:

static const char* const kOrtSessionOptionsConfigUseORTModelBytesDirectly = "session.use_ort_model_bytes_directly";

and the value "1" to use the input buffer directly and avoid the copy.

You will have to ensure the validity of the input buffer throughout the lifetime of the inference session.
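
A minimal sketch of how those calls fit together; here env, model_data, and model_len stand for a caller-owned environment and buffer (e.g. an mmap'd ORT-format model file) that must outlive the session, and status checks are elided for brevity:

const OrtApi* ort = OrtGetApiBase()->GetApi(ORT_API_VERSION);

OrtSessionOptions* opts = NULL;
ort->CreateSessionOptions(&opts);

/* Ask ORT to use the caller's bytes directly instead of copying them. */
ort->AddSessionConfigEntry(opts, "session.use_ort_model_bytes_directly", "1");

OrtSession* session = NULL;
ort->CreateSessionFromArray(env, model_data, model_len, opts, &session);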

Also, regarding the model: if the model itself cannot be shared, could you share some stats, such as the rough size of the model and of the initializer tensors, and whether the model contains nodes such as ConstantOfShape?
