
Update engine hash id generator with model name/model content/metadata #13015

Merged 3 commits into main on Sep 21, 2022

Conversation

@yf711 (Contributor) commented Sep 19, 2022

Update engine hash id generator with model name/model content/metadata

Description:

  • Updated the engine id generator to use the model name, model inputs & outputs, and environment metadata (instead of the model path) to generate the hash
  • A new bridged API was introduced to enable the id generator in the TRT EP utility

Motivation and Context

  • Why is this change required? What problem does it solve? To fix this issue, which was caused by the id generator using the model path

How to use:

How to test:

  • On Windows, run:
    • .\onnxruntime_test_all.exe --gtest_filter=TensorrtExecutionProviderTest.TRTMetadefIdGeneratorUsingModelHashing
    • .\onnxruntime_test_all.exe --gtest_filter=TensorrtExecutionProviderTest.TRTSubgraphIdGeneratorUsingModelHashing

Appendix

@jywu-msft jywu-msft requested a review from chilo-ms September 20, 2022 01:43
@chilo-ms (Contributor) commented Sep 20, 2022

As we discussed in yesterday's meeting, please also add the following changes:

  • Replace GenerateMetaDefId() with TRTGenerateMetaDefId() in tensorrt_execution_provider.cc
  • Add one more test case where two identical models are placed in different paths but the hash values are the same.

It would also be good to check models with multiple partitions/engines, for example FasterRCNN and MaskRCNN.

@@ -665,6 +665,10 @@ struct ProviderHost {
virtual bool Graph__GetInitializedTensor(const Graph* p, const std::string& tensor_name, const ONNX_NAMESPACE::TensorProto*& value) = 0;

virtual const Node* Graph__ParentNode(const Graph* p) const = 0;
virtual const Graph* Graph__ParentGraph(const Graph* p) const = 0;
virtual const std::string& Graph__Name(const Graph* p) noexcept = 0;
@chilo-ms (Contributor) Sep 20, 2022


I suggest making it:
virtual const std::string& Graph__Name(const Graph* p) const noexcept = 0;
to be aligned with ORT Graph API.

Same for:
virtual const std::vector<const NodeArg*>& Graph__GetInputsIncludingInitializers(const Graph* p) const noexcept = 0;

@yf711 (Contributor, Author)


Thanks for pointing that out. Those changes have been included in the second commit.

@jywu-msft (Member) commented Sep 21, 2022

Can you address the cpplint warnings (if they make sense)?

@jywu-msft jywu-msft requested a review from stevenlix September 21, 2022 02:47

// get the hash for the model when loaded from file
HashValue model_hash;
int id = TRTGenerateMetaDefId(viewer, model_hash);
@chilo-ms (Contributor) Sep 21, 2022


That's good for testing the correctness and functionality of TRTGenerateMetaDefId().

I think it's better to also test a TRT EP inference run and check the hash file name of the engine cache in the filesystem, either manually or in the unit test, so that we can make sure the code path where TRT EP calls TRTGenerateMetaDefId() is okay.

@yf711 (Contributor, Author)


This has been manually verified by running FasterRCNN locally. Thanks for sharing the instructions!

@yf711 yf711 merged commit 240aead into main Sep 21, 2022
@yf711 yf711 deleted the yifanl/updateEngineHashIdGenerator branch September 21, 2022 18:10
@pranavsharma (Contributor)

Do we have plans to test this in the Triton context? Here's the issue.

@chilo-ms (Contributor) commented Sep 21, 2022

Do we have plans to test this in the Triton context? Here's the issue.

What's the best way to verify the fix in the Triton context? I'm not familiar with Triton testing; could they simply pick up the main branch and test?
We have tested on our side that the model path has no impact on the hash id.

linnealovespie pushed a commit that referenced this pull request Sep 30, 2022
#13015)

**Update engine hash id generator with model name/model
content/metadata**

**Description**: 

* Updated the engine id generator to use the model name, model inputs &
outputs, and env metadata (instead of the model path) to generate the hash
* A new bridged API was introduced to enable the id generator in the
TRT EP utility

**Motivation and Context**
- Why is this change required? What problem does it solve? To fix this
[issue](triton-inference-server/server#4587),
which was caused by the id generator using the model path

How to use:
* Call [TRTGenerateMetaDefId(const GraphViewer& graph_viewer, HashValue&
model_hash)](https://github.com/microsoft/onnxruntime/blob/0fcce74a565478b4c83fac5a3230e9786bb53ab3/onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc#L715)
to generate hash id for TRT engine cache

How to test:
* On Windows, run:
* .\onnxruntime_test_all.exe
--gtest_filter=TensorrtExecutionProviderTest.TRTMetadefIdGeneratorUsingModelHashing
* .\onnxruntime_test_all.exe
--gtest_filter=TensorrtExecutionProviderTest.TRTSubgraphIdGeneratorUsingModelHashing

**Appendix**
* [Existing engine id generator that uses model
path](https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/core/framework/execution_provider.cc#L112-L182)
@yf711 yf711 restored the yifanl/updateEngineHashIdGenerator branch December 12, 2022 09:17
stevenlix added a commit that referenced this pull request Dec 15, 2022
There are some issues in
#13015:
1. The model name should be used rather than the graph name in the model ID
generator.
2. Hash collisions were observed in the ID cache, which means different models
may have the same key and thus load the same hash id from the cache.
3. For the class and function that generate the model id, "MetaDef" in the
name is not appropriate.
4. MurmurHash3 should be reused rather than copied over into the TRT EP.
This PR fixes those issues.
henrywu2019 pushed a commit to henrywu2019/onnxruntime that referenced this pull request Dec 26, 2022
yf711 pushed a commit that referenced this pull request Jan 6, 2023
@yf711 yf711 deleted the yifanl/updateEngineHashIdGenerator branch February 8, 2023 17:11