Upgrade to onnx 1.9 #847

alonre24 · 2021-10-05T09:35:47Z

This is an infrastructure PR for upgrading onnxruntime to version 1.9.0.
It includes @chayim's work in #785 (which will be closed as of now).

We temporarily remove the use of the Redis allocator as a custom allocator for the onnx backend, as this feature relied on our fork of the onnxruntime repo. Support for the Redis custom allocator will be re-introduced in #827, which depends on having onnx 1.9.0.

We build onnx 1.9 with the new DISABLE_EXTERNAL_INITIALIZERS flag, and test that we actually cannot load a model that uses external initializers.

codecov · 2021-10-05T10:00:43Z

Codecov Report

Merging #847 (ab20006) into master (de0f302) will increase coverage by 1.03%.
The diff coverage is 91.46%.

@@            Coverage Diff             @@
##           master     #847      +/-   ##
==========================================
+ Coverage   79.97%   81.00%   +1.03%     
==========================================
  Files          53       55       +2     
  Lines        8009     8144     +135     
==========================================
+ Hits         6405     6597     +192     
+ Misses       1604     1547      -57

Impacted Files	Coverage Δ
src/serialization/AOF/rai_aof_rewrite.c	`0.00% <0.00%> (ø)`
.../serialization/RDB/decoder/previous/v2/decode_v2.c	`46.56% <0.00%> (+9.81%)`	⬆️
src/execution/parsing/deprecated.c	`81.18% <66.66%> (+1.48%)`	⬆️
...c/serialization/RDB/decoder/current/v4/decode_v4.c	`71.42% <71.42%> (ø)`
.../serialization/RDB/decoder/previous/v0/decode_v0.c	`63.33% <72.72%> (+1.69%)`	⬆️
tests/module/LLAPI.c	`75.91% <79.06%> (+1.44%)`	⬆️
src/serialization/RDB/encoder/v4/encode_v4.c	`85.96% <81.81%> (ø)`
src/backends/tensorflow.c	`72.16% <91.00%> (+3.12%)`	⬆️
src/backends/onnxruntime.c	`73.50% <91.95%> (-3.88%)`	⬇️
src/redis_ai_objects/tensor.c	`91.93% <94.71%> (+5.24%)`	⬆️
... and 22 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 3e8d7a1...ab20006.

docs/developer-backends.md

opt/build/onnxruntime/Makefile

opt/build/onnxruntime/dockerfile.tmpl

tests/flow/includes.py

DvirDukhan · 2021-10-09T17:20:48Z

tests/flow/tests_onnx.py

+
+def test_forbidden_external_initializers(env):
+    if not TEST_ONNX:
+        env.debugPrint("skipping {} since TEST_ONNX=0".format(sys._getframe().f_code.co_name), force=True)


Suggested change

env.debugPrint("skipping {} since TEST_ONNX=0".format(sys._getframe().f_code.co_name), force=True)

env.debugPrint(f"skipping {sys._getframe().f_code.co_name} since TEST_ONNX=0", force=True)

Will do in another PR
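A minimal sketch of why the suggested change is behavior-preserving: both forms evaluate `sys._getframe().f_code.co_name` in the frame of the enclosing function, so they produce the same message. The function name `test_placeholder` is hypothetical and used only for illustration; `sys._getframe` is a CPython implementation detail.

```python
import sys


def test_placeholder():
    # Original form: str.format, with the frame lookup evaluated in this function.
    via_format = "skipping {} since TEST_ONNX=0".format(sys._getframe().f_code.co_name)
    # Suggested form: f-string, evaluated inline in the same frame.
    via_fstring = f"skipping {sys._getframe().f_code.co_name} since TEST_ONNX=0"
    return via_format, via_fstring
```

Calling `test_placeholder()` returns two identical strings, so the change affects readability only.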

tests/flow/tests_onnx.py

DvirDukhan · 2021-10-09T17:21:46Z

tests/flow/tests_onnx.py

+
+    # move the external initializer to the redis' current dir (tests/flow/logs)
+    external_initializer_model = load_file_content("model_with_external_initializers.onnx")
+    shutil.copy(ROOT+"/tests/flow/test_data/Pads.bin", ROOT+"/tests/flow/logs")


why is this here?

By default, ONNX looks for external initializers in the current working directory. When we run our tests, this location is ROOT+"/tests/flow/logs", so I copy the initializer from our test_data directory there.

DvirDukhan · 2021-10-09T17:21:52Z

tests/flow/tests_onnx.py

+                        'AI.MODELSTORE', 'ext_initializers_model{1}', 'ONNX', DEVICE,
+                        'BLOB', external_initializer_model)
+
+    os.remove(ROOT+"/tests/flow/logs/Pads.bin")


why is this here?

By default, ONNX looks for external initializers in the current working directory. When we run our tests, this location is ROOT+"/tests/flow/logs", so after we copy the initializer there, we remove it at the end of the test.
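The copy-then-remove pattern described in these two replies could also be expressed as a context manager, so that the staged file is removed even if an assertion in the middle of the test fails. This is a sketch only, not what the PR does; the helper name `staged_copy` is hypothetical.

```python
import contextlib
import os
import shutil


@contextlib.contextmanager
def staged_copy(src_path, dst_dir):
    """Copy src_path into dst_dir for the duration of the block, then delete the copy."""
    # shutil.copy returns the path of the newly created file inside dst_dir.
    staged = shutil.copy(src_path, dst_dir)
    try:
        yield staged
    finally:
        # Runs on normal exit and on exceptions, so no stale file is left behind.
        os.remove(staged)
```

In the test above, the body between the `shutil.copy` and `os.remove` calls would sit inside `with staged_copy(ROOT+"/tests/flow/test_data/Pads.bin", ROOT+"/tests/flow/logs"):`.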

opt/build/backends.rules

tests/flow/includes.py

DvirDukhan · 2021-10-11T20:35:14Z

docs/developer-backends.md

@@ -0,0 +1,47 @@
+# RedisAI Development Backends
+
+This document describes how a backend for RedisAI can be built, from this repository. It highlights the supported compilation devices on a per-backend basis, and highlights the tools and commands required.  Unless indicated otherwise, a backend is compiled in a docker, which is responsible for the configuration and installation of all tools required for a given backend on a per-platform basis.


Why "a backend"? We build only ONNXRuntime.
Maybe explain the need for building a backend. First, start by explaining that we use DISABLE_EXTERNAL_INITIALIZERS=ON, and that this made us build the backend from source. Also, explain that we do not build the other backend libraries, but download their binaries from the library websites.

When we need to add more "backend builds", we can add additional "how-to"s.

OK, I understand. Is it OK if I do the rephrasing in the next PR (#806, the one that updates Torch and TF)?

DvirDukhan · 2021-10-11T20:36:46Z

opt/redis_valgrind.sup

+   ignore_unversioned_libs
+   Memcheck:Overlap
+   ...
+   obj:*/libonnxruntime.so*


what is the leak/error that made you introduce this?

==13852==    at 0x483DA20: __memcpy_chk (vg_replace_strmem.c:1593)
==13852==    by 0x7EA062B: cpuinfo_linux_parse_cpulist (in /root/project/bin/linux-x64-release/install-cpu/backends/redisai_onnxruntime/lib/libonnxruntime.so.1.9.0)
==13852==    by 0x7E9C8D6: cpuinfo_linux_get_max_possible_processor (in /root/project/bin/linux-x64-release/install-cpu/backends/redisai_onnxruntime/lib/libonnxruntime.so.1.9.0)
==13852==    by 0x7E9AF61: cpuinfo_x86_linux_init (in /root/project/bin/linux-x64-release/install-cpu/backends/redisai_onnxruntime/lib/libonnxruntime.so.1.9.0)
==13852==    by 0x4D6C996: __pthread_once_slow (pthread_once.c:116)
==13852==    by 0x7E9AE26: cpuinfo_initialize (in /root/project/bin/linux-x64-release/install-cpu/backends/redisai_onnxruntime/lib/libonnxruntime.so.1.9.0)
==13852==    by 0x7CFA76A: onnxruntime::CPUIDInfo::CPUIDInfo() (in /root/project/bin/linux-x64-release/install-cpu/backends/redisai_onnxruntime/lib/libonnxruntime.so.1.9.0)
==13852==    by 0x400F379: call_init.part.0 (dl-init.c:72)
==13852==    by 0x400F475: call_init (dl-init.c:118)
==13852==    by 0x400F475: _dl_init (dl-init.c:119)
==13852==    by 0x40132D2: dl_open_worker (dl-open.c:517)
==13852==    by 0x4EB2B2E: _dl_catch_exception (dl-error-skeleton.c:196)
==13852==    by 0x4012BB9: _dl_open (dl-open.c:599)

This comes from ONNX; we see it in every test that uses ONNX. As I understand it, it says that they use memcpy rather than memmove to copy between source and destination regions that overlap.
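To illustrate what an overlap report is about, here is a small sketch in pure Python. It is not ONNX or Valgrind code, and all function names are hypothetical. `regions_overlap` is the kind of interval check such a report implies, `forward_copy` shows one way a naive byte-by-byte copy can corrupt data when the regions overlap (real memcpy behavior on overlap is undefined, so this is only one possible outcome), and `overlap_safe_copy` mimics memmove by snapshotting the source first.

```python
def regions_overlap(dst: int, src: int, n: int) -> bool:
    """True if [dst, dst+n) and [src, src+n) share at least one byte."""
    return n > 0 and dst < src + n and src < dst + n


def forward_copy(buf: bytearray, dst: int, src: int, n: int) -> None:
    """Naive front-to-back copy; may read bytes it has already overwritten."""
    for i in range(n):
        buf[dst + i] = buf[src + i]


def overlap_safe_copy(buf: bytearray, dst: int, src: int, n: int) -> None:
    """memmove-like copy: take a snapshot of the source before writing."""
    buf[dst:dst + n] = bytes(buf[src:src + n])


naive = bytearray(b"abcdef")
forward_copy(naive, 1, 0, 4)       # overlapping regions: source gets clobbered

safe = bytearray(b"abcdef")
overlap_safe_copy(safe, 1, 0, 4)   # same regions, copied via a snapshot
```

With these inputs, `naive` ends up as `b"aaaaaf"` while `safe` ends up as `b"aabcdf"`, which is the difference the suppression in opt/redis_valgrind.sup is hiding for the library's startup code.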

chayim and others added 14 commits June 14, 2021 14:30

jetson 1.8

33e6ef5

1.8 in the Makefile

b101e59

starting single use

be2a69d

splitting a backend

ddb2111

onnx 1.8 64-bit linux, 64-bit linux with gpu, and jetson

68fdfc7

split up onnx to use docker library parts

1136450

pr comments

fffa1ea

pr comments

bbfde5b

Merge remote-tracking branch 'origin/master' into ck-onnx-1.8

af40e81

merge Chayim PR for building onnx

8743a75

WIP

1444119

Merge branch 'master' into upgrade_to_onnx_1.9

c8e49a9

remove temporarily the custom allocator support

c56aa85

Merge branch 'upgrade_to_onnx_1.9' of https://github.com/RedisAI/RedisAI

bab06d3

into upgrade_to_onnx_1.9

alonre24 mentioned this pull request Oct 5, 2021

onnxruntime 1.7.2 build and documentation #785

Closed

chayim and others added 3 commits October 5, 2021 10:42

xenial onnx 1.9 build using the template

d1c5e80

fixing python

08e8ef9

build onnx + test building with DISABLE_EXTERNAL_INITIALIZERS flag

63a318d

alonre24 added the ci-test label Oct 6, 2021

alonre24 and others added 8 commits October 6, 2021 11:45

Add the model with the external initializer

724319b

Variablizing CUDA support

c9b82a3

Adding support for defining CUDA VERSIONS from outside the docker files, but still using defaults.

use new key for deps cache

1aa4836

Merge branch 'upgrade_to_onnx_1.9' of https://github.com/RedisAI/RedisAI

f312541

into upgrade_to_onnx_1.9

gcc due to glibc versioning

2977098

s/bionic/buster/ is back

728b6b9

update deps cache key to v1.2.5

2b1f0bd

Merge branch 'upgrade_to_onnx_1.9' of https://github.com/RedisAI/RedisAI

8b4e718

into upgrade_to_onnx_1.9

alonre24 requested a review from DvirDukhan October 7, 2021 14:01

alonre24 marked this pull request as ready for review October 7, 2021 14:01

DvirDukhan requested changes Oct 9, 2021

chayim and others added 9 commits October 10, 2021 10:38

PR comments

24f7ea9

Upgrade python in xenial to 3.7 from 3.6 - just needed for onnx build system. Added help text / make help

PR fixes - documentation

063b86a

Merge branch 'upgrade_to_onnx_1.9' of https://github.com/RedisAI/RedisAI

1b3ac61

into upgrade_to_onnx_1.9

venv docs update

74bf607

get caller position when assertion fails in an auxiliary file

3e54ea2

Return filename instead of function name

7d31242

More PR fixes

c45934a

merge add tests utility

af1bcf8

Merge branch 'upgrade_to_onnx_1.9' of https://github.com/RedisAI/RedisAI

de4c368

into upgrade_to_onnx_1.9

DvirDukhan reviewed Oct 11, 2021

opt/build/backends.rules (resolved)

tests/flow/includes.py (outdated, resolved)

alonre24 added 4 commits October 11, 2021 11:32

Merge branch 'master' into upgrade_to_onnx_1.9

8958a0d

Build onnx with DISABLE_EXTERNAL_INITIALIZERS flag for GPU as well

55022ef

Merge branch 'master' into upgrade_to_onnx_1.9

b18ecec

Merge branch 'upgrade_to_onnx_1.9' of https://github.com/RedisAI/RedisAI

696747c

into upgrade_to_onnx_1.9

alonre24 removed the ci-test label Oct 11, 2021

update deps cache key in CI after rebuilding and publishing for gpu

f63f95f

alonre24 added the ci-test label Oct 11, 2021

ignore overlap valgrind errors from onnx lib

ab20006

alonre24 added ci-test and removed ci-test labels Oct 11, 2021

DvirDukhan reviewed Oct 11, 2021

DvirDukhan approved these changes Oct 12, 2021

alonre24 merged commit b04ec83 into master Oct 12, 2021

alonre24 deleted the upgrade_to_onnx_1.9 branch October 12, 2021 07:18
