
[Relay][Quantize] Integrate data-aware calibration into quantization #4295

Merged 5 commits on Nov 19, 2019

Conversation

@vinx13 (Member) commented Nov 10, 2019

This PR refactors the calibration part: the evaluation script for KL is integrated, and the stats-collection logic has been moved into an internal collect_stats. New config options calibrate_mode and weight_scale have been added.
Removed opt_level=3 for prerequisite_optimize, as it caused accuracy issues when FoldScaleAxis is invoked before calibration.

Part of this PR is based on #3828.

@ZihengJiang @anijain2305 @tmoreau89
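For context on the new weight_scale option, here is a minimal, self-contained sketch of the two common ways a per-tensor weight scale can be chosen. This is illustrative only, not the PR's implementation; the mode names 'max' and 'power2' are assumptions based on typical quantization configs, not quoted from the diff:

```python
import numpy as np

def max_weight_scale(weight):
    # 'max'-style scale: use the maximum absolute weight value directly
    return float(np.max(np.abs(weight)))

def power2_weight_scale(weight):
    # 'power2'-style scale: round the maximum absolute value up to the
    # nearest power of two (scaling then reduces to cheap bit shifts)
    amax = np.max(np.abs(weight))
    return float(2.0 ** np.ceil(np.log2(amax)))
```

A power-of-two scale trades a little dynamic range for simpler hardware arithmetic, which is why configs often expose both choices.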

@vinx13 vinx13 force-pushed the feature/quanti_eval branch 4 times, most recently from d8c7679 to f60b25d Compare November 10, 2019 01:20
@vinx13 vinx13 force-pushed the feature/quanti_eval branch from 1729eb1 to 6d3bd3b Compare November 13, 2019 05:29
@yzhliu (Member) left a comment:
possible to have a test case?

python/tvm/relay/quantize/quantize.py (review comment, resolved)
python/tvm/relay/quantize/_calibrate.py (review comment, resolved)
python/tvm/relay/quantize/_calibrate.py (review comment, resolved)
@vinx13 (Member, Author) commented Nov 15, 2019

It's difficult to write unit tests for this. The refactor is covered by the nightly tests. I plan to add more nightly tests for the new calibration mode later, as there is some more work to do.

@vinx13 (Member, Author) commented Nov 19, 2019

@yzhliu comments addressed

@tmoreau89 (Contributor) left a comment:

LGTM; added a couple of nits that can be addressed optionally.

@@ -54,6 +54,8 @@ def kl_divergence_scale(arr, quantized_dtype='int8', num_bins=8001, num_quantize
http://on-demand.gputechconf.com/gtc/2017/presentation/s7310-8-bit-inference-with-tensorrt.pdf
"""
assert isinstance(arr, np.ndarray)
assert stats is not None, "scipy need to be installed for \
Contributor review comment: "needs" (correcting "scipy need to be installed" to "scipy needs to be installed").

@@ -143,9 +141,20 @@ def qconfig(**kwargs):
nbit_dict: dict of QAnnotateKind -> int
Number of bit for every kind of annotate field.

calibrate_mode: str
The calibration mode. 'global_scale' or 'kl'.
Contributor review comment: can we spell it out so it's more self-explanatory (e.g. kullback_leibler)?

@tmoreau89 (Contributor):
@vinx13 I realize that this PR might have broken the E2E VTA tutorial flow (which includes quantization for the VTA hardware pipeline). I'll need to investigate, but what puzzles me more is why this wasn't caught by the CI, which should be building the Sphinx galleries successfully, including the E2E VTA tutorial.

@tmoreau89 (Contributor):
The script in question is vta/tutorials/frontend/deploy_vision_on_vta.py

zxy844288792 pushed a commit to zxy844288792/tvm that referenced this pull request Nov 26, 2019
…pache#4295)

* [Relay][Quantize] Integrate data-aware calibration into quantization

* Update _calibrate.py

* trigger ci

* Address comments

* address comments
yongwww pushed a commit to neo-ai/tvm that referenced this pull request Nov 26, 2019 (same commit list as above)
@tmoreau89 (Contributor):
Fix for the VTA bug introduced is in #4433.

6 participants