[QUANTIZE] Refactor quantization codebase and fix model accuracy #3543
After discussing offline with Tianqi, we decided to build the nightly regression tests in another repo.
@tqchen @eqy @vinx13 @tmoreau89 Could you please help review this change?
@ZihengJiang Do you think we could try to get the calibration PR in first? I have to port it over to the new pass infra, and I think this is likely easier to replay on top of calibration than vice versa.
LGTM
@ZihengJiang please follow up now that #3538 is merged.
What is the status of this PR? @tqchen @ZihengJiang
c01618d to c685007
481c4fa to 46c9667
def add_partition_function(ref_call, new_args, ctx):
    """Rewrite function for ewise add for partition"""
    if 'cuda' in _target.current_target().keys:
        #TODO(wuwei/ziheng) cuda specific rules
@vinx13 Since general devices and VTA are okay with (or require) inserting stop_fusion on both sides, let's use different rewrite rules for specific targets here.
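A minimal sketch of the per-target rule selection suggested in this comment, written in plain Python without TVM. All names here (`generic_rule`, `cuda_rule`, `RULES`, `select_rule`) are hypothetical stand-ins rather than identifiers from the PR, and the CUDA rule is only a placeholder since the thread leaves it as a TODO.

```python
# Hypothetical sketch: pick a partition rewrite rule by target key.
# None of these names come from TVM; they only illustrate the dispatch idea.

def generic_rule(op_name):
    """Default rule for general devices and VTA: stop_fusion on both sides."""
    return {"op": op_name, "rule": "stop_fusion on both sides"}


def cuda_rule(op_name):
    """Placeholder for CUDA-specific rules (left as a TODO in the thread)."""
    return {"op": op_name, "rule": "cuda-specific (TODO)"}


# Registry of target-specific rules; targets not listed use the generic rule.
RULES = {"cuda": cuda_rule}


def select_rule(target_keys):
    """Return the first matching target-specific rule, else the generic one."""
    for key in target_keys:
        if key in RULES:
            return RULES[key]
    return generic_rule
```

For example, `select_rule(["cuda", "gpu"])` would pick the placeholder CUDA rule, while `select_rule(["cpu"])` would fall back to the generic one. Keeping the fallback separate from the registry means new targets can be added without touching the default path.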
tests/python/nightly/quantization/test_quantization_accuracy.py
…che#3543) * Refactor. * update * update * update * update * update * update
* partition.cc, annotate.cc, realize.cc
* rewrite_for_vta to extra partition pass and enable it by default
* annotation.force_cast(x) to annotation.cast_hint(x, dtype)
* qconfig.store_lowbit_output and enable it by default

cc @tqchen @eqy @vinx13 @tmoreau89