
Migrate export_llama to new ao quantize API #8422

@jackzhxng


🚀 The feature, motivation and pitch

`Int8DynActInt4WeightQuantizer`, used for `-qmode 8da4w`, is no longer being developed by ao and does not support bias. Migrate to the new `quantize_` API, which can take in `int8_dynamic_activation_int4_weight`.

Alternatives

No response

Additional context

No response

RFC (Optional)

No response

cc @mergennachin @iseeyuan @lucylq @helunwencser @tarun292 @kimishpatel @cccclai

Metadata

Labels

module: examples — Issues related to demos under examples/
module: llm — Issues related to LLM examples and apps, and to the extensions/llm/ code
triaged — This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Status

Done
