Added --iteration and --automation flags #512

asmigosw · 2025-07-10T08:35:18Z

Added flags:

--iteration: Number of inference runs to execute after the QPC has been loaded once.
--automation: When set, prints the input, the output, and performance stats.

Example command: python -m QEfficient.cloud.infer --model_name gpt2 --batch_size 1 --prompt_len 32 --ctx_len 128 --mxfp6 --num_cores 16 --device_group [0] --prompt "My name is" --mos 1 --aic_enable_depth_first --iteration 2 --automation
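For reviewers, here is a minimal sketch of the behaviour the two flags are meant to have. It is not the code in this PR: `build_parser`, `run_iterations`, and the `generate` callable are hypothetical names used only for illustration, and the default of 1 for `--iteration` is an assumption. The sketch only shows the intended pattern of loading once, running the inference N times, and printing per-run details when `--automation` is set.

```python
import argparse
import time
from typing import Callable, List


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser exposing only the two flags discussed here."""
    parser = argparse.ArgumentParser(description="Sketch of --iteration / --automation")
    parser.add_argument("--prompt", type=str, default="My name is")
    # Assumption: running once is the default when the flag is omitted.
    parser.add_argument("--iteration", type=int, default=1,
                        help="Number of inference runs after the model is loaded once")
    # Bare switch, matching the example command where it takes no value.
    parser.add_argument("--automation", action="store_true",
                        help="Print input, output, and timing for each run")
    return parser


def run_iterations(generate: Callable[[str], str], prompt: str,
                   iteration: int, automation: bool) -> List[str]:
    """Call an already-loaded `generate` function `iteration` times.

    `generate` stands in for whatever runs inference on the loaded QPC;
    it is a placeholder, not a QEfficient API.
    """
    if iteration < 1:
        raise ValueError("iteration must be >= 1")
    outputs = []
    for i in range(iteration):
        start = time.perf_counter()
        text = generate(prompt)
        elapsed = time.perf_counter() - start
        outputs.append(text)
        if automation:
            # Only wall-clock latency is shown; real performance stats would differ.
            print(f"[run {i + 1}/{iteration}] input={prompt!r} "
                  f"output={text!r} latency_s={elapsed:.4f}")
    return outputs
```

With this sketch, `build_parser().parse_args(["--iteration", "2", "--automation"])` yields `iteration == 2` and `automation == True`, and `run_iterations` then calls `generate` twice without reloading anything in between.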

Signed-off-by: Asmita Goswami <asmigosw@qti.qualcomm.com>

Added --iteration and --automation flags

125636c

Signed-off-by: Asmita Goswami <asmigosw@qti.qualcomm.com>

asmigosw requested review from quic-rishinr, ochougul, quic-hemagnih and quic-amitraj as code owners July 10, 2025 08:35

Merge branch 'main' into flags_update

54990ec

quic-rishinr marked this pull request as draft July 10, 2025 10:04

quic-rishinr assigned quic-rishinr and asmigosw and unassigned quic-rishinr Jul 10, 2025
