Unable to calibrate settings for a deep learning model #898

theredarmy87 · 2025-01-03T03:29:37Z

Describe the bug

I have a large convolutional neural network (CNN) with ~5 million parameters, and I have not been able to complete the calibrate_settings step. I also tried passing different max_logrows and scales parameters, with no luck:

RuntimeError: Failed to calibrate settings: [Uncategorized] calibration failed, could not find any suitable parameters given the calibration dataset

Expected behaviors

calibrate_settings to run successfully.

Steps to reproduce the bug

All the relevant files to reproduce the error are available in the following public Dropbox folder:
https://www.dropbox.com/scl/fo/hc861nn2fpfzmd3boanq6/AGIX_me24vLt-53LsRlshpc?rlkey=ae5yazlcphiu4j8nrybefibj6&dl=0

Please note that cal_data.json includes only one sample, whereas calibration_data.json includes a batch of 64 samples. Please feel free to use either one for calibration.
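A quick way to compare the two calibration files is to count the values each one holds and check their range. The following is a minimal sketch, assuming both files use the `{"input_data": [[...], ...]}` layout seen in ezkl examples; this has not been verified against the files in the Dropbox folder, and `summarize_input_file` is a hypothetical helper, not part of ezkl.

```python
import json


def _flatten(values):
    """Yield the scalar leaves of an arbitrarily nested list."""
    for v in values:
        if isinstance(v, list):
            yield from _flatten(v)
        else:
            yield v


def summarize_input_file(path):
    """Return count/min/max for each entry under the assumed "input_data" key."""
    with open(path) as f:
        payload = json.load(f)
    summary = []
    for tensor in payload["input_data"]:
        flat = list(_flatten(tensor))
        summary.append({"count": len(flat), "min": min(flat), "max": max(flat)})
    return summary
```

Calling `summarize_input_file("cal_data.json")` and `summarize_input_file("calibration_data.json")` side by side shows whether the 64-sample file really holds 64 times as many values as the single-sample one, and how wide the value range is in each.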

Also, best_model.pth is my trained PyTorch model, but you probably don't need that since I have also included network.onnx which is the same model exported to ONNX format.

input.json is another data sample, and settings.json is the settings file created by running the gen_settings code below.

import os

import ezkl

# All files are expected in the current working directory.
base_path = os.getcwd()
model_path = os.path.join(base_path, 'network.onnx')
input_data_path = os.path.join(base_path, 'input.json')
settings_path = os.path.join(base_path, 'settings.json')
cal_data_path = os.path.join(base_path, 'cal_data.json')
calibration_data_path = os.path.join(base_path, 'calibration_data.json')

# Hash the inputs and parameters; keep the output public.
run_args = ezkl.PyRunArgs()
run_args.input_visibility = "hashed"
run_args.param_visibility = "hashed"
run_args.output_visibility = "public"
run_args.variables = [("batch_size", 1)]

res = ezkl.gen_settings(model=model_path, output=settings_path, py_run_args=run_args)
assert res == True

The next step, calibrating the settings, fails with the RuntimeError quoted above:

res = await ezkl.calibrate_settings(calibration_data_path, model_path, settings_path, target="resources")
assert res == True
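The attempts with different max_logrows and scales values mentioned at the top can be made systematic. The following is a rough sketch rather than a known fix: it assumes calibrate_settings accepts `scales` and `max_logrows` as keyword arguments (the parameter names used in this report), the candidate values are arbitrary illustrations, and `sweep_calibration` is a hypothetical helper. The calibration function is passed in as an argument so the loop can be exercised without ezkl installed.

```python
import inspect
from itertools import product


async def sweep_calibration(calibrate, data_path, model_path, settings_path,
                            scale_options=((2,), (4,), (7,)),
                            logrows_options=(17, 20, 24)):
    """Try each (scales, max_logrows) pair in order.

    Returns (first working combination or None, list of recorded failures).
    """
    failures = []
    for scales, max_logrows in product(scale_options, logrows_options):
        try:
            res = calibrate(data_path, model_path, settings_path,
                            target="resources",
                            scales=list(scales), max_logrows=max_logrows)
            # Support both coroutine-style and plain callables.
            if inspect.isawaitable(res):
                res = await res
        except RuntimeError as err:
            # Record the failure and move on to the next combination.
            failures.append((scales, max_logrows, str(err)))
            continue
        if res:
            return {"scales": list(scales), "max_logrows": max_logrows}, failures
    return None, failures
```

In a notebook this could be run as `found, failures = await sweep_calibration(ezkl.calibrate_settings, calibration_data_path, model_path, settings_path)`. If `found` is None, the `failures` list shows the error message for every combination that was tried, which is more useful to attach to a bug report than a single failure.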

Screenshot 1:

Screenshot 2:

Device and Operating System

!nvidia-smi

Additional Information

theredarmy87 added the bug Something isn't working label Jan 3, 2025
