
Commit

Fix md5 hash for env that does not support usedforsecurity arg (openvinotoolkit#445)

I got an error running benchmarking on my working machine (Python 3.8, Ubuntu 20) due to an unsupported argument for hashlib:
```
[ ERROR ] An exception occurred
[ INFO ] Traceback (most recent call last):
  File "benchmark.py", line 532, in main
    iter_data_list, pretrain_time = CASE_TO_BENCH[model_args['use_case']](model_path, framework, args.device, model_args, args.num_iters)
  File "benchmark.py", line 194, in run_text_generation_benchmark
    run_text_generation(input_text, num, model, tokenizer, args, iter_data_list, warmup_md5, prompt_idx, bench_hook, model_precision, proc_id)
  File "benchmark.py", line 131, in run_text_generation
    result_md5_list.append(hashlib.md5(result_text.encode(), usedforsecurity=False).hexdigest())
TypeError: openssl_md5() takes at most 1 argument (2 given)
```
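For reference, the failure can be reproduced outside the benchmark with a single call. This is a minimal sketch assuming a Python 3.8 build whose OpenSSL-backed md5 constructor predates the `usedforsecurity` keyword:

```python
import hashlib

try:
    # On Python 3.8 the OpenSSL-backed constructor does not accept the
    # usedforsecurity keyword, so this raises the TypeError shown above.
    hashlib.md5(b"example", usedforsecurity=False).hexdigest()
except TypeError as err:
    print(err)  # e.g. "openssl_md5() takes at most 1 argument (2 given)"
```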
Based on this [StackOverflow
issue](https://stackoverflow.com/questions/54717862/how-do-i-know-if-the-usedforsecurity-flag-is-supported-by-hashlib-md5),
not all Python/OpenSSL builds support this argument, and using `hashlib.new("md5", ...)` instead of
`hashlib.md5(...)` should be safe in both cases.
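A minimal sketch of the portable pattern (the helper name below is illustrative, not part of benchmark.py): `hashlib.new` routes the keyword through its generic constructor, so the same call works on both old and new builds:

```python
import hashlib


def md5_hexdigest(data: bytes) -> str:
    # hashlib.new("md5", ...) accepts usedforsecurity on recent builds and
    # tolerates it on older ones, unlike the direct hashlib.md5 constructor.
    return hashlib.new("md5", data, usedforsecurity=False).hexdigest()


print(md5_hexdigest("generated text".encode()))
```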
eaidova authored May 17, 2024
1 parent d473e96 commit 41b07d3
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion llm_bench/python/benchmark.py
```diff
@@ -128,7 +128,7 @@ def run_text_generation(input_text, num, model, tokenizer, args, iter_data_list,
         result_text = generated_text[bs_idx]
         if args["output_dir"] is not None:
             utils.output_file.output_gen_text(result_text, args, model_precision, prompt_index, num, bs_idx, proc_id)
-        result_md5_list.append(hashlib.md5(result_text.encode(), usedforsecurity=False).hexdigest())
+        result_md5_list.append(hashlib.new("md5", result_text.encode(), usedforsecurity=False).hexdigest())
     if num == 0:
         warmup_md5[prompt_index] = result_md5_list
     per_token_time = generation_time * 1000 / (num_tokens / args['batch_size'])
```
