
Commit

Fix md5 hash for env that does not support usedforsecurity arg (openvinotoolkit#445)

I got an error running benchmarking on my working machine (Python 3.8, Ubuntu 20) due to an unsupported argument for hashlib:
```
[ ERROR ] An exception occurred
[ INFO ] Traceback (most recent call last):
  File "benchmark.py", line 532, in main
    iter_data_list, pretrain_time = CASE_TO_BENCH[model_args['use_case']](model_path, framework, args.device, model_args, args.num_iters)
  File "benchmark.py", line 194, in run_text_generation_benchmark
    run_text_generation(input_text, num, model, tokenizer, args, iter_data_list, warmup_md5, prompt_idx, bench_hook, model_precision, proc_id)
  File "benchmark.py", line 131, in run_text_generation
    result_md5_list.append(hashlib.md5(result_text.encode(), usedforsecurity=False).hexdigest())
TypeError: openssl_md5() takes at most 1 argument (2 given)
```
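For reference, the failure can be reproduced outside the benchmark with a single call. This is a minimal sketch assuming a Python 3.8 build whose OpenSSL-backed md5 constructor predates the `usedforsecurity` keyword:

```python
import hashlib

try:
    # On Python 3.8 the OpenSSL-backed constructor does not accept the
    # usedforsecurity keyword, so this raises the TypeError shown above.
    hashlib.md5(b"example", usedforsecurity=False).hexdigest()
except TypeError as err:
    print(err)  # e.g. "openssl_md5() takes at most 1 argument (2 given)"
```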
Based on this [StackOverflow
issue](https://stackoverflow.com/questions/54717862/how-do-i-know-if-the-usedforsecurity-flag-is-supported-by-hashlib-md5),
not all Python/OpenSSL builds support this argument, and using `hashlib.new("md5", ...)` instead of
`hashlib.md5(...)` should be safe in both cases.
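A minimal sketch of the portable pattern (the helper name below is illustrative, not part of benchmark.py): `hashlib.new` routes the keyword through its generic constructor, so the same call works on both old and new builds:

```python
import hashlib


def md5_hexdigest(data: bytes) -> str:
    # hashlib.new("md5", ...) accepts usedforsecurity on recent builds and
    # tolerates it on older ones, unlike the direct hashlib.md5 constructor.
    return hashlib.new("md5", data, usedforsecurity=False).hexdigest()


print(md5_hexdigest("generated text".encode()))
```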
eaidova authored May 17, 2024
1 parent d473e96 commit 41b07d3
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion llm_bench/python/benchmark.py
```diff
@@ -128,7 +128,7 @@ def run_text_generation(input_text, num, model, tokenizer, args, iter_data_list,
         result_text = generated_text[bs_idx]
         if args["output_dir"] is not None:
             utils.output_file.output_gen_text(result_text, args, model_precision, prompt_index, num, bs_idx, proc_id)
-        result_md5_list.append(hashlib.md5(result_text.encode(), usedforsecurity=False).hexdigest())
+        result_md5_list.append(hashlib.new("md5", result_text.encode(), usedforsecurity=False).hexdigest())
     if num == 0:
         warmup_md5[prompt_index] = result_md5_list
     per_token_time = generation_time * 1000 / (num_tokens / args['batch_size'])
```
