KeyError if timeout occurs while evaluating metrics #332

Open
frances-h opened this issue Aug 7, 2024 · 0 comments
Labels
bug Something isn't working

Environment Details

Please indicate the following details about the environment in which you found the bug:

  • SDGym version:
  • Python version:
  • Operating System:

Error Description

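When benchmark_single_table is run with a timeout, a KeyError: 'normalized_score' can be raised if the timeout fires while the metric scores for a dataset are being computed. As the traceback shows, _format_output reads score['normalized_score'] from a score entry that does not have that key, and the exception propagates out of benchmark_single_table.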

Timeout running GaussianCopulaSynthesizer on dataset adult;
Timeout running GaussianCopulaSynthesizer on dataset alarm;
Timeout running GaussianCopulaSynthesizer on dataset asia;
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/frances/Documents/SDGym/sdgym/benchmark.py", line 791, in benchmark_single_table
    scores = _run_jobs(multi_processing_config, job_args_list, show_progress)
  File "/Users/frances/Documents/SDGym/sdgym/benchmark.py", line 527, in _run_jobs
    scores = pd.concat(scores, ignore_index=True)
  File "/Users/frances/.pyenv/versions/3.10.9/envs/3.10/lib/python3.10/site-packages/pandas/core/reshape/concat.py", line 382, in concat
    op = _Concatenator(
  File "/Users/frances/.pyenv/versions/3.10.9/envs/3.10/lib/python3.10/site-packages/pandas/core/reshape/concat.py", line 445, in __init__
    objs, keys = self._clean_keys_and_objs(objs, keys)
  File "/Users/frances/.pyenv/versions/3.10.9/envs/3.10/lib/python3.10/site-packages/pandas/core/reshape/concat.py", line 504, in _clean_keys_and_objs
    objs_list = list(objs)
  File "/Users/frances/.pyenv/versions/3.10.9/envs/3.10/lib/python3.10/site-packages/tqdm/std.py", line 1181, in __iter__
    for obj in iterable:
  File "/Users/frances/Documents/SDGym/sdgym/benchmark.py", line 468, in _run_job
    scores = _format_output(
  File "/Users/frances/Documents/SDGym/sdgym/benchmark.py", line 394, in _format_output
    scores.insert(len(scores.columns), score['metric'], score['normalized_score'])
KeyError: 'normalized_score'
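
The failing line in the traceback can be illustrated in isolation. The sketch below is not SDGym code: the shape of the score entries, the metric names, and both helper functions are assumptions made for illustration. It only shows that indexing score['normalized_score'] raises when an entry lacks that key, and that falling back to NaN is one possible way to keep the row.

```python
import numpy as np
import pandas as pd

# Hypothetical score entries (not SDGym's real structure). The second one
# mimics a metric that was interrupted and never got a 'normalized_score'.
metric_scores = [
    {'metric': 'MetricA', 'normalized_score': 0.9},
    {'metric': 'MetricB', 'error': 'Timeout'},
]


def insert_scores_strict(scores, metric_scores):
    """Mirror the call from the traceback: index the key directly."""
    for score in metric_scores:
        scores.insert(len(scores.columns), score['metric'], score['normalized_score'])
    return scores


def insert_scores_tolerant(scores, metric_scores):
    """Assumed alternative: fall back to NaN when the key is missing."""
    for score in metric_scores:
        value = score.get('normalized_score', np.nan)
        scores.insert(len(scores.columns), score['metric'], value)
    return scores


base = pd.DataFrame({
    'Synthesizer': ['GaussianCopulaSynthesizer'],
    'Dataset': ['asia'],
})

strict_error = None
try:
    insert_scores_strict(base.copy(), metric_scores)
except KeyError as error:
    # Keep a reference; the 'error' name is cleared after the except block.
    strict_error = error
    print(f'KeyError: {error}')

tolerant = insert_scores_tolerant(base.copy(), metric_scores)
print(tolerant)
```

Whether a missing score should become NaN, be skipped, or be reported as a timeout for that job is a design decision for the maintainers; the sketch only shows that the lookup itself needs a guard.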

Steps to reproduce

(This may be machine dependent, since the timeout has to fire while the scores for a dataset are being computed.)

import sdgym

sdgym.benchmark_single_table(synthesizers=['GaussianCopulaSynthesizer'], timeout=30)
frances-h added the bug label on Aug 7, 2024