
Report minimal evaluation results in run printout #1144

Closed
joaquinvanschoren opened this issue Jun 24, 2022 · 1 comment
Labels: Good First Issue, Run (OpenML concept)

Comments

joaquinvanschoren (Contributor) commented Jun 24, 2022

Description

When calling print(run), the output contains only basic info such as the task type ID, with no aggregated evaluation information. It would be nice if that could be added to the printout.

Alternatively, add a function that returns a dictionary with the local evaluation results. Currently, if a run is executed locally (rather than downloaded from OpenML), run.evaluations is None; run.fold_evaluations does contain accuracy and runtime results, but in a very inconvenient format (see the sketch below).
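
For illustration, run.fold_evaluations is a nested mapping of metric name -> repeat index -> fold index -> value, so even reading a single score means walking two levels of dictionaries. A minimal sketch that flattens it, given a run object like the one created in the reproduction below:

# fold_evaluations nests metric -> repeat -> fold -> value.
for metric, repeats in run.fold_evaluations.items():
    for repeat, folds in repeats.items():
        for fold, value in folds.items():
            print("{} (repeat {}, fold {}): {}".format(metric, repeat, fold, value))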

Steps/Code to Reproduce

from sklearn import ensemble
from openml import tasks, runs

clf = ensemble.RandomForestClassifier()
task = tasks.get_task(3954)
run = runs.run_model_on_task(clf, task, avoid_duplicate_runs=False)
print(run)

Expected Results

Some basic evaluation info, e.g. similar to what the R API returns:

OpenML Run NA :: (Task ID = 3954, Flow ID = NA)

$bmr
         task.id            learner.id  acc.test.join  timetrain.test.sum
1 MagicTelescope  classif.randomForest      0.8831756             150.753

Actual Results

Only some basic info:

OpenML Run
==========
Uploader Name: None
Metric.......: None
Run ID.......: None
Task ID......: 3954
Task Type....: None
Task URL.....: https://www.openml.org/t/3954
Flow ID......: None
Flow Name....: sklearn.ensemble._forest.RandomForestClassifier
Flow URL.....: https://www.openml.org/f/None
Setup ID.....: None
Setup String.: Python_3.7.13. Sklearn_1.0.2. NumPy_1.21.6. SciPy_1.4.1.
Dataset ID...: 1120
Dataset URL..: https://www.openml.org/d/1120

Partial solution

This returns the accuracy and runtime for a run (even if it was only run locally):

import numpy as np

def summary(metric):
    # Per-fold values for repeat 0 of the given metric.
    values = list(run.fold_evaluations[metric][0].values())
    return "{:.4f} +- {:.4f}".format(np.mean(values), np.std(values))

print("Accuracy:  ", summary('predictive_accuracy'))
print("Time (ms): ", summary('usercpu_time_millis'))
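
Generalizing the snippet above, the dictionary-returning function suggested in the description might look like this (local_evaluations is a hypothetical name, not an existing openml-python API):

import numpy as np

def local_evaluations(run):
    # Aggregate every locally computed metric across all repeats and
    # folds into a single {metric: "mean +- std"} dictionary.
    results = {}
    for metric, repeats in run.fold_evaluations.items():
        values = [v for folds in repeats.values() for v in folds.values()]
        results[metric] = "{:.4f} +- {:.4f}".format(np.mean(values), np.std(values))
    return results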

Versions

Linux-5.4.188+-x86_64-with-Ubuntu-18.04-bionic
Python 3.7.13 (default, Apr 24 2022, 01:04:09)
[GCC 7.5.0]
NumPy 1.21.6
SciPy 1.4.1
Scikit-Learn 1.0.2
OpenML 0.12.2

joaquinvanschoren added the Good First Issue label on Jun 24, 2022
mfeurer added the Run (OpenML concept) label on Feb 20, 2023
mfeurer (Collaborator) commented Mar 1, 2023

This was added with #1214 and will be available in the next release.

mfeurer closed this as completed Mar 1, 2023