Add example #791
Conversation
* adds example for Feurer et al. (2015)
* removes the stub for Fusi et al. (2018) as they actually perform the same task. I can't create an example, though, as they used regression datasets for classification (and OpenML by now forbids creating such tasks).
Codecov Report
@@             Coverage Diff             @@
##           develop     #791      +/-   ##
===========================================
- Coverage    87.71%   87.68%   -0.03%
===========================================
  Files           36       36
  Lines         4208     4248      +40
===========================================
+ Hits          3691     3725      +34
- Misses         517      523       +6
Continue to review full report at Codecov.
]

####################################################################################################
# The dataset IDs could be used directly to load the dataset and split the data into a training
Are you sure you want to start with the dataset IDs, rather than the task IDs?
If the answer is yes, this clearly signals that we do not have any good procedure for "getting tasks that belong to a given set of datasets". We should either extend the API to support this better, provide the functions below as convenience functions, or do both.
The reasoning here is to stay close to the Auto-sklearn paper, where only dataset IDs are given. What kind of convenience function would you like to have? Something like:
from typing import List

def get_tasks_for_dataset(
    dataset_id: int,
    task_type_id: int,
    estimation_procedure: str,
    status: str,
    check_target_attribute: bool,
) -> List[int]:
    pass
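For concreteness, here is one way such a helper could be sketched: filter the dictionary returned by `openml.tasks.list_tasks` (which is keyed by task ID) down to the tasks that run on a given dataset. The field names `did` and `estimation_procedure` follow the usual `list_tasks` output, but this is an illustration of the proposal, not an existing library API:

```python
from typing import Dict, List


def get_tasks_for_dataset(
    tasks: Dict[int, dict],
    dataset_id: int,
    estimation_procedure: str,
) -> List[int]:
    # Hypothetical helper: `tasks` is expected to look like the output of
    # openml.tasks.list_tasks (task ID -> dict of task properties).
    return sorted(
        task_id
        for task_id, task in tasks.items()
        if task.get("did") == dataset_id
        and task.get("estimation_procedure") == estimation_procedure
    )


# Usage with a toy stand-in for list_tasks output:
toy_tasks = {
    1: {"did": 3, "estimation_procedure": "10-fold Crossvalidation"},
    2: {"did": 3, "estimation_procedure": "33% Holdout set"},
    3: {"did": 6, "estimation_procedure": "10-fold Crossvalidation"},
}
matching = get_tasks_for_dataset(toy_tasks, 3, "10-fold Crossvalidation")
```

In the real version one would first call `list_tasks` with the desired `task_type_id` and `status`, then apply a filter like this on the result.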
I will also make the note more drastic.
# deactivated tasks
tasks_d = openml.tasks.list_tasks(
    task_type_id=1,
    status='deactivated',
why not search for status "all" ?
Lack of knowledge, I'll update the example.
task_ids.sort()

# These are the tasks to work with:
print(task_ids)
logging.info?
I think print is fine for examples. logging is only important for the library itself to make the amount of output controllable.
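For what it's worth, the difference can be sketched in a few lines: logging lets the consumer redirect or silence output, which `print` cannot. This is a stand-alone illustration, not code from this PR:

```python
import io
import logging

# Route log records into an in-memory buffer; a caller could equally raise
# the level to logging.WARNING to silence informational output.
# (force=True, which resets previously installed handlers, needs Python 3.8+.)
buffer = io.StringIO()
logging.basicConfig(stream=buffer, level=logging.INFO, force=True)

task_ids = [3, 6, 11]
logging.info("These are the tasks to work with: %s", task_ids)
```

With `print`, the output would go unconditionally to stdout; here the consumer owns both destination and verbosity.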
Looks good, I could not find anything big.
Thanks for the review, I hope I could address your comments.
@@ -10,9 +10,80 @@
~~~~~~~~~~~

| Efficient and Robust Automated Machine Learning
| Matthias Feurer, Aaron Klein, Katharina Eggensperger, Jost Springenberg, Manuel Blum and Frank Hutter
| Matthias Feurer, Aaron Klein, Katharina Eggensperger, Jost Springenberg, Manuel Blum and Frank Hutter # noqa F401
@mfeurer wrong usage of `# noqa F401` in the text? It is not interpreted as a comment there.
Maybe you meant it here:
import pandas as pd
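For reference, flake8 only honours `# noqa: F401` when it trails the import statement itself; inside rST prose it is rendered as literal text. A minimal stand-alone illustration using a stdlib module in place of pandas:

```python
# "# noqa: F401" suppresses flake8's "imported but unused" warning, but only
# when it sits on the import line itself; in surrounding text it does nothing.
import json  # noqa: F401

# The import otherwise behaves normally:
data = json.loads("[1, 2]")
```

So in this example the marker belongs on the `import pandas as pd` line, not in the docstring.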
# The dataset IDs could be used directly to load the dataset and split the data into a training |
*training set
# It is discouraged to work directly on datasets and only provide dataset IDs in a paper as
# this does not allow reproducibility (unclear splitting). Please do not use datasets but the
# respective tasks as basis for a paper and publish task IDS. This example is only given to
# showcase the use OpenML-Python for a published paper and as a warning on how not to do it.
the use of OpenML-Python*