Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enforcing a 1 to 1 matching in names between Ludwig datasets and AutoGluon paper #2666

Merged
merged 2 commits into from
Oct 18, 2022

Conversation

abidwael
Copy link
Contributor

This PR

  • Modifies some of the YAMLs used to load benchmarking datasets from the AutoGluon paper. For the datasets where they submit to an online competition, they have a competition split (example). However, the scores reported in the paper are on the test split. We change the Ludwig test split for those datasets to the right file used to report scores in the paper.
  • Adds the dataset news_channel (channel in the paper).
  • Adds a few YAML files that allow us to use the same names in their work.

@abidwael abidwael requested a review from dantreiman October 18, 2022 08:34
@github-actions
Copy link

github-actions bot commented Oct 18, 2022

Unit Test Results

         6 files  ±0           6 suites  ±0   3h 8m 20s ⏱️ - 15m 8s
  3 467 tests ±0    3 345 ✔️ ±0    77 💤 ±0  45 ±0 
10 401 runs  ±0  10 102 ✔️ ±0  254 💤 ±0  45 ±0 

For more details on these failures, see this check.

Results for commit 52255ee. ± Comparison against base commit c2fb702.

♻️ This comment has been updated with latest results.

Copy link
Contributor

@dantreiman dantreiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Also, thanks for checking the test splits against the ones used in the paper, good catch there!

@@ -0,0 +1,16 @@
version: 1.0
name: news_channel
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the same as news_popularity - please double-check and remove the duplicate if so.

I agree with your choice to make naming match the paper, so probably news_popularity.yaml should be replaced by this one news_channel.yaml

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, merged before seeing this comment. Here's another PR

@abidwael abidwael merged commit 42db9a7 into master Oct 18, 2022
@abidwael abidwael deleted the dataset-loading-debug branch October 18, 2022 20:12
@abidwael abidwael restored the dataset-loading-debug branch October 18, 2022 20:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants