diff --git a/README.md b/README.md index 2380aff..7e66b99 100644 --- a/README.md +++ b/README.md @@ -25,8 +25,8 @@ pip install -e DEHB # -e stands for editable, lets you modify the code and reru To run PyTorch example: (*note additional requirements*) ```bash python examples/03_pytorch_mnist_hpo.py \ - --min_budget 1 \ - --max_budget 3 \ + --min_fidelity 1 \ + --max_fidelity 3 \ --runtime 60 \ --verbose ``` @@ -62,8 +62,8 @@ to it by that DEHB run. To run the PyTorch MNIST example on a single node using 2 workers: ```bash python examples/03_pytorch_mnist_hpo.py \ - --min_budget 1 \ - --max_budget 3 \ + --min_fidelity 1 \ + --max_fidelity 3 \ --runtime 60 \ --n_workers 2 \ --single_node_with_gpus \ @@ -96,8 +96,8 @@ bash utils/run_dask_setup.sh \ # Make sure to sleep to allow the workers to setup properly sleep 5 python examples/03_pytorch_mnist_hpo.py \ - --min_budget 1 \ - --max_budget 3 \ + --min_fidelity 1 \ + --max_fidelity 3 \ --runtime 60 \ --scheduler_file dask_dump/scheduler.json \ --verbose @@ -111,9 +111,9 @@ and were found to be *generally* useful across all cases tested. However, the parameters are still available for tuning to a specific problem. The Hyperband components: -* *min\_budget*: Needs to be specified for every DEHB instantiation and is used in determining -the budget spacing for the problem at hand. -* *max\_budget*: Needs to be specified for every DEHB instantiation. Represents the full-budget +* *min\_fidelity*: Needs to be specified for every DEHB instantiation and is used in determining +the fidelity spacing for the problem at hand. +* *max\_fidelity*: Needs to be specified for every DEHB instantiation. Represents the full-fidelity evaluation or the actual black-box setting. 
* *eta*: (default=3) Sets the aggressiveness of Hyperband's aggressive early stopping by retaining 1/eta configurations every round diff --git a/docs/index.md b/docs/index.md index 297e931..c5d94f1 100644 --- a/docs/index.md +++ b/docs/index.md @@ -42,16 +42,16 @@ Next, we need an `object_function`, which we are aiming to optimize: ```python exec="true" source="material-block" result="python" title="Configuration Space" session="someid" import numpy as np -def objective_function(x: Configuration, budget: float, **kwargs): +def objective_function(x: Configuration, fidelity: float, **kwargs): # Replace this with your actual objective value (y) and cost. - cost = (10 if x["x1"] == "red" else 100) + budget + cost = (10 if x["x1"] == "red" else 100) + fidelity y = x["x0"] + np.random.uniform() return {"fitness": y, "cost": x["x0"]} sample_config = cs.sample_configuration() print(sample_config) -result = objective_function(sample_config, budget=10) +result = objective_function(sample_config, fidelity=10) print(result) ``` @@ -65,8 +65,8 @@ optimizer = DEHB( f=objective_function, cs=cs, dimensions=dim, - min_budget=3, - max_budget=27, + min_fidelity=3, + max_fidelity=27, eta=3, n_workers=1, output_path="./logs", @@ -74,11 +74,11 @@ optimizer = DEHB( # Run optimization for 1 bracket. Output files will be saved to ./logs traj, runtime, history = optimizer.run(brackets=1, verbose=True) -config, fitness, runtime, budget, _ = history[0] +config, fitness, runtime, fidelity, _ = history[0] print("config", config) print("fitness", fitness) print("runtime", runtime) -print("budget", budget) +print("fidelity", fidelity) ``` ### Running DEHB in a parallel setting @@ -112,8 +112,8 @@ to it by that DEHB run. 
To run the PyTorch MNIST example on a single node using 2 workers: ```bash python examples/03_pytorch_mnist_hpo.py \ - --min_budget 1 \ - --max_budget 3 \ + --min_fidelity 1 \ + --max_fidelity 3 \ --runtime 60 \ --n_workers 2 \ --single_node_with_gpus \ @@ -147,8 +147,8 @@ bash utils/run_dask_setup.sh \ sleep 5 python examples/03_pytorch_mnist_hpo.py \ - --min_budget 1 \ - --max_budget 3 \ + --min_fidelity 1 \ + --max_fidelity 3 \ --runtime 60 \ --scheduler_file dask_dump/scheduler.json \ --verbose diff --git a/examples/00_interfacing_DEHB.ipynb b/examples/00_interfacing_DEHB.ipynb index 350087c..807b124 100644 --- a/examples/00_interfacing_DEHB.ipynb +++ b/examples/00_interfacing_DEHB.ipynb @@ -40,7 +40,7 @@ "\n", "DEHB also uses Hyperband along with DE, to allow for cheaper approximations of the actual evaluations of $x$. Let $f(x)$ be the validation error of training a multilayer perceptron (MLP) on the complete training set. Multi-fidelity algorithms such as Hyperband, allow for cheaper approximations along a possible *fidelity*. For the MLP, a subset of the dataset maybe a cheaper approximation to the full data set evaluation. Whereas the fidelity can be quantifies as the fraction of the dataset used to evaluate the configuration $x$, instead of the full dataset. Such approximations can allow sneak-peek into the black-box, potentially revealing certain landscape feature of *f(x)*, thus rendering it a *gray*-box and not completely opaque and black! \n", "\n", - "The $z$ parameter is the fidelity parameter to the black-box function. If $z \\in [budget_{min}, budget_{max}]$, then $f(x, budget_{max})$ would be equivalent to the black-box case of $f(x)$.\n", + "The $z$ parameter is the fidelity parameter to the black-box function. 
If $z \\in [fidelity_{min}, fidelity_{max}]$, then $f(x, fidelity_{max})$ would be equivalent to the black-box case of $f(x)$.\n", "\n", "![boxes](imgs/black-gray-box.png)" ] @@ -62,7 +62,7 @@ "source": [ "def target_function(\n", " x: Union[ConfigSpace.Configuration, List, np.array], \n", - " budget: Union[int, float] = None,\n", + " fidelity: Union[int, float] = None,\n", " **kwargs\n", ") -> Dict:\n", " \"\"\" Target/objective function to optimize\n", @@ -70,7 +70,7 @@ " Parameters\n", " ----------\n", " x : configuration that DEHB wants to evaluate\n", - " budget : parameter determining cheaper evaluations\n", + " fidelity : parameter determining cheaper evaluations\n", " \n", " Returns\n", " -------\n", @@ -83,7 +83,7 @@ " # remove the code snippet below\n", " start = time.time()\n", " y = np.random.uniform() # placeholder response of evaluation\n", - " time.sleep(budget) # simulates runtime (mostly proportional to fidelity)\n", + " time.sleep(fidelity) # simulates runtime (mostly proportional to fidelity)\n", " cost = time.time() - start\n", " \n", " # result dict passed to DE/DEHB as function evaluation output\n", @@ -171,8 +171,9 @@ { "data": { "text/plain": [ - "Configuration:\n", - " x0, Value: 3.716302229868112" + "Configuration(values={\n", + " 'x0': 8.107160631154175,\n", + "})" ] }, "execution_count": 5, @@ -198,7 +199,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Defining fidelity/budget range for the target function" + "### Defining fidelity range for the target function" ] }, { @@ -207,7 +208,7 @@ "metadata": {}, "outputs": [], "source": [ - "min_budget, max_budget = (0.1, 3) " + "min_fidelity, max_fidelity = (0.1, 3) " ] }, { @@ -244,8 +245,8 @@ " f=target_function,\n", " dimensions=dimensions,\n", " cs=cs,\n", - " min_budget=min_budget,\n", - " max_budget=max_budget,\n", + " min_fidelity=min_fidelity,\n", + " max_fidelity=max_fidelity,\n", " output_path=\"./temp\",\n", " n_workers=1 # set to >1 to utilize parallel workers\n", 
")\n", @@ -281,9 +282,9 @@ "name": "stdout", "output_type": "stream", "text": [ - "Configuration:\n", - " x0, Value: 4.060258498267547\n", - "\n" + "Configuration(values={\n", + " 'x0': 4.152073449922892,\n", + "})\n" ] } ], @@ -308,14 +309,14 @@ "name": "stdout", "output_type": "stream", "text": [ - "2021-10-22 14:45:56.117 | INFO | dehb.optimizers.dehb:reset:107 - \n", - "\n", - "RESET at 10/22/21 14:45:56 CEST\n", + "\u001b[32m2023-10-22 20:03:06.057\u001b[0m | \u001b[1mINFO \u001b[0m | \u001b[36mdehb.optimizers.dehb\u001b[0m:\u001b[36mreset\u001b[0m:\u001b[36m121\u001b[0m - \u001b[1m\n", "\n", + "RESET at 10/22/23 20:03:06 CEST\n", "\n", - "(Configuration:\n", - " x0, Value: 3.724555206841792\n", - ", 0.0938589687572785)\n" + "\u001b[0m\n", + "(Configuration(values={\n", + " 'x0': 8.96840263375364,\n", + "}), 0.05819975786653586)\n" ] } ], @@ -343,14 +344,14 @@ "name": "stdout", "output_type": "stream", "text": [ - "2021-10-22 14:45:58.567 | INFO | dehb.optimizers.dehb:reset:107 - \n", - "\n", - "RESET at 10/22/21 14:45:58 CEST\n", + "\u001b[32m2023-10-22 20:03:11.073\u001b[0m | \u001b[1mINFO \u001b[0m | \u001b[36mdehb.optimizers.dehb\u001b[0m:\u001b[36mreset\u001b[0m:\u001b[36m121\u001b[0m - \u001b[1m\n", "\n", + "RESET at 10/22/23 20:03:11 CEST\n", "\n", - "(Configuration:\n", - " x0, Value: 4.341818535733585\n", - ", 3.653636256717441e-05)\n" + "\u001b[0m\n", + "(Configuration(values={\n", + " 'x0': 8.708444163420975,\n", + "}), 0.0710929937087792)\n" ] } ], @@ -381,9 +382,9 @@ "name": "stdout", "output_type": "stream", "text": [ - "(Configuration:\n", - " x0, Value: 4.610766436763522\n", - ", 0.007774399252232556)\n" + "(Configuration(values={\n", + " 'x0': 8.454086817115218,\n", + "}), 0.016305791635409683)\n" ] } ], @@ -392,8 +393,8 @@ " f=target_function,\n", " dimensions=dimensions,\n", " cs=cs,\n", - " min_budget=min_budget,\n", - " max_budget=max_budget,\n", + " min_fidelity=min_fidelity,\n", + " max_fidelity=max_fidelity,\n", " 
output_path=\"./temp\",\n", " n_workers=2\n", ")\n", @@ -413,9 +414,9 @@ "name": "stdout", "output_type": "stream", "text": [ - "Configuration:\n", - " x0, Value: 4.610766436763522\n", - "\n" + "Configuration(values={\n", + " 'x0': 8.454086817115218,\n", + "})\n" ] } ], @@ -432,10 +433,10 @@ "name": "stdout", "output_type": "stream", "text": [ - "0.007774399252232556 0.007774399252232556\n", - "Configuration:\n", - " x0, Value: 4.610766436763522\n", - "\n" + "0.016305791635409683 0.016305791635409683\n", + "Configuration(values={\n", + " 'x0': 8.454086817115218,\n", + "})\n" ] } ], @@ -454,7 +455,7 @@ "\n", "As detailed above, the problem definition needs to be input to DEHB as the following information:\n", "* the *target_function* (`f`) that is the primary black-box function to optimize\n", - "* the fidelity range of `min_budget` and `max_budget` that allows the cheaper, faster gray-box optimization of `f`\n", + "* the fidelity range of `min_fidelity` and `max_fidelity` that allows the cheaper, faster gray-box optimization of `f`\n", "* the search space or the input domain of the function `f`, that can be represented as a `ConfigSpace` object and passed to DEHB at initialization\n", "\n", "\n", @@ -465,13 +466,20 @@ "\n", "DEHB will terminate once its chosen runtime budget is exhausted, and report the incumbent found. DEHB, as an *anytime* algorithm, constantly writes to disk a lightweight `json` file with the best found configuration and its score seen till that point." 
] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] } ], "metadata": { "kernelspec": { - "display_name": "dask", + "display_name": "Python 3 (ipykernel)", "language": "python", - "name": "dask" + "name": "python3" }, "language_info": { "codemirror_mode": { @@ -483,7 +491,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.6.9" + "version": "3.9.16" } }, "nbformat": 4, diff --git a/examples/01_Optimizing_RandomForest_using_DEHB.ipynb b/examples/01_Optimizing_RandomForest_using_DEHB.ipynb index c35427b..e5bd359 100644 --- a/examples/01_Optimizing_RandomForest_using_DEHB.ipynb +++ b/examples/01_Optimizing_RandomForest_using_DEHB.ipynb @@ -37,7 +37,7 @@ "* `min_samples_split`\n", "* `max_features`\n", "* `min_samples_leaf`\n", - "while the `n_estimators` hyperparameter to the Random Forest is chosen to be a fidelity parameter instead. Lesser number of trees ($<10$) in the Random Forest may not allow adequate ensembling for the grouped prediction to be significantly better than the individual tree predictions. Whereas a large number of trees (~$100$) often give accurate predictions but is naturally slower to train and predict on account of more trees to train. Therefore, a smaller `n_estimators` can be used as a cheaper approximation of the actual budget of `n_estimators=100`." + "while the `n_estimators` hyperparameter to the Random Forest is chosen to be a fidelity parameter instead. Lesser number of trees ($<10$) in the Random Forest may not allow adequate ensembling for the grouped prediction to be significantly better than the individual tree predictions. Whereas a large number of trees (~$100$) often give accurate predictions but is naturally slower to train and predict on account of more trees to train. Therefore, a smaller `n_estimators` can be used as a cheaper approximation of the actual fidelity of `n_estimators=100`." 
] }, { @@ -53,7 +53,7 @@ "metadata": {}, "outputs": [], "source": [ - "min_budget, max_budget = 2, 50" + "min_fidelity, max_fidelity = 2, 50" ] }, { @@ -147,7 +147,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Now the primary black/gray-box interface to the Random Forest model needs to be built for DEHB to query. As given in the `00_interfacing_DEHB` notebook, this function will have a signature akin to: `target_function(config, budget)`, and return a two-element tuple of the `score` and `cost`. It must be noted that DEHB **minimizes** and therefore the `score` being returned by this `target_function` should account for it." + "Now the primary black/gray-box interface to the Random Forest model needs to be built for DEHB to query. As given in the `00_interfacing_DEHB` notebook, this function will have a signature akin to: `target_function(config, fidelity)`, and return a two-element tuple of the `score` and `cost`. It must be noted that DEHB **minimizes** and therefore the `score` being returned by this `target_function` should account for it." 
] }, { @@ -273,23 +273,23 @@ "metadata": {}, "outputs": [], "source": [ - "def target_function(config, budget, **kwargs):\n", + "def target_function(config, fidelity, **kwargs):\n", " # Extracting support information\n", " seed = kwargs[\"seed\"]\n", " train_X = kwargs[\"train_X\"]\n", " train_y = kwargs[\"train_y\"]\n", " valid_X = kwargs[\"valid_X\"]\n", " valid_y = kwargs[\"valid_y\"]\n", - " max_budget = kwargs[\"max_budget\"]\n", + " max_fidelity = kwargs[\"max_fidelity\"]\n", " \n", - " if budget is None:\n", - " budget = max_budget\n", + " if fidelity is None:\n", + " fidelity = max_fidelity\n", " \n", " start = time.time()\n", " # Building model \n", " model = RandomForestClassifier(\n", " **config.get_dictionary(),\n", - " n_estimators=int(budget),\n", + " n_estimators=int(fidelity),\n", " bootstrap=True,\n", " random_state=seed,\n", " )\n", @@ -308,7 +308,7 @@ " \"cost\": cost,\n", " \"info\": {\n", " \"test_score\": test_accuracy,\n", - " \"budget\": budget\n", + " \"fidelity\": fidelity\n", " }\n", " }\n", " return result" @@ -340,8 +340,8 @@ " f=target_function, \n", " cs=cs, \n", " dimensions=dimensions, \n", - " min_budget=min_budget, \n", - " max_budget=max_budget,\n", + " min_fidelity=min_fidelity, \n", + " max_fidelity=max_fidelity,\n", " n_workers=1,\n", " output_path=\"./temp\"\n", ")" @@ -363,7 +363,7 @@ " train_y=train_y,\n", " valid_X=valid_X,\n", " valid_y=valid_y,\n", - " max_budget=dehb.max_budget\n", + " max_fidelity=dehb.max_fidelity\n", ")" ] }, @@ -376,17 +376,16 @@ "name": "stdout", "output_type": "stream", "text": [ - "473 473 473\n", + "454 454 454\n", "\n", "Last evaluated configuration, \n", "Configuration(values={\n", - " 'max_depth': 7,\n", - " 'max_features': 0.669059250229961,\n", - " 'min_samples_leaf': 2,\n", - " 'min_samples_split': 3,\n", - "})\n", - "got a score of -1.0, was evaluated at a budget of 50.00 and took 0.048 seconds to run.\n", - "The additional info attached: {'test_score': 1.0, 'budget': 50.0}\n" + " 
'max_depth': 6,\n", + " 'max_features': 0.6215565437234547,\n", + " 'min_samples_leaf': 8,\n", + " 'min_samples_split': 4,\n", + "})got a score of -1.0, was evaluated at a fidelity of 16.67 and took 0.016 seconds to run.\n", + "The additional info attached: {'test_score': 1.0, 'fidelity': 16.666666666666664}\n" ] } ], @@ -395,12 +394,12 @@ "\n", "# Last recorded function evaluation\n", "last_eval = history[-1]\n", - "config, score, cost, budget, _info = last_eval\n", + "config, score, cost, fidelity, _info = last_eval\n", "\n", "print(\"Last evaluated configuration, \")\n", "print(dehb.vector_to_configspace(config), end=\"\")\n", - "print(\"got a score of {}, was evaluated at a budget of {:.2f} and \"\n", - " \"took {:.3f} seconds to run.\".format(score, budget, cost))\n", + "print(\"got a score of {}, was evaluated at a fidelity of {:.2f} and \"\n", + " \"took {:.3f} seconds to run.\".format(score, fidelity, cost))\n", "print(\"The additional info attached: {}\".format(_info))" ] }, @@ -420,29 +419,29 @@ "name": "stdout", "output_type": "stream", "text": [ - "\u001b[32m2023-06-22 12:00:41.016\u001b[0m | \u001b[1mINFO \u001b[0m | \u001b[36mdehb.optimizers.dehb\u001b[0m:\u001b[36mreset\u001b[0m:\u001b[36m107\u001b[0m - \u001b[1m\n", + "\u001b[32m2023-10-22 20:04:30.731\u001b[0m | \u001b[1mINFO \u001b[0m | \u001b[36mdehb.optimizers.dehb\u001b[0m:\u001b[36mreset\u001b[0m:\u001b[36m121\u001b[0m - \u001b[1m\n", "\n", - "RESET at 06/22/23 12:00:40 CEST\n", + "RESET at 10/22/23 20:04:30 CEST\n", "\n", "\u001b[0m\n", - "\u001b[32m2023-06-22 12:00:51.085\u001b[0m | \u001b[1mINFO \u001b[0m | \u001b[36mdehb.optimizers.dehb\u001b[0m:\u001b[36mreset\u001b[0m:\u001b[36m107\u001b[0m - \u001b[1m\n", + "\u001b[32m2023-10-22 20:04:41.051\u001b[0m | \u001b[1mINFO \u001b[0m | \u001b[36mdehb.optimizers.dehb\u001b[0m:\u001b[36mreset\u001b[0m:\u001b[36m121\u001b[0m - \u001b[1m\n", "\n", - "RESET at 06/22/23 12:00:51 CEST\n", + "RESET at 10/22/23 20:04:41 CEST\n", "\n", "\u001b[0m\n", - 
"\u001b[32m2023-06-22 12:01:01.182\u001b[0m | \u001b[1mINFO \u001b[0m | \u001b[36mdehb.optimizers.dehb\u001b[0m:\u001b[36mreset\u001b[0m:\u001b[36m107\u001b[0m - \u001b[1m\n", + "\u001b[32m2023-10-22 20:04:51.128\u001b[0m | \u001b[1mINFO \u001b[0m | \u001b[36mdehb.optimizers.dehb\u001b[0m:\u001b[36mreset\u001b[0m:\u001b[36m121\u001b[0m - \u001b[1m\n", "\n", - "RESET at 06/22/23 12:01:01 CEST\n", + "RESET at 10/22/23 20:04:51 CEST\n", "\n", "\u001b[0m\n", - "\u001b[32m2023-06-22 12:01:11.238\u001b[0m | \u001b[1mINFO \u001b[0m | \u001b[36mdehb.optimizers.dehb\u001b[0m:\u001b[36mreset\u001b[0m:\u001b[36m107\u001b[0m - \u001b[1m\n", + "\u001b[32m2023-10-22 20:05:01.200\u001b[0m | \u001b[1mINFO \u001b[0m | \u001b[36mdehb.optimizers.dehb\u001b[0m:\u001b[36mreset\u001b[0m:\u001b[36m121\u001b[0m - \u001b[1m\n", "\n", - "RESET at 06/22/23 12:01:11 CEST\n", + "RESET at 10/22/23 20:05:01 CEST\n", "\n", "\u001b[0m\n", - "\u001b[32m2023-06-22 12:01:21.293\u001b[0m | \u001b[1mINFO \u001b[0m | \u001b[36mdehb.optimizers.dehb\u001b[0m:\u001b[36mreset\u001b[0m:\u001b[36m107\u001b[0m - \u001b[1m\n", + "\u001b[32m2023-10-22 20:05:11.273\u001b[0m | \u001b[1mINFO \u001b[0m | \u001b[36mdehb.optimizers.dehb\u001b[0m:\u001b[36mreset\u001b[0m:\u001b[36m121\u001b[0m - \u001b[1m\n", "\n", - "RESET at 06/22/23 12:01:21 CEST\n", + "RESET at 10/22/23 20:05:11 CEST\n", "\n", "\u001b[0m\n" ] @@ -466,14 +465,14 @@ " train_y=train_y,\n", " valid_X=valid_X,\n", " valid_y=valid_y,\n", - " max_budget=dehb.max_budget\n", + " max_fidelity=dehb.max_fidelity\n", " )\n", " best_config = dehb.vector_to_configspace(dehb.inc_config)\n", " \n", " # Creating a model using the best configuration found\n", " model = RandomForestClassifier(\n", " **best_config.get_dictionary(),\n", - " n_estimators=int(max_budget),\n", + " n_estimators=int(max_fidelity),\n", " bootstrap=True,\n", " random_state=seed,\n", " )\n", @@ -516,44 +515,39 @@ "output_type": "stream", "text": [ "Configuration(values={\n", - " 'max_depth': 
13,\n", - " 'max_features': 0.5412753369058052,\n", - " 'min_samples_leaf': 12,\n", - " 'min_samples_split': 14,\n", - "})\n", - " got an accuracy of 1.0 on the test set.\n", - "\n", - "Configuration(values={\n", - " 'max_depth': 6,\n", - " 'max_features': 0.6764411582074702,\n", + " 'max_depth': 7,\n", + " 'max_features': 0.7162350418245509,\n", " 'min_samples_leaf': 1,\n", - " 'min_samples_split': 27,\n", - "})\n", - " got an accuracy of 1.0 on the test set.\n", + " 'min_samples_split': 18,\n", + "}) got an accuracy of 1.0 on the test set.\n", "\n", "Configuration(values={\n", - " 'max_depth': 5,\n", - " 'max_features': 0.5862915814751853,\n", + " 'max_depth': 11,\n", + " 'max_features': 0.564056444856198,\n", " 'min_samples_leaf': 2,\n", - " 'min_samples_split': 22,\n", - "})\n", - " got an accuracy of 1.0 on the test set.\n", + " 'min_samples_split': 7,\n", + "}) got an accuracy of 1.0 on the test set.\n", "\n", "Configuration(values={\n", - " 'max_depth': 14,\n", - " 'max_features': 0.5346143393392929,\n", - " 'min_samples_leaf': 5,\n", - " 'min_samples_split': 9,\n", - "})\n", - " got an accuracy of 1.0 on the test set.\n", + " 'max_depth': 9,\n", + " 'max_features': 0.7477652209361112,\n", + " 'min_samples_leaf': 1,\n", + " 'min_samples_split': 7,\n", + "}) got an accuracy of 1.0 on the test set.\n", "\n", "Configuration(values={\n", - " 'max_depth': 4,\n", - " 'max_features': 0.5541455312635835,\n", + " 'max_depth': 9,\n", + " 'max_features': 0.6510861760309854,\n", " 'min_samples_leaf': 4,\n", - " 'min_samples_split': 10,\n", - "})\n", - " got an accuracy of 1.0 on the test set.\n", + " 'min_samples_split': 24,\n", + "}) got an accuracy of 1.0 on the test set.\n", + "\n", + "Configuration(values={\n", + " 'max_depth': 6,\n", + " 'max_features': 0.5989756936409275,\n", + " 'min_samples_leaf': 2,\n", + " 'min_samples_split': 4,\n", + "}) got an accuracy of 1.0 on the test set.\n", "\n" ] } @@ -563,6 +557,13 @@ " print(\"{} got an accuracy of {} on the test 
set.\".format(config, score))\n", " print()" ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] } ], "metadata": { diff --git a/examples/02_using DEHB_without_ConfigSpace.ipynb b/examples/02_using DEHB_without_ConfigSpace.ipynb index 11264c6..79c987d 100644 --- a/examples/02_using DEHB_without_ConfigSpace.ipynb +++ b/examples/02_using DEHB_without_ConfigSpace.ipynb @@ -61,7 +61,7 @@ "dimensions = len(param_space)\n", "\n", "# Declaring the fidelity range\n", - "min_budget, max_budget = 2, 50\n", + "min_fidelity, max_fidelity = 2, 50\n", "\n", "\n", "def transform_space(param_space, configuration):\n", @@ -164,27 +164,27 @@ " return train_X, train_y, valid_X, valid_y, test_X, test_y, dataset\n", "\n", "\n", - "def target_function(config, budget, **kwargs):\n", + "def target_function(config, fidelity, **kwargs):\n", " # Extracting support information\n", " seed = kwargs[\"seed\"]\n", " train_X = kwargs[\"train_X\"]\n", " train_y = kwargs[\"train_y\"]\n", " valid_X = kwargs[\"valid_X\"]\n", " valid_y = kwargs[\"valid_y\"]\n", - " max_budget = kwargs[\"max_budget\"]\n", + " max_fidelity = kwargs[\"max_fidelity\"]\n", " \n", " # Mapping [0, 1]-vector to Sklearn parameters\n", " param_space = kwargs[\"param_space\"]\n", " config = transform_space(param_space, config)\n", " \n", - " if budget is None:\n", - " budget = max_budget\n", + " if fidelity is None:\n", + " fidelity = max_fidelity\n", " \n", " start = time.time()\n", " # Building model \n", " model = RandomForestClassifier(\n", " **config,\n", - " n_estimators=int(budget),\n", + " n_estimators=int(fidelity),\n", " bootstrap=True,\n", " random_state=seed,\n", " )\n", @@ -203,7 +203,7 @@ " \"cost\": cost,\n", " \"info\": {\n", " \"test_score\": test_accuracy,\n", - " \"budget\": budget\n", + " \"fidelity\": fidelity\n", " }\n", " }\n", " return result\n", @@ -238,8 +238,8 @@ "dehb = DEHB(\n", " f=target_function, \n", " dimensions=dimensions, \n", - " 
min_budget=min_budget, \n", - " max_budget=max_budget,\n", + " min_fidelity=min_fidelity, \n", + " max_fidelity=max_fidelity,\n", " n_workers=1,\n", " output_path=\"./temp\"\n", ")" @@ -260,7 +260,7 @@ " train_y=train_y,\n", " valid_X=valid_X,\n", " valid_y=valid_y,\n", - " max_budget=dehb.max_budget,\n", + " max_fidelity=dehb.max_fidelity,\n", " param_space=param_space\n", ")" ] @@ -274,9 +274,9 @@ "name": "stdout", "output_type": "stream", "text": [ - "Incumbent score: -0.9685185185185186\n", + "Incumbent score: -0.9611111111111111\n", "Incumbent configuration:\n", - "{'max_depth': 10, 'min_samples_split': 3, 'max_features': 0.24012458257841524, 'min_samples_leaf': 2}\n" + "{'max_depth': 9, 'min_samples_split': 3, 'max_features': 0.3990411414400532, 'min_samples_leaf': 1}\n" ] } ], @@ -301,14 +301,14 @@ "name": "stdout", "output_type": "stream", "text": [ - "Test accuracy: 1.0\n" + "Test accuracy: 0.9944444444444445\n" ] } ], "source": [ "model = RandomForestClassifier(\n", " **transform_space(param_space, dehb.inc_config),\n", - " n_estimators=int(max_budget),\n", + " n_estimators=int(max_fidelity),\n", " bootstrap=True,\n", " random_state=seed,\n", ")\n", @@ -334,14 +334,12 @@ "outputs": [ { "data": { - "image/png": "\n", + "image/png": "", "text/plain": [ - "
" + "
" ] }, - "metadata": { - "needs_background": "light" - }, + "metadata": {}, "output_type": "display_data" } ], @@ -356,9 +354,9 @@ ], "metadata": { "kernelspec": { - "display_name": "dask", + "display_name": "Python 3 (ipykernel)", "language": "python", - "name": "dask" + "name": "python3" }, "language_info": { "codemirror_mode": { @@ -370,7 +368,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.6.9" + "version": "3.9.16" } }, "nbformat": 4, diff --git a/examples/03_pytorch_mnist_hpo.py b/examples/03_pytorch_mnist_hpo.py index c7eef27..14fba45 100644 --- a/examples/03_pytorch_mnist_hpo.py +++ b/examples/03_pytorch_mnist_hpo.py @@ -6,7 +6,7 @@ this space can be passed to an object of class Model() which can instantiate a CNN architecture from it. The objective_function() is the target function that DEHB minimizes for this problem. This function instantiates an architecture, an optimizer, as defined by a configuration and performs the -training and evaluation (on the validation set) as per the budget passed. +training and evaluation (on the validation set) as per the fidelity passed. The argument `runtime` can be passed to DEHB as a wallclock budget for running the optimisation. 
This tutorial also briefly refers to the different methods of interfacing DEHB with the Dask @@ -167,7 +167,7 @@ def evaluate(model, device, data_loader, acc=False): return loss -def train_and_evaluate(config, max_budget, verbose=False, **kwargs): +def train_and_evaluate(config, max_fidelity, verbose=False, **kwargs): device = kwargs["device"] batch_size = config["batch_size"] train_set = kwargs["train_set"] @@ -176,7 +176,7 @@ def train_and_evaluate(config, max_budget, verbose=False, **kwargs): test_loader = torch.utils.data.DataLoader(test_set, batch_size=batch_size, shuffle=False) model = Model(config).to(device) optimizer = optim.Adadelta(model.parameters(), lr=config["lr"]) - for epoch in range(1, int(max_budget)+1): + for epoch in range(1, int(max_fidelity)+1): train(model, device, train_loader, optimizer) accuracy = evaluate(model, device, test_loader, acc=True) if verbose: @@ -184,7 +184,7 @@ def train_and_evaluate(config, max_budget, verbose=False, **kwargs): return accuracy -def objective_function(config, budget, **kwargs): +def objective_function(config, fidelity, **kwargs): """ The target function to minimize for HPO""" device = kwargs["device"] @@ -204,7 +204,7 @@ def objective_function(config, budget, **kwargs): optimizer = optim.Adadelta(model.parameters(), lr=config["lr"]) start = time.time() # measuring wallclock time - for epoch in range(1, int(budget)+1): + for epoch in range(1, int(fidelity)+1): train(model, device, train_loader, optimizer) loss = evaluate(model, device, valid_loader) cost = time.time() - start @@ -216,7 +216,7 @@ def objective_function(config, budget, **kwargs): res = { "fitness": loss, "cost": cost, - "info": {"test_loss": test_loss, "budget": budget} + "info": {"test_loss": test_loss, "fidelity": fidelity} } return res @@ -228,11 +228,11 @@ def input_arguments(): parser.add_argument('--seed', type=int, default=123, metavar='S', help='random seed (default: 123)') parser.add_argument('--refit_training', action='store_true', 
default=False, - help='Refit with incumbent configuration on full training data and budget') - parser.add_argument('--min_budget', type=float, default=None, - help='Minimum budget (epoch length)') - parser.add_argument('--max_budget', type=float, default=None, - help='Maximum budget (epoch length)') + help='Refit with incumbent configuration on full training data and fidelity') + parser.add_argument('--min_fidelity', type=float, default=None, + help='Minimum fidelity (epoch length)') + parser.add_argument('--max_fidelity', type=float, default=None, + help='Maximum fidelity (epoch length)') parser.add_argument('--eta', type=int, default=3, help='Parameter for Hyperband controlling early stopping aggressiveness') parser.add_argument('--output_path', type=str, default="./pytorch_mnist_dehb", @@ -250,7 +250,7 @@ def input_arguments(): parser.add_argument('--verbose', action="store_true", default=False, help='Decides verbosity of DEHB optimization') parser.add_argument('--runtime', type=float, default=300, - help='Total time in seconds as budget to run DEHB') + help='Total time in seconds as fidelity to run DEHB') args = parser.parse_args() return args @@ -300,8 +300,8 @@ def main(): # DEHB optimisation block # ########################### np.random.seed(args.seed) - dehb = DEHB(f=objective_function, cs=cs, dimensions=dimensions, min_budget=args.min_budget, - max_budget=args.max_budget, eta=args.eta, output_path=args.output_path, + dehb = DEHB(f=objective_function, cs=cs, dimensions=dimensions, min_fidelity=args.min_fidelity, + max_fidelity=args.max_fidelity, eta=args.eta, output_path=args.output_path, # if client is not None and of type Client, n_workers is ignored # if client is None, a Dask client with n_workers is set up client=client, n_workers=args.n_workers) @@ -325,7 +325,7 @@ def main(): root='./data', train=True, download=True, transform=transform ) incumbent = dehb.vector_to_configspace(dehb.inc_config) - acc = train_and_evaluate(incumbent, args.max_budget, 
verbose=True, + acc = train_and_evaluate(incumbent, args.max_fidelity, verbose=True, train_set=train_set, test_set=test_set, device=device) dehb.logger.info("Test accuracy of {:.3f} for the best found configuration: ".format(acc)) dehb.logger.info(incumbent) diff --git a/src/dehb/optimizers/de.py b/src/dehb/optimizers/de.py index d1227f5..d1c40a2 100644 --- a/src/dehb/optimizers/de.py +++ b/src/dehb/optimizers/de.py @@ -1,17 +1,20 @@ import os -import numpy as np +from typing import List + import ConfigSpace import ConfigSpace.util -from typing import List +import numpy as np from distributed import Client +from ..utils import ConfigRepository + class DEBase(): '''Base class for Differential Evolution ''' def __init__(self, cs=None, f=None, dimensions=None, pop_size=None, max_age=None, - mutation_factor=None, crossover_prob=None, strategy=None, budget=None, - boundary_fix_type='random', **kwargs): + mutation_factor=None, crossover_prob=None, strategy=None, + boundary_fix_type='random', config_repository=None, **kwargs): # Benchmark related variables self.cs = cs self.f = f @@ -26,7 +29,6 @@ def __init__(self, cs=None, f=None, dimensions=None, pop_size=None, max_age=None self.mutation_factor = mutation_factor self.crossover_prob = crossover_prob self.strategy = strategy - self.budget = budget self.fix_type = boundary_fix_type # Miscellaneous @@ -39,18 +41,28 @@ def __init__(self, cs=None, f=None, dimensions=None, pop_size=None, max_age=None self.output_path = kwargs['output_path'] if 'output_path' in kwargs else './' os.makedirs(self.output_path, exist_ok=True) + if config_repository: + self.config_repository = config_repository + else: + self.config_repository = ConfigRepository() + # Global trackers - self.inc_score = np.inf - self.inc_config = None - self.population = None - self.fitness = None - self.age = None - self.history = [] + self.inc_score : float + self.inc_config : np.ndarray[float] + self.inc_id : int + self.population : np.ndarray[np.ndarray[float]] 
+ self.population_ids :np.ndarray[int] + self.fitness : np.ndarray[float] + self.age : int + self.history : list[object] + self.reset() def reset(self): self.inc_score = np.inf self.inc_config = None + self.inc_id = -1 self.population = None + self.population_ids = None self.fitness = None self.age = None self.history = [] @@ -95,6 +107,7 @@ def init_population(self, pop_size: int) -> List: else: # if no ConfigSpace representation available, uniformly sample from [0, 1] population = np.random.uniform(low=0.0, high=1.0, size=(pop_size, self.dimensions)) + return np.array(population) def sample_population(self, size: int = 3, alt_pop: List = None) -> List: @@ -118,7 +131,7 @@ def sample_population(self, size: int = 3, alt_pop: List = None) -> List: selection = np.random.choice(np.arange(len(self.population)), size, replace=False) return self.population[selection] - def boundary_check(self, vector: np.array) -> np.array: + def boundary_check(self, vector: np.ndarray) -> np.ndarray: ''' Checks whether each of the dimensions of the input vector are within [0, 1]. If not, values of those dimensions are replaced with the type of fix selected. @@ -143,7 +156,7 @@ def boundary_check(self, vector: np.array) -> np.array: vector[violations] = np.clip(vector[violations], a_min=0, a_max=1) return vector - def vector_to_configspace(self, vector: np.array) -> ConfigSpace.Configuration: + def vector_to_configspace(self, vector: np.ndarray) -> ConfigSpace.Configuration: '''Converts numpy array to ConfigSpace object Works when self.cs is a ConfigSpace object and the input vector is in the domain [0, 1]. 
@@ -181,7 +194,7 @@ def vector_to_configspace(self, vector: np.array) -> ConfigSpace.Configuration: ) return new_config - def configspace_to_vector(self, config: ConfigSpace.Configuration) -> np.array: + def configspace_to_vector(self, config: ConfigSpace.Configuration) -> np.ndarray: '''Converts ConfigSpace object to numpy array scaled to [0,1] Works when self.cs is a ConfigSpace object and the input config is a ConfigSpace object. @@ -231,10 +244,11 @@ def run(self): class DE(DEBase): def __init__(self, cs=None, f=None, dimensions=None, pop_size=20, max_age=np.inf, mutation_factor=None, crossover_prob=None, strategy='rand1_bin', - budget=None, encoding=False, dim_map=None, **kwargs): + encoding=False, dim_map=None, config_repository=None, **kwargs): super().__init__(cs=cs, f=f, dimensions=dimensions, pop_size=pop_size, max_age=max_age, mutation_factor=mutation_factor, crossover_prob=crossover_prob, - strategy=strategy, budget=budget, **kwargs) + strategy=strategy, config_repository=config_repository, + **kwargs) if self.strategy is not None: self.mutation_strategy = self.strategy.split('_')[0] self.crossover_strategy = self.strategy.split('_')[1] @@ -285,7 +299,7 @@ def map_to_original(self, vector): new_vector[i] = np.max(np.array(vector)[self.dim_map[i]]) return new_vector - def f_objective(self, x, budget=None, **kwargs): + def f_objective(self, x, fidelity=None, **kwargs): if self.f is None: raise NotImplementedError("An objective function needs to be passed.") if self.encoding: @@ -296,18 +310,19 @@ def f_objective(self, x, budget=None, **kwargs): else: # can insert custom scaling/transform function here config = x.copy() - if budget is not None: # to be used when called by multi-fidelity based optimizers - res = self.f(config, budget=budget, **kwargs) + if fidelity is not None: # to be used when called by multi-fidelity based optimizers + res = self.f(config, fidelity=fidelity, **kwargs) else: res = self.f(config, **kwargs) assert "fitness" in res assert 
"cost" in res return res - def init_eval_pop(self, budget=None, eval=True, **kwargs): + def init_eval_pop(self, fidelity=None, eval=True, **kwargs): '''Creates new population of 'pop_size' and evaluates individuals. ''' self.population = self.init_population(self.pop_size) + self.population_ids = self.config_repository.announce_population(self.population, fidelity) self.fitness = np.array([np.inf for i in range(self.pop_size)]) self.age = np.array([self.max_age] * self.pop_size) @@ -320,25 +335,29 @@ def init_eval_pop(self, budget=None, eval=True, **kwargs): for i in range(self.pop_size): config = self.population[i] - res = self.f_objective(config, budget, **kwargs) + config_id = self.population_ids[i] + res = self.f_objective(config, fidelity, **kwargs) self.fitness[i], cost = res["fitness"], res["cost"] info = res["info"] if "info" in res else dict() if self.fitness[i] < self.inc_score: self.inc_score = self.fitness[i] self.inc_config = config + self.inc_id = config_id + self.config_repository.tell_result(config_id, float(fidelity or 0), res["fitness"], res["cost"], info) traj.append(self.inc_score) runtime.append(cost) - history.append((config.tolist(), float(self.fitness[i]), float(budget or 0), info)) + history.append((config.tolist(), float(self.fitness[i]), float(fidelity or 0), info)) return traj, runtime, history - def eval_pop(self, population=None, budget=None, **kwargs): + def eval_pop(self, population=None, population_ids=None, fidelity=None, **kwargs): '''Evaluates a population If population=None, the current population's fitness will be evaluated If population!=None, this population will be evaluated ''' pop = self.population if population is None else population + pop_ids = self.population_ids if population_ids is None else population_ids pop_size = self.pop_size if population is None else len(pop) traj = [] runtime = [] @@ -347,7 +366,7 @@ def eval_pop(self, population=None, budget=None, **kwargs): costs = [] ages = [] for i in range(pop_size): - 
res = self.f_objective(pop[i], budget, **kwargs) + res = self.f_objective(pop[i], fidelity, **kwargs) fitness, cost = res["fitness"], res["cost"] info = res["info"] if "info" in res else dict() if population is None: @@ -355,9 +374,11 @@ def eval_pop(self, population=None, budget=None, **kwargs): if fitness <= self.inc_score: self.inc_score = fitness self.inc_config = pop[i] + self.inc_id = pop_ids[i] + self.config_repository.tell_result(pop_ids[i], float(fidelity or 0), fitness, cost, info) traj.append(self.inc_score) runtime.append(cost) - history.append((pop[i].tolist(), float(fitness), float(budget or 0), info)) + history.append((pop[i].tolist(), float(fitness), float(fidelity or 0), info)) fitnesses.append(fitness) costs.append(cost) ages.append(self.max_age) @@ -463,7 +484,7 @@ def crossover(self, target, mutant): offspring = self.crossover_exp(target, mutant) return offspring - def selection(self, trials, budget=None, **kwargs): + def selection(self, trials, trial_ids, fidelity=None, **kwargs): '''Carries out a parent-offspring competition given a set of trial population ''' traj = [] @@ -471,13 +492,16 @@ def selection(self, trials, budget=None, **kwargs): history = [] for i in range(len(trials)): # evaluation of the newly created individuals - res = self.f_objective(trials[i], budget, **kwargs) + res = self.f_objective(trials[i], fidelity, **kwargs) fitness, cost = res["fitness"], res["cost"] info = res["info"] if "info" in res else dict() + # log result to config repo + self.config_repository.tell_result(trial_ids[i], float(fidelity or 0), fitness, cost, info) # selection -- competition between parent[i] -- child[i] ## equality is important for landscape exploration if fitness <= self.fitness[i]: self.population[i] = trials[i] + self.population_ids[i] = trial_ids[i] self.fitness[i] = fitness # resetting age since new individual in the population self.age[i] = self.max_age @@ -488,23 +512,28 @@ def selection(self, trials, budget=None, **kwargs): if self.fitness[i] <
self.inc_score: self.inc_score = self.fitness[i] self.inc_config = self.population[i] + self.inc_id = self.population_ids[i] traj.append(self.inc_score) runtime.append(cost) - history.append((trials[i].tolist(), float(fitness), float(budget or 0), info)) + history.append((trials[i].tolist(), float(fitness), float(fidelity or 0), info)) return traj, runtime, history - def evolve_generation(self, budget=None, best=None, alt_pop=None, **kwargs): + def evolve_generation(self, fidelity=None, best=None, alt_pop=None, **kwargs): '''Performs a complete DE evolution: mutation -> crossover -> selection ''' trials = [] + trial_ids = [] for j in range(self.pop_size): target = self.population[j] donor = self.mutation(current=target, best=best, alt_pop=alt_pop) trial = self.crossover(target, donor) trial = self.boundary_check(trial) + trial_id = self.config_repository.announce_config(trial, float(fidelity or 0)) trials.append(trial) + trial_ids.append(trial_id) trials = np.array(trials) - traj, runtime, history = self.selection(trials, budget, **kwargs) + trial_ids = np.array(trial_ids) + traj, runtime, history = self.selection(trials, trial_ids, fidelity, **kwargs) return traj, runtime, history def sample_mutants(self, size, population=None): @@ -525,20 +554,20 @@ def sample_mutants(self, size, population=None): return mutants - def run(self, generations=1, verbose=False, budget=None, reset=True, **kwargs): + def run(self, generations=1, verbose=False, fidelity=None, reset=True, **kwargs): # checking if a run exists if not hasattr(self, 'traj') or reset: self.reset() if verbose: print("Initializing and evaluating new population...") - self.traj, self.runtime, self.history = self.init_eval_pop(budget=budget, **kwargs) + self.traj, self.runtime, self.history = self.init_eval_pop(fidelity=fidelity, **kwargs) if verbose: print("Running evolutionary search...") for i in range(generations): if verbose: print("Generation {:<2}/{:<2} -- {:<0.7}".format(i+1, generations, self.inc_score)) -
traj, runtime, history = self.evolve_generation(budget=budget, **kwargs) + traj, runtime, history = self.evolve_generation(fidelity=fidelity, **kwargs) self.traj.extend(traj) self.runtime.extend(runtime) self.history.extend(history) @@ -552,7 +581,7 @@ def run(self, generations=1, verbose=False, budget=None, reset=True, **kwargs): class AsyncDE(DE): def __init__(self, cs=None, f=None, dimensions=None, pop_size=None, max_age=np.inf, mutation_factor=None, crossover_prob=None, strategy='rand1_bin', - budget=None, async_strategy='immediate', **kwargs): + async_strategy='immediate', config_repository=None, **kwargs): '''Extends DE to be Asynchronous with variations Parameters @@ -571,7 +600,8 @@ def __init__(self, cs=None, f=None, dimensions=None, pop_size=None, max_age=np.i ''' super().__init__(cs=cs, f=f, dimensions=dimensions, pop_size=pop_size, max_age=max_age, mutation_factor=mutation_factor, crossover_prob=crossover_prob, - strategy=strategy, budget=budget, **kwargs) + strategy=strategy, config_repository=config_repository, + **kwargs) if self.strategy is not None: self.mutation_strategy = self.strategy.split('_')[0] self.crossover_strategy = self.strategy.split('_')[1] @@ -642,8 +672,9 @@ def _sample_population(self, size=3, alt_pop=None, target=None): selection = np.random.choice(np.arange(len(population)), size, replace=False) return population[selection] - def eval_pop(self, population=None, budget=None, **kwargs): + def eval_pop(self, population=None, population_ids=None, fidelity=None, **kwargs): pop = self.population if population is None else population + pop_ids = self.population_ids if population_ids is None else population_ids pop_size = self.pop_size if population is None else len(pop) traj = [] runtime = [] @@ -652,7 +683,7 @@ def eval_pop(self, population=None, budget=None, **kwargs): costs = [] ages = [] for i in range(pop_size): - res = self.f_objective(pop[i], budget, **kwargs) + res = self.f_objective(pop[i], fidelity, **kwargs) fitness, cost = 
res["fitness"], res["cost"] info = res["info"] if "info" in res else dict() if population is None: @@ -660,9 +691,11 @@ def eval_pop(self, population=None, budget=None, **kwargs): if fitness <= self.inc_score: self.inc_score = fitness self.inc_config = pop[i] + self.inc_id = pop_ids[i] + self.config_repository.tell_result(pop_ids[i], float(fidelity or 0), fitness, cost, info) traj.append(self.inc_score) runtime.append(cost) - history.append((pop[i].tolist(), float(fitness), float(budget or 0), info)) + history.append((pop[i].tolist(), float(fitness), float(fidelity or 0), info)) fitnesses.append(fitness) costs.append(cost) ages.append(self.max_age) @@ -723,40 +756,46 @@ def sample_mutants(self, size, population=None): return mutants - def evolve_generation(self, budget=None, best=None, alt_pop=None, **kwargs): + def evolve_generation(self, fidelity=None, best=None, alt_pop=None, **kwargs): '''Performs a complete DE evolution, mutation -> crossover -> selection ''' traj = [] runtime = [] history = [] - if self.async_strategy == 'deferred': + if self.async_strategy == "deferred": trials = [] + trial_ids = [] for j in range(self.pop_size): target = self.population[j] donor = self.mutation(current=target, best=best, alt_pop=alt_pop) trial = self.crossover(target, donor) trial = self.boundary_check(trial) + trial_id = self.config_repository.announce_config(trial, float(fidelity or 0)) trials.append(trial) + trial_ids.append(trial_id) # selection takes place on a separate trial population only after # one iteration through the population has taken place trials = np.array(trials) - traj, runtime, history = self.selection(trials, budget, **kwargs) + traj, runtime, history = self.selection(trials, trial_ids, fidelity, **kwargs) return traj, runtime, history - elif self.async_strategy == 'immediate': + elif self.async_strategy == "immediate": for i in range(self.pop_size): target = self.population[i] donor = self.mutation(current=target, best=best, alt_pop=alt_pop) trial = 
self.crossover(target, donor) trial = self.boundary_check(trial) + trial_id = self.config_repository.announce_config(trial, float(fidelity or 0)) # evaluating a single trial population for the i-th individual de_traj, de_runtime, de_history, fitnesses, costs = \ - self.eval_pop(trial.reshape(1, self.dimensions), budget=budget, **kwargs) + self.eval_pop(trial.reshape(1, self.dimensions), + np.array([trial_id]), fidelity=fidelity, **kwargs) # one-vs-one selection ## can replace the i-the population despite not completing one iteration if fitnesses[0] <= self.fitness[i]: self.population[i] = trial + self.population_ids[i] = trial_id self.fitness[i] = fitnesses[0] traj.extend(de_traj) runtime.extend(de_runtime) @@ -766,7 +805,7 @@ def evolve_generation(self, budget=None, best=None, alt_pop=None, **kwargs): else: # async_strategy == 'random' or async_strategy == 'worst': for count in range(self.pop_size): # choosing target individual - if self.async_strategy == 'random': + if self.async_strategy == "random": i = np.random.choice(np.arange(self.pop_size)) else: # async_strategy == 'worst' i = np.argsort(-self.fitness)[0] @@ -774,9 +813,11 @@ def evolve_generation(self, budget=None, best=None, alt_pop=None, **kwargs): mutant = self.mutation(current=target, best=best, alt_pop=alt_pop) trial = self.crossover(target, mutant) trial = self.boundary_check(trial) + trial_id = self.config_repository.announce_config(trial, float(fidelity or 0)) # evaluating a single trial population for the i-th individual de_traj, de_runtime, de_history, fitnesses, costs = \ - self.eval_pop(trial.reshape(1, self.dimensions), budget=budget, **kwargs) + self.eval_pop(trial.reshape(1, self.dimensions), np.array([trial_id]), + fidelity=fidelity, **kwargs) # one-vs-one selection ## can replace the i-the population despite not completing one iteration if fitnesses[0] <= self.fitness[i]: @@ -788,22 +829,21 @@ def evolve_generation(self, budget=None, best=None, alt_pop=None, **kwargs): return traj, 
runtime, history - def run(self, generations=1, verbose=False, budget=None, reset=True, **kwargs): + def run(self, generations=1, verbose=False, fidelity=None, reset=True, **kwargs): # checking if a run exists - if not hasattr(self, 'traj') or reset: + if not hasattr(self, "traj") or reset: self.reset() if verbose: print("Initializing and evaluating new population...") - self.traj, self.runtime, self.history = self.init_eval_pop(budget=budget, **kwargs) + self.traj, self.runtime, self.history = self.init_eval_pop(fidelity=fidelity, **kwargs) if verbose: print("Running evolutionary search...") for i in range(generations): if verbose: print("Generation {:<2}/{:<2} -- {:<0.7}".format(i+1, generations, self.inc_score)) - traj, runtime, history = self.evolve_generation( - budget=budget, best=self.inc_config, **kwargs - ) + traj, runtime, history = self.evolve_generation(fidelity=fidelity, + best=self.inc_config, **kwargs) self.traj.extend(traj) self.runtime.extend(runtime) self.history.extend(history) diff --git a/src/dehb/optimizers/dehb.py b/src/dehb/optimizers/dehb.py index b36e269..612b496 100644 --- a/src/dehb/optimizers/dehb.py +++ b/src/dehb/optimizers/dehb.py @@ -12,6 +12,7 @@ from .de import DE, AsyncDE from ..utils import SHBracketManager +from ..utils import ConfigRepository logger.configure(handlers=[{"sink": sys.stdout, "level": "INFO"}]) @@ -24,11 +25,12 @@ class DEHBBase: def __init__(self, cs=None, f=None, dimensions=None, mutation_factor=None, - crossover_prob=None, strategy=None, min_budget=None, - max_budget=None, eta=None, min_clip=None, max_clip=None, + crossover_prob=None, strategy=None, min_fidelity=None, + max_fidelity=None, eta=None, min_clip=None, max_clip=None, boundary_fix_type='random', max_age=np.inf, **kwargs): # Miscellaneous self._setup_logger(kwargs) + self.config_repository = ConfigRepository() # Benchmark related variables self.cs = cs @@ -60,11 +62,11 @@ def __init__(self, cs=None, f=None, dimensions=None, mutation_factor=None, } # 
Hyperband related variables - self.min_budget = min_budget - self.max_budget = max_budget - if self.max_budget <= self.min_budget: - self.logger.error("Only (Max Budget > Min Budget) is supported for DEHB.") - if self.max_budget == self.min_budget: + self.min_fidelity = min_fidelity + self.max_fidelity = max_fidelity + if self.max_fidelity <= self.min_fidelity: + self.logger.error("Only (Max Fidelity > Min Fidelity) is supported for DEHB.") + if self.max_fidelity == self.min_fidelity: self.logger.error( "If you have a fixed fidelity, " \ "you can instead run DE. For more information checkout: " \ @@ -74,14 +76,14 @@ def __init__(self, cs=None, f=None, dimensions=None, mutation_factor=None, self.min_clip = min_clip self.max_clip = max_clip - # Precomputing budget spacing and number of configurations for HB iterations + # Precomputing fidelity spacing and number of configurations for HB iterations self.max_SH_iter = None - self.budgets = None - if self.min_budget is not None and \ - self.max_budget is not None and \ + self.fidelities = None + if self.min_fidelity is not None and \ + self.max_fidelity is not None and \ self.eta is not None: - self.max_SH_iter = -int(np.log(self.min_budget / self.max_budget) / np.log(self.eta)) + 1 - self.budgets = self.max_budget * np.power(self.eta, + self.max_SH_iter = -int(np.log(self.min_fidelity / self.max_fidelity) / np.log(self.eta)) + 1 + self.fidelities = self.max_fidelity * np.power(self.eta, -np.linspace(start=self.max_SH_iter - 1, stop=0, num=self.max_SH_iter)) @@ -124,7 +126,7 @@ def init_population(self): def get_next_iteration(self, iteration): '''Computes the Successive Halving spacing - Given the iteration index, computes the budget spacing to be used and + Given the iteration index, computes the fidelity spacing to be used and the number of configurations to be used for the SH iterations. 
Parameters @@ -137,12 +139,12 @@ def get_next_iteration(self, iteration): Returns ------- ns : array - budgets : array + fidelities : array ''' # number of 'SH runs' s = self.max_SH_iter - 1 - (iteration % self.max_SH_iter) - # budget spacing for this iteration - budgets = self.budgets[(-s-1):] + # fidelity spacing for this iteration + fidelities = self.fidelities[(-s-1):] # number of configurations in that bracket n0 = int(np.floor((self.max_SH_iter)/(s+1)) * self.eta**s) ns = [max(int(n0*(self.eta**(-i))), 1) for i in range(s+1)] @@ -151,7 +153,7 @@ def get_next_iteration(self, iteration): elif self.min_clip is not None: ns = np.clip(ns, a_min=self.min_clip, a_max=np.max(ns)) - return ns, budgets + return ns, fidelities def get_incumbents(self): """ Returns a tuple of the (incumbent configuration, incumbent score/fitness). """ @@ -168,13 +170,13 @@ def run(self): class DEHB(DEHBBase): def __init__(self, cs=None, f=None, dimensions=None, mutation_factor=0.5, - crossover_prob=0.5, strategy='rand1_bin', min_budget=None, - max_budget=None, eta=3, min_clip=None, max_clip=None, configspace=True, + crossover_prob=0.5, strategy='rand1_bin', min_fidelity=None, + max_fidelity=None, eta=3, min_clip=None, max_clip=None, configspace=True, boundary_fix_type='random', max_age=np.inf, n_workers=None, client=None, async_strategy="immediate", **kwargs): super().__init__(cs=cs, f=f, dimensions=dimensions, mutation_factor=mutation_factor, - crossover_prob=crossover_prob, strategy=strategy, min_budget=min_budget, - max_budget=max_budget, eta=eta, min_clip=min_clip, max_clip=max_clip, + crossover_prob=crossover_prob, strategy=strategy, min_fidelity=min_fidelity, + max_fidelity=max_fidelity, eta=eta, min_clip=min_clip, max_clip=max_clip, configspace=configspace, boundary_fix_type=boundary_fix_type, max_age=max_age, **kwargs) self.de_params.update({"async_strategy": async_strategy}) @@ -236,19 +238,21 @@ def _f_objective(self, job_info): # reprioritising a CUDA device order specific to 
this worker process os.environ.update({"CUDA_VISIBLE_DEVICES": job_info["gpu_devices"]}) - config, budget, parent_id = job_info['config'], job_info['budget'], job_info['parent_id'] - bracket_id = job_info['bracket_id'] + config, config_id = job_info["config"], job_info["config_id"] + fidelity, parent_id = job_info["fidelity"], job_info["parent_id"] + bracket_id = job_info["bracket_id"] kwargs = job_info["kwargs"] - res = self.de[budget].f_objective(config, budget, **kwargs) - info = res["info"] if "info" in res else dict() + res = self.de[fidelity].f_objective(config, fidelity, **kwargs) + info = res["info"] if "info" in res else {} run_info = { - 'fitness': res["fitness"], - 'cost': res["cost"], - 'config': config, - 'budget': budget, - 'parent_id': parent_id, - 'bracket_id': bracket_id, - 'info': info + "fitness": res["fitness"], + "cost": res["cost"], + "config": config, + "config_id": config_id, + "fidelity": fidelity, + "parent_id": parent_id, + "bracket_id": bracket_id, + "info": info, } if "gpu_devices" in job_info: @@ -295,13 +299,13 @@ def distribute_gpus(self): def vector_to_configspace(self, config): assert hasattr(self, "de") - assert len(self.budgets) > 0 - return self.de[self.budgets[0]].vector_to_configspace(config) + assert len(self.fidelities) > 0 + return self.de[self.fidelities[0]].vector_to_configspace(config) def configspace_to_vector(self, config): assert hasattr(self, "de") - assert len(self.budgets) > 0 - return self.de[self.budgets[0]].configspace_to_vector(config) + assert len(self.fidelities) > 0 + return self.de[self.fidelities[0]].configspace_to_vector(config) def reset(self): super().reset() @@ -353,7 +357,7 @@ def _update_incumbents(self, config, score, info): self.inc_info = info def _get_pop_sizes(self): - """Determines maximum pop size for each budget + """Determines maximum pop size for each fidelity """ self._max_pop_size = {} for i in range(self.max_SH_iter): @@ -364,27 +368,30 @@ def _get_pop_sizes(self): ) if r_j in 
self._max_pop_size.keys() else n[j] def _init_subpop(self): - """ List of DE objects corresponding to the budgets (fidelities) + """ List of DE objects corresponding to the fidelities """ self.de = {} - for i, b in enumerate(self._max_pop_size.keys()): - self.de[b] = AsyncDE(**self.de_params, budget=b, pop_size=self._max_pop_size[b]) - self.de[b].population = self.de[b].init_population(pop_size=self._max_pop_size[b]) - self.de[b].fitness = np.array([np.inf] * self._max_pop_size[b]) + for i, f in enumerate(self._max_pop_size.keys()): + self.de[f] = AsyncDE(**self.de_params, pop_size=self._max_pop_size[f], + config_repository=self.config_repository) + self.de[f].population = self.de[f].init_population(pop_size=self._max_pop_size[f]) + self.de[f].population_ids = self.config_repository.announce_population(self.de[f].population, f) + self.de[f].fitness = np.array([np.inf] * self._max_pop_size[f]) # adding attributes to DEHB objects to allow communication across subpopulations - self.de[b].parent_counter = 0 - self.de[b].promotion_pop = None - self.de[b].promotion_fitness = None + self.de[f].parent_counter = 0 + self.de[f].promotion_pop = None + self.de[f].promotion_pop_ids = None + self.de[f].promotion_fitness = None - def _concat_pops(self, exclude_budget=None): + def _concat_pops(self, exclude_fidelity=None): """ Concatenates all subpopulations """ - budgets = list(self.budgets) - if exclude_budget is not None: - budgets.remove(exclude_budget) + fidelities = list(self.fidelities) + if exclude_fidelity is not None: + fidelities.remove(exclude_fidelity) pop = [] - for budget in budgets: - pop.extend(self.de[budget].population.tolist()) + for fidelity in fidelities: + pop.extend(self.de[fidelity].population.tolist()) return np.array(pop) def _start_new_bracket(self): @@ -392,9 +399,9 @@ def _start_new_bracket(self): """ # start new bracket self.iteration_counter += 1 # iteration counter gives the bracket count or bracket ID - n_configs, budgets = 
self.get_next_iteration(self.iteration_counter) + n_configs, fidelities = self.get_next_iteration(self.iteration_counter) bracket = SHBracketManager( - n_configs=n_configs, budgets=budgets, bracket_id=self.iteration_counter + n_configs=n_configs, fidelities=fidelities, bracket_id=self.iteration_counter ) self.active_brackets.append(bracket) return bracket @@ -420,109 +427,122 @@ def is_worker_available(self, verbose=False): return False return True - def _get_promotion_candidate(self, low_budget, high_budget, n_configs): - """ Manages the population to be promoted from the lower to the higher budget. + def _get_promotion_candidate(self, low_fidelity, high_fidelity, n_configs): + """ Manages the population to be promoted from the lower to the higher fidelity. This is triggered or in action only during the first full HB bracket, which is equivalent to the number of brackets <= max_SH_iter. """ # finding the individuals that have been evaluated (fitness < np.inf) - evaluated_configs = np.where(self.de[low_budget].fitness != np.inf)[0] - promotion_candidate_pop = self.de[low_budget].population[evaluated_configs] - promotion_candidate_fitness = self.de[low_budget].fitness[evaluated_configs] + evaluated_configs = np.where(self.de[low_fidelity].fitness != np.inf)[0] + promotion_candidate_pop = self.de[low_fidelity].population[evaluated_configs] + promotion_candidate_pop_ids = self.de[low_fidelity].population_ids[evaluated_configs] + promotion_candidate_fitness = self.de[low_fidelity].fitness[evaluated_configs] # ordering the evaluated individuals based on their fitness values pop_idx = np.argsort(promotion_candidate_fitness) # creating population for promotion if none promoted yet or nothing to promote - if self.de[high_budget].promotion_pop is None or \ - len(self.de[high_budget].promotion_pop) == 0: - self.de[high_budget].promotion_pop = np.empty((0, self.dimensions)) - self.de[high_budget].promotion_fitness = np.array([]) - - # iterating over the evaluated individuals 
from the lower budget and including them - # in the promotion population for the higher budget only if it's not in the population + if self.de[high_fidelity].promotion_pop is None or \ + len(self.de[high_fidelity].promotion_pop) == 0: + self.de[high_fidelity].promotion_pop = np.empty((0, self.dimensions)) + self.de[high_fidelity].promotion_pop_ids = np.array([], dtype=np.int64) + self.de[high_fidelity].promotion_fitness = np.array([]) + + # iterating over the evaluated individuals from the lower fidelity and including them + # in the promotion population for the higher fidelity only if it's not in the population # this is done to ensure diversity of population and avoid redundant evaluations for idx in pop_idx: individual = promotion_candidate_pop[idx] - # checks if the candidate individual already exists in the high budget population - if np.any(np.all(individual == self.de[high_budget].population, axis=1)): + individual_id = promotion_candidate_pop_ids[idx] + # checks if the candidate individual already exists in the high fidelity population + if np.any(np.all(individual == self.de[high_fidelity].population, axis=1)): # skipping already present individual to allow diversity and reduce redundancy continue - self.de[high_budget].promotion_pop = np.append( - self.de[high_budget].promotion_pop, [individual], axis=0 + self.de[high_fidelity].promotion_pop = np.append( + self.de[high_fidelity].promotion_pop, [individual], axis=0 + ) + self.de[high_fidelity].promotion_pop_ids = np.append( + self.de[high_fidelity].promotion_pop_ids, individual_id ) - self.de[high_budget].promotion_fitness = np.append( - self.de[high_budget].promotion_pop, promotion_candidate_fitness[pop_idx] + self.de[high_fidelity].promotion_fitness = np.append( + self.de[high_fidelity].promotion_pop, promotion_candidate_fitness[pop_idx] ) # retaining only n_configs - self.de[high_budget].promotion_pop = self.de[high_budget].promotion_pop[:n_configs] - self.de[high_budget].promotion_fitness = \ - 
self.de[high_budget].promotion_fitness[:n_configs] - - if len(self.de[high_budget].promotion_pop) > 0: - config = self.de[high_budget].promotion_pop[0] + self.de[high_fidelity].promotion_pop = self.de[high_fidelity].promotion_pop[:n_configs] + self.de[high_fidelity].promotion_pop_ids = self.de[high_fidelity].promotion_pop_ids[:n_configs] + self.de[high_fidelity].promotion_fitness = \ + self.de[high_fidelity].promotion_fitness[:n_configs] + + if len(self.de[high_fidelity].promotion_pop) > 0: + config = self.de[high_fidelity].promotion_pop[0] + config_id = self.de[high_fidelity].promotion_pop_ids[0] # removing selected configuration from population - self.de[high_budget].promotion_pop = self.de[high_budget].promotion_pop[1:] - self.de[high_budget].promotion_fitness = self.de[high_budget].promotion_fitness[1:] + self.de[high_fidelity].promotion_pop = self.de[high_fidelity].promotion_pop[1:] + self.de[high_fidelity].promotion_pop_ids = self.de[high_fidelity].promotion_pop_ids[1:] + self.de[high_fidelity].promotion_fitness = self.de[high_fidelity].promotion_fitness[1:] else: - # in case of an edge failure case where all high budget individuals are same - # just choose the best performing individual from the lower budget (again) - config = self.de[low_budget].population[pop_idx[0]] - return config + # in case of an edge failure case where all high fidelity individuals are same + # just choose the best performing individual from the lower fidelity (again) + config = self.de[low_fidelity].population[pop_idx[0]] + config_id = self.de[low_fidelity].population_ids[pop_idx[0]] + return config, config_id - def _get_next_parent_for_subpop(self, budget): + def _get_next_parent_for_subpop(self, fidelity): """ Maintains a looping counter over a subpopulation, to iteratively select a parent """ - parent_id = self.de[budget].parent_counter - self.de[budget].parent_counter += 1 - self.de[budget].parent_counter = self.de[budget].parent_counter % self._max_pop_size[budget] + parent_id = 
self.de[fidelity].parent_counter + self.de[fidelity].parent_counter += 1 + self.de[fidelity].parent_counter = self.de[fidelity].parent_counter % self._max_pop_size[fidelity] return parent_id - def _acquire_config(self, bracket, budget): - """ Generates/chooses a configuration based on the budget and iteration number + def _acquire_config(self, bracket, fidelity): + """ Generates/chooses a configuration based on the fidelity and iteration number """ # select a parent/target - parent_id = self._get_next_parent_for_subpop(budget) - target = self.de[budget].population[parent_id] - # identify lower budget/fidelity to transfer information from - lower_budget, num_configs = bracket.get_lower_budget_promotions(budget) + parent_id = self._get_next_parent_for_subpop(fidelity) + target = self.de[fidelity].population[parent_id] + # identify lower fidelity to transfer information from + lower_fidelity, num_configs = bracket.get_lower_fidelity_promotions(fidelity) if self.iteration_counter < self.max_SH_iter: # promotions occur only in the first set of SH brackets under Hyperband - # for the first rung/budget in the current bracket, no promotion is possible and + # for the first rung/fidelity in the current bracket, no promotion is possible and # evolution can begin straight away - # for the subsequent rungs, individuals will be promoted from the lower_budget - if budget != bracket.budgets[0]: - # TODO: check if generalizes to all budget spacings - config = self._get_promotion_candidate(lower_budget, budget, num_configs) - return config, parent_id + # for the subsequent rungs, individuals will be promoted from the lower_fidelity + if fidelity != bracket.fidelities[0]: + # TODO: check if generalizes to all fidelity spacings + config, config_id = self._get_promotion_candidate(lower_fidelity, fidelity, num_configs) + return config, config_id, parent_id # DE evolution occurs when either all individuals in the subpopulation have been evaluated # at least once, i.e., has fitness < 
np.inf, which can happen if # iteration_counter <= max_SH_iter but certainly never when iteration_counter > max_SH_iter # a single DE evolution --- (mutation + crossover) occurs here - mutation_pop_idx = np.argsort(self.de[lower_budget].fitness)[:num_configs] - mutation_pop = self.de[lower_budget].population[mutation_pop_idx] - # generate mutants from previous budget subpopulation or global population - if len(mutation_pop) < self.de[budget]._min_pop_size: - filler = self.de[budget]._min_pop_size - len(mutation_pop) + 1 - new_pop = self.de[budget]._init_mutant_population( + mutation_pop_idx = np.argsort(self.de[lower_fidelity].fitness)[:num_configs] + mutation_pop = self.de[lower_fidelity].population[mutation_pop_idx] + # generate mutants from previous fidelity subpopulation or global population + if len(mutation_pop) < self.de[fidelity]._min_pop_size: + filler = self.de[fidelity]._min_pop_size - len(mutation_pop) + 1 + new_pop = self.de[fidelity]._init_mutant_population( pop_size=filler, population=self._concat_pops(), target=target, best=self.inc_config ) mutation_pop = np.concatenate((mutation_pop, new_pop)) # generate mutant from among individuals in mutation_pop - mutant = self.de[budget].mutation( + mutant = self.de[fidelity].mutation( current=target, best=self.inc_config, alt_pop=mutation_pop ) # perform crossover with selected parent - config = self.de[budget].crossover(target=target, mutant=mutant) - config = self.de[budget].boundary_check(config) - return config, parent_id + config = self.de[fidelity].crossover(target=target, mutant=mutant) + config = self.de[fidelity].boundary_check(config) + + # announce new config + config_id = self.config_repository.announce_config(config, fidelity) + return config, config_id, parent_id def _get_next_job(self): - """ Loads a configuration and budget to be evaluated next by a free worker + """ Loads a configuration and fidelity to be evaluated next by a free worker """ bracket = None if len(self.active_brackets) == 0 
or \ @@ -541,13 +561,14 @@ def _get_next_job(self): if bracket is None: # start new bracket when existing list has all waiting brackets bracket = self._start_new_bracket() - # budget that the SH bracket allots - budget = bracket.get_next_job_budget() - config, parent_id = self._acquire_config(bracket, budget) - # notifies the Bracket Manager that a single config is to run for the budget chosen + # fidelity that the SH bracket allots + fidelity = bracket.get_next_job_fidelity() + config, config_id, parent_id = self._acquire_config(bracket, fidelity) + # notifies the Bracket Manager that a single config is to run for the fidelity chosen job_info = { "config": config, - "budget": budget, + "config_id": config_id, + "fidelity": fidelity, "parent_id": parent_id, "bracket_id": bracket.bracket_id } @@ -570,7 +591,7 @@ def _get_gpu_id_with_low_load(self): return gpu_ids def submit_job(self, job_info, **kwargs): - """ Asks a free worker to run the objective function on config and budget + """ Asks a free worker to run the objective function on config and fidelity """ job_info["kwargs"] = self.shared_data if self.shared_data is not None else kwargs # submit to to Dask client @@ -589,7 +610,7 @@ def submit_job(self, job_info, **kwargs): for bracket in self.active_brackets: if bracket.bracket_id == job_info['bracket_id']: # registering is IMPORTANT for Bracket Manager to perform SH - bracket.register_job(job_info['budget']) + bracket.register_job(job_info['fidelity']) break def _fetch_results_from_workers(self): @@ -618,29 +639,32 @@ def _fetch_results_from_workers(self): # update bracket information fitness, cost = run_info["fitness"], run_info["cost"] info = run_info["info"] if "info" in run_info else dict() - budget, parent_id = run_info["budget"], run_info["parent_id"] - config = run_info["config"] + fidelity, parent_id = run_info["fidelity"], run_info["parent_id"] + config, config_id = run_info["config"], run_info["config_id"] bracket_id = run_info["bracket_id"] for 
bracket in self.active_brackets: if bracket.bracket_id == bracket_id: # bracket job complete - bracket.complete_job(budget) # IMPORTANT to perform synchronous SH + bracket.complete_job(fidelity) # IMPORTANT to perform synchronous SH + + self.config_repository.tell_result(config_id, fidelity, fitness, cost, info) # carry out DE selection - if fitness <= self.de[budget].fitness[parent_id]: - self.de[budget].population[parent_id] = config - self.de[budget].fitness[parent_id] = fitness + if fitness <= self.de[fidelity].fitness[parent_id]: + self.de[fidelity].population[parent_id] = config + self.de[fidelity].population_ids[parent_id] = config_id + self.de[fidelity].fitness[parent_id] = fitness # updating incumbents - if self.de[budget].fitness[parent_id] < self.inc_score: + if self.de[fidelity].fitness[parent_id] < self.inc_score: self._update_incumbents( - config=self.de[budget].population[parent_id], - score=self.de[budget].fitness[parent_id], + config=self.de[fidelity].population[parent_id], + score=self.de[fidelity].fitness[parent_id], info=info ) # book-keeping self._update_trackers( traj=self.inc_score, runtime=cost, history=( - config.tolist(), float(fitness), float(cost), float(budget), info + config.tolist(), float(fitness), float(cost), float(fidelity), info ) ) # remove processed future @@ -727,7 +751,7 @@ def run(self, fevals=None, brackets=None, total_cost=None, single_node_with_gpus """ Main interface to run optimization by DEHB This function waits on workers and if a worker is free, asks for a configuration and a - budget to evaluate on and submits it to the worker. In each loop, it checks if a job + fidelity to evaluate on and submits it to the worker. In each loop, it checks if a job is complete, fetches the results, carries the necessary processing of it asynchronously to the worker computations. 
@@ -763,7 +787,7 @@ def run(self, fevals=None, brackets=None, total_cost=None, single_node_with_gpus break if self.is_worker_available(): job_info = self._get_next_job() - if brackets is not None and job_info['bracket_id'] >= brackets: + if brackets is not None and job_info["bracket_id"] >= brackets: # ignore submission and only collect results # when brackets are chosen as run budget, an extra bracket is created # since iteration_counter is incremented in _get_next_job() and then checked @@ -780,11 +804,12 @@ def run(self, fevals=None, brackets=None, total_cost=None, single_node_with_gpus # submits job_info to a worker for execution self.submit_job(job_info, **kwargs) if verbose: - budget = job_info['budget'] + fidelity = job_info["fidelity"] + config_id = job_info["config_id"] self._verbosity_runtime(fevals, brackets, total_cost) self.logger.info( - "Evaluating a configuration with budget {} under " - "bracket ID {}".format(budget, job_info['bracket_id']) + "Evaluating configuration {} with fidelity {} under " + "bracket ID {}".format(config_id, fidelity, job_info["bracket_id"]) ) self.logger.info( "Best score seen/Incumbent score: {}".format(self.inc_score) diff --git a/src/dehb/utils/__init__.py b/src/dehb/utils/__init__.py index dd9d4f0..65bbe00 100644 --- a/src/dehb/utils/__init__.py +++ b/src/dehb/utils/__init__.py @@ -1 +1,2 @@ -from .bracket_manager import SHBracketManager \ No newline at end of file +from .bracket_manager import SHBracketManager +from .config_repository import ConfigRepository \ No newline at end of file diff --git a/src/dehb/utils/bracket_manager.py b/src/dehb/utils/bracket_manager.py index 6f2079e..2642223 100644 --- a/src/dehb/utils/bracket_manager.py +++ b/src/dehb/utils/bracket_manager.py @@ -4,93 +4,93 @@ class SHBracketManager(object): """ Synchronous Successive Halving utilities """ - def __init__(self, n_configs, budgets, bracket_id=None): - assert len(n_configs) == len(budgets) + def __init__(self, n_configs, fidelities, 
bracket_id=None): + assert len(n_configs) == len(fidelities) self.n_configs = n_configs - self.budgets = budgets + self.fidelities = fidelities self.bracket_id = bracket_id self.sh_bracket = {} self._sh_bracket = {} self._config_map = {} - for i, budget in enumerate(budgets): + for i, fidelity in enumerate(fidelities): # sh_bracket keeps track of jobs/configs that are still to be scheduled/allocatted # _sh_bracket keeps track of jobs/configs that have been run and results retrieved for # (sh_bracket[i] + _sh_bracket[i]) == n_configs[i] is when no jobs have been scheduled - # or all jobs for that budget/rung are over + # or all jobs for that fidelity/rung are over # (sh_bracket[i] + _sh_bracket[i]) < n_configs[i] indicates a job has been scheduled # and is queued/running and the bracket needs to be paused till results are retrieved - self.sh_bracket[budget] = n_configs[i] # each scheduled job does -= 1 - self._sh_bracket[budget] = 0 # each retrieved job does +=1 - self.n_rungs = len(budgets) + self.sh_bracket[fidelity] = n_configs[i] # each scheduled job does -= 1 + self._sh_bracket[fidelity] = 0 # each retrieved job does +=1 + self.n_rungs = len(fidelities) self.current_rung = 0 - def get_budget(self, rung=None): - """ Returns the exact budget that rung is pointing to. + def get_fidelity(self, rung=None): + """ Returns the exact fidelity that rung is pointing to. - Returns current rung's budget if no rung is passed. + Returns current rung's fidelity if no rung is passed. 
""" if rung is not None: - return self.budgets[rung] - return self.budgets[self.current_rung] + return self.fidelities[rung] + return self.fidelities[self.current_rung] - def get_lower_budget_promotions(self, budget): - """ Returns the immediate lower budget and the number of configs to be promoted from there + def get_lower_fidelity_promotions(self, fidelity): + """ Returns the immediate lower fidelity and the number of configs to be promoted from there """ - assert budget in self.budgets - rung = np.where(budget == self.budgets)[0][0] + assert fidelity in self.fidelities + rung = np.where(fidelity == self.fidelities)[0][0] prev_rung = np.clip(rung - 1, a_min=0, a_max=self.n_rungs-1) - lower_budget = self.budgets[prev_rung] + lower_fidelity = self.fidelities[prev_rung] num_promote_configs = self.n_configs[rung] - return lower_budget, num_promote_configs + return lower_fidelity, num_promote_configs - def get_next_job_budget(self): - """ Returns the budget that will be selected if current_rung is incremented by 1 + def get_next_job_fidelity(self): + """ Returns the fidelity that will be selected if current_rung is incremented by 1 """ - if self.sh_bracket[self.get_budget()] > 0: + if self.sh_bracket[self.get_fidelity()] > 0: # the current rung still has unallocated jobs (>0) - return self.get_budget() + return self.get_fidelity() else: # the current rung has no more jobs to allocate, increment it rung = (self.current_rung + 1) % self.n_rungs - if self.sh_bracket[self.get_budget(rung)] > 0: + if self.sh_bracket[self.get_fidelity(rung)] > 0: # the incremented rung has unallocated jobs (>0) - return self.get_budget(rung) + return self.get_fidelity(rung) else: # all jobs for this bracket has been allocated/bracket is complete - # no more budgets to evaluate and can return None + # no more fidelities to evaluate and can return None pass return None - def register_job(self, budget): - """ Registers the allocation of a configuration for the budget and updates current rung 
+ def register_job(self, fidelity): + """ Registers the allocation of a configuration for the fidelity and updates current rung This function must be called when scheduling a job in order to allow the bracket manager - to continue job and budget allocation without waiting for jobs to finish and return + to continue job and fidelity allocation without waiting for jobs to finish and return results necessarily. This feature can be leveraged to run brackets asynchronously. """ - assert budget in self.budgets - assert self.sh_bracket[budget] > 0 - self.sh_bracket[budget] -= 1 + assert fidelity in self.fidelities + assert self.sh_bracket[fidelity] > 0 + self.sh_bracket[fidelity] -= 1 if not self._is_rung_pending(self.current_rung): # increment current rung if no jobs left in the rung self.current_rung = (self.current_rung + 1) % self.n_rungs - def complete_job(self, budget): - """ Notifies the bracket that a job for a budget has been completed + def complete_job(self, fidelity): + """ Notifies the bracket that a job for a fidelity has been completed - This function must be called when a config for a budget has finished evaluation to inform + This function must be called when a config for a fidelity has finished evaluation to inform the Bracket Manager that no job needs to be waited for and the next rung can begin for the synchronous Successive Halving case. 
""" - assert budget in self.budgets - _max_configs = self.n_configs[list(self.budgets).index(budget)] - assert self._sh_bracket[budget] < _max_configs - self._sh_bracket[budget] += 1 + assert fidelity in self.fidelities + _max_configs = self.n_configs[list(self.fidelities).index(fidelity)] + assert self._sh_bracket[fidelity] < _max_configs + self._sh_bracket[fidelity] += 1 def _is_rung_waiting(self, rung): """ Returns True if at least one job is still pending/running and waits for results """ - job_count = self._sh_bracket[self.budgets[rung]] + self.sh_bracket[self.budgets[rung]] + job_count = self._sh_bracket[self.fidelities[rung]] + self.sh_bracket[self.fidelities[rung]] if job_count < self.n_configs[rung]: return True return False @@ -98,7 +98,7 @@ def _is_rung_waiting(self, rung): def _is_rung_pending(self, rung): """ Returns True if at least one job pending to be allocatted in the rung """ - if self.sh_bracket[self.budgets[rung]] > 0: + if self.sh_bracket[self.fidelities[rung]] > 0: return True return False @@ -116,33 +116,33 @@ def is_bracket_done(self): return ~self.is_pending() and ~self.is_waiting() def is_pending(self): - """ Returns True if any of the rungs/budgets have still a configuration to submit + """ Returns True if any of the rungs/fidelities have still a configuration to submit """ - return np.any([self._is_rung_pending(i) > 0 for i, _ in enumerate(self.budgets)]) + return np.any([self._is_rung_pending(i) > 0 for i, _ in enumerate(self.fidelities)]) def is_waiting(self): - """ Returns True if any of the rungs/budgets have a configuration pending/running + """ Returns True if any of the rungs/fidelities have a configuration pending/running """ - return np.any([self._is_rung_waiting(i) > 0 for i, _ in enumerate(self.budgets)]) + return np.any([self._is_rung_waiting(i) > 0 for i, _ in enumerate(self.fidelities)]) def __repr__(self): - cell_width = 9 + cell_width = 10 cell = "{{:^{}}}".format(cell_width) - budget_cell = 
"{{:^{}.2f}}".format(cell_width) + fidelity_cell = "{{:^{}.2f}}".format(cell_width) header = "|{}|{}|{}|{}|".format( - cell.format("budget"), + cell.format("fidelity"), cell.format("pending"), cell.format("waiting"), cell.format("done") ) _hline = "-" * len(header) table = [header, _hline] - for i, budget in enumerate(self.budgets): - pending = self.sh_bracket[budget] - done = self._sh_bracket[budget] + for i, fidelity in enumerate(self.fidelities): + pending = self.sh_bracket[fidelity] + done = self._sh_bracket[fidelity] waiting = np.abs(self.n_configs[i] - pending - done) entry = "|{}|{}|{}|{}|".format( - budget_cell.format(budget), + fidelity_cell.format(fidelity), cell.format(pending), cell.format(waiting), cell.format(done) diff --git a/src/dehb/utils/config_repository.py b/src/dehb/utils/config_repository.py new file mode 100644 index 0000000..126b28f --- /dev/null +++ b/src/dehb/utils/config_repository.py @@ -0,0 +1,127 @@ +from __future__ import annotations + +from dataclasses import dataclass +from typing import Any + +import numpy as np + + +@dataclass +class ConfigItem: + """Data class to store information regarding a specific configuration. + + The results for this configuration are stored in the `results` dict, using the fidelity it has + been evaluated on as keys. + """ + config_id: int + config: np.ndarray + results: dict[float, ResultItem] + +@dataclass +class ResultItem: + """Data class storing the result information of a specific configuration + fidelity.""" + score: float + cost: float + info: dict[Any, Any] + +class ConfigRepository: + """Bookkeeps all configurations used throughout the course of the optimization. + + Keeps track of the configurations and their results on the different fidelitites. + A new configuration is announced via `announce_config`. After evaluating the configuration + on the specified fidelity, use `tell_result` to log the achieved performance, cost etc. + + The configurations are stored in a list of `ConfigItem`. 
+ """ + def __init__(self) -> None: + """Initializes the class by calling `self.reset`.""" + self.configs : list[ConfigItem] + self.reset() + + def reset(self) -> None: + """Resets the config repository, clearing all collected configurations and results.""" + self.configs = [] + + def announce_config(self, config: np.ndarray, fidelity=None) -> int: + """Announces a new configuration with the respective fidelity it should be evaluated on. + + The configuration is then added to the list of so far seen configurations and the ID of the + configuration is returned. + + Args: + config (np.ndarray): New configuration + fidelity (float, optional): Fidelity on which `config` is evaluated or None. + Defaults to None. + + Returns: + int: ID of configuration + """ + config_id = len(self.configs) + fidelity = float(fidelity or 0) + result_dict = { + fidelity: ResultItem(np.inf, -1, {}), + } + config_item = ConfigItem(config_id, config, result_dict) + self.configs.append(config_item) + return config_id + + def announce_population(self, population: np.ndarray, fidelity=None) -> np.ndarray: + """Announce population, retrieving ids for the population. + + Args: + population (np.ndarray): Population to announce + fidelity (float, optional): Fidelity on which pop is evaluated or None. + Defaults to None. + + Returns: + np.ndarray: population ids + """ + population_ids = [] + for indiv in population: + conf_id = self.announce_config(indiv, float(fidelity or 0)) + population_ids.append(conf_id) + return np.array(population_ids) + + def announce_fidelity(self, config_id: int, fidelity: float) -> bool: + """Announce the evaluation of a new fidelity for a given config. + + This function may only be used if the config already exists in the repository. 
+ + Args: + config_id (int): ID of Configuration + fidelity (float): Fidelity on which the config will be evaluated + + Returns: + bool: Success/Failure of operation + """ + if config_id >= len(self.configs) or config_id < 0: + # TODO: Error message + return False + + config_item = self.configs[config_id] + result_item = ResultItem( + np.inf, -1, {}, + ) + config_item.results[fidelity] = result_item + return True + + def tell_result(self, config_id: int, fidelity: float, score: float, cost: float, info: dict): + """Logs the achieved performance, cost etc. of a specific configuration-fidelity pair. + + Args: + config_id (int): ID of evaluated configuration + fidelity (float): Fidelity on which configuration has been evaluated. + score (float): Achieved score, given by objective function + cost (float): Cost, given by objective function + info (dict): Run info, given by objective function + """ + config_item = self.configs[config_id] + + # If configuration has been promoted, there is no fidelity information yet + if fidelity not in config_item.results: + config_item.results[fidelity] = ResultItem(score, cost, info) + else: + # ResultItem already given for specified fidelity --> update entries + config_item.results[fidelity].score = score + config_item.results[fidelity].cost = cost + config_item.results[fidelity].info = info \ No newline at end of file diff --git a/tests/test_config_repository.py b/tests/test_config_repository.py new file mode 100644 index 0000000..f63e870 --- /dev/null +++ b/tests/test_config_repository.py @@ -0,0 +1,34 @@ +import typing + +import numpy as np +from src.dehb.utils import ConfigRepository + + +class TestConfigAnnouncing(): + """Class that bundles all tests for announcing configurations to the repository.""" + def test_single_config(self): + """Tests announcing single config.""" + repo = ConfigRepository() + config = np.array([0.5]) + + config_id = repo.announce_config(config, 2) + + assert len(repo.configs) == 1 + assert
config_id == 0 + assert repo.configs[config_id].config == config + + def test_population(self): + """Tests announcing a whole population.""" + repo = ConfigRepository() + pop = [] + for i in range(10): + config = np.array([i / 10]) + pop.append(config) + pop = np.array(pop) + + config_ids = repo.announce_population(pop) + + assert len(repo.configs) == 10 + + for conf_id in config_ids: + assert repo.configs[conf_id].config == pop[conf_id] \ No newline at end of file diff --git a/tests/test_de.py b/tests/test_de.py index f64457b..50099b2 100644 --- a/tests/test_de.py +++ b/tests/test_de.py @@ -15,7 +15,7 @@ def create_toy_DEBase(configspace: ConfigSpace.ConfigurationSpace): """ dim = len(configspace.get_hyperparameters()) return DEBase(f=lambda: 1, cs=configspace, dimensions=dim, pop_size=10, max_age=5, - mutation_factor=0.5, crossover_prob=0.5, strategy="rand1_bin", budget=1) + mutation_factor=0.5, crossover_prob=0.5, strategy="rand1_bin", fidelity=1) class TestConversion(): """Class that bundles all ConfigSpace/vector conversion tests. diff --git a/tests/test_dehb.py b/tests/test_dehb.py index 8a96e03..277e959 100644 --- a/tests/test_dehb.py +++ b/tests/test_dehb.py @@ -22,15 +22,15 @@ def create_toy_searchspace(): ConfigSpace.UniformFloatHyperparameter("x0", lower=3, upper=10, log=False)) return cs -def create_toy_optimizer(configspace: ConfigSpace.ConfigurationSpace, min_budget: float, - max_budget: float, eta: int, +def create_toy_optimizer(configspace: ConfigSpace.ConfigurationSpace, min_fidelity: float, + max_fidelity: float, eta: int, objective_function: typing.Callable): """Creates a DEHB instance. 
Args: configspace (ConfigurationSpace): Searchspace to use - min_budget (float): Minimum budget for DEHB - max_budget (float): Maximum budget for DEHB + min_fidelity (float): Minimum fidelity for DEHB + max_fidelity (float): Maximum fidelity for DEHB eta (int): Eta parameter of DEHB objective_function (Callable): Function to optimize @@ -39,16 +39,16 @@ def create_toy_optimizer(configspace: ConfigSpace.ConfigurationSpace, min_budget """ dim = len(configspace.get_hyperparameters()) return DEHB(f=objective_function, cs=configspace, dimensions=dim, - min_budget=min_budget, - max_budget=max_budget, eta=eta, n_workers=1) + min_fidelity=min_fidelity, + max_fidelity=max_fidelity, eta=eta, n_workers=1) -def objective_function(x: ConfigSpace.Configuration, budget: float, **kwargs): +def objective_function(x: ConfigSpace.Configuration, fidelity: float, **kwargs): """Toy objective function. Args: x (ConfigSpace.Configuration): Configuration to evaluate - budget (float): Budget to evaluate x on + fidelity (float): fidelity to evaluate x on Returns: dict: Result dictionary @@ -70,7 +70,7 @@ class TestBudgetExhaustion(): def test_runtime_exhaustion(self): """Test for runtime budget exhaustion.""" cs = create_toy_searchspace() - dehb = create_toy_optimizer(configspace=cs, min_budget=3, max_budget=27, eta=3, + dehb = create_toy_optimizer(configspace=cs, min_fidelity=3, max_fidelity=27, eta=3, objective_function=objective_function) dehb.start = time.time() - 10 @@ -80,7 +80,7 @@ def test_runtime_exhaustion(self): def test_fevals_exhaustion(self): """Test for function evaluations budget exhaustion.""" cs = create_toy_searchspace() - dehb = create_toy_optimizer(configspace=cs, min_budget=3, max_budget=27, eta=3, + dehb = create_toy_optimizer(configspace=cs, min_fidelity=3, max_fidelity=27, eta=3, objective_function=objective_function) dehb.traj.append("Just needed for the test") @@ -90,7 +90,7 @@ def test_fevals_exhaustion(self): def test_brackets_exhaustion(self): """Test for 
bracket budget exhaustion.""" cs = create_toy_searchspace() - dehb = create_toy_optimizer(configspace=cs, min_budget=3, max_budget=27, eta=3, + dehb = create_toy_optimizer(configspace=cs, min_fidelity=3, max_fidelity=27, eta=3, objective_function=objective_function) dehb.iteration_counter = 5 @@ -99,16 +99,52 @@ def test_brackets_exhaustion(self): class TestInitialization: """Class that bundles all tests regarding the initialization of DEHB.""" - def test_higher_min_budget(self): - """Test that verifies, that DEHB breaks if min_budget > max_budget.""" + def test_higher_min_fidelity(self): + """Test that verifies, that DEHB breaks if min_fidelity > max_fidelity.""" cs = create_toy_searchspace() with pytest.raises(AssertionError): - create_toy_optimizer(configspace=cs, min_budget=28, max_budget=27, eta=3, + create_toy_optimizer(configspace=cs, min_fidelity=28, max_fidelity=27, eta=3, objective_function=objective_function) - def test_equal_min_max_budget(self): - """Test that verifies, that DEHB breaks if min_budget == max_budget.""" + def test_equal_min_max_fidelity(self): + """Test that verifies, that DEHB breaks if min_fidelity == max_fidelity.""" cs = create_toy_searchspace() with pytest.raises(AssertionError): - create_toy_optimizer(configspace=cs, min_budget=27, max_budget=27, eta=3, + create_toy_optimizer(configspace=cs, min_fidelity=27, max_fidelity=27, eta=3, objective_function=objective_function) + +class TestConfigID: + """Class that bundles all tests regarding config ID functionality.""" + def test_initialization(self): + """Verifies, that the initial population is properly tracked by the config repository.""" + cs = create_toy_searchspace() + dehb = create_toy_optimizer(configspace=cs, min_fidelity=3, max_fidelity=27, eta=3, + objective_function=objective_function) + # calculate how many configurations have been sampled for the initial populations + num_configs = 0 + for de_inst in dehb.de.values(): + num_configs += len(de_inst.population) + + # config 
repository should be exactly this long + assert len(dehb.config_repository.configs) == num_configs + + def test_single_bracket(self): + """Verifies, that the population is continuously tracked over the run of a single bracket.""" + cs = create_toy_searchspace() + dehb = create_toy_optimizer(configspace=cs, min_fidelity=3, max_fidelity=27, eta=3, + objective_function=objective_function) + # calculate how many configurations have been sampled for the initial populations + num_initial_configs = 0 + for de_inst in dehb.de.values(): + num_initial_configs += len(de_inst.population) + + # run for a single bracket + dehb.run(brackets=1, verbose=True) + + # for the first bracket, we only mutate on the lowest fidelity and then promote the best + # configs to the next fidelity. Please note, that this is only the case for the first + # DEHB bracket! + # Note: The final + 1 is due to the inner workings of DEHB. If the run budget is exhausted, + # we keep evolving new configurations without evaluating them, since we are only waiting + # to fetch all results started ahead of the budget exhaustion. + assert len(dehb.config_repository.configs) == num_initial_configs + 9 + 1 \ No newline at end of file diff --git a/utils/README.md b/utils/README.md index 65f4058..5297e17 100644 --- a/utils/README.md +++ b/utils/README.md @@ -36,8 +36,8 @@ For example, running a DEHB optimization by specifiying `scheduler_file` makes t connect to the Dask cluster runnning. ```bash python examples/03_pytorch_mnist_hpo.py \ - --min_budget 1 \ - --max_budget 9 \ + --min_fidelity 1 \ + --max_fidelity 9 \ --runtime 200 \ --seed 123 \ --scheduler_file scheduler/scheduler_gpu.json \