This repository is a collection of machine learning benchmarks for DeepHyper.
The repository follows this organization:

```
# Python package containing utility code
deephyper_benchmark/

# Library of benchmarks
lib/
```

To install the DeepHyper benchmark suite, run:

```console
git clone https://github.com/deephyper/benchmark.git deephyper_benchmark
cd deephyper_benchmark/
pip install -e "."
```

A benchmark is defined as a sub-folder of the `lib/` folder, such as `lib/Benchmark-101/`. A benchmark folder needs to follow a Python package structure and therefore must contain an `__init__.py` file at its root. In addition, a benchmark folder needs to define a `benchmark.py` script that declares its requirements.
General benchmark structure:

```
lib/
    Benchmark-101/
        __init__.py
        benchmark.py
        data.py
        model.py
        hpo.py       # Defines hyperparameter optimization inputs (run-function + problem)
        README.md    # Description of the benchmark
```
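For illustration, a minimal sketch of what the `hpo.py` module could expose (the hyperparameter `x`, its range, and the toy objective are hypothetical, and depending on the DeepHyper version `HpProblem` lives in `deephyper.hpo` or `deephyper.problem`):

```python
"""Hypothetical hpo.py for Benchmark-101."""

from deephyper.hpo import HpProblem  # older DeepHyper: from deephyper.problem import HpProblem

# Search space with a single real hyperparameter "x" in [0, 1].
problem = HpProblem()
problem.add_hyperparameter((0.0, 1.0), "x")


def run(job):
    # Toy objective standing in for a real training/evaluation pipeline.
    x = job.parameters["x"]
    return 1.0 - (x - 0.5) ** 2  # larger is better (maximization standard, see below)
```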
Then, to use the benchmark:

```python
import deephyper_benchmark as dhb

dhb.install("Benchmark-101")
dhb.load("Benchmark-101")

from deephyper_benchmark.lib.benchmark_101.hpo import problem, run
```

All run-functions (i.e., functions returning the objective(s) to be optimized) should follow the MAXIMIZATION standard. If a benchmark needs minimization, then the negative of the minimized objective can be returned: `return -minimized_objective`.
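For example, a minimal sketch of a run-function wrapping a minimization objective (the quadratic "validation loss" is a placeholder for a real training/evaluation pipeline):

```python
def run(job):
    # Placeholder validation loss that a real benchmark would MINIMIZE.
    x = job.parameters["x"]
    valid_loss = (x - 0.5) ** 2

    # Comply with the MAXIMIZATION standard by returning the negative.
    return -valid_loss
```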
A benchmark inherits from the `Benchmark` class:

```python
import os

from deephyper_benchmark import *

DIR = os.path.dirname(os.path.abspath(__file__))


class Benchmark101(Benchmark):

    version = "0.0.1"

    requires = {
        "bash-install": {
            "type": "cmd",
            "cmd": "cd .. && " + os.path.join(DIR, "../install.sh"),
        },
    }
```

Finally, when testing a benchmark it can be useful to activate logging:
```python
import logging

logging.basicConfig(
    # filename="deephyper.log",  # Uncomment if you want to create a file with the logs
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(filename)s:%(funcName)s - %(message)s",
    force=True,
)
```

Benchmarks can sometimes be configured. The configuration can use environment variables with the prefix `DEEPHYPER_BENCHMARK_`.
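For example, a minimal sketch of how a benchmark module could read such a variable (the `DEEPHYPER_BENCHMARK_N_EPOCHS` name and its default are hypothetical; only the prefix follows the convention above):

```python
import os

# Hypothetical configuration variable: only the DEEPHYPER_BENCHMARK_ prefix
# is prescribed by the convention; the suffix and default are examples.
N_EPOCHS = int(os.environ.get("DEEPHYPER_BENCHMARK_N_EPOCHS", "50"))
```

The variable can then be set before launching an experiment, e.g., `export DEEPHYPER_BENCHMARK_N_EPOCHS=100`.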
Benchmarks must return the following standard metadata when applicable; some metadata are specific to neural networks (e.g., `num_parameters`):

- `num_parameters`: integer value of the number of parameters of the neural network.
- `num_parameters_train`: integer value of the number of trainable parameters of the neural network.
- `budget`: scalar value (float/int) of the budget consumed by the neural network. The budget should therefore be defined for each benchmark (e.g., number of epochs in general).
- `stopped`: boolean value indicating if the evaluation was stopped before consuming the maximum budget.
- `train_X`: scalar value of the training metrics (replace `X` by the metric name, 1 key per metric).
- `valid_X`: scalar value of the validation metrics (replace `X` by the metric name, 1 key per metric).
- `test_X`: scalar value of the testing metrics (replace `X` by the metric name, 1 key per metric).
- `flops`: number of FLOPs of the model, such as computed by `fvcore.nn.FlopCountAnalysis(...).total()` (see documentation).
- `latency`: TO BE CLARIFIED
- `lc_train_X`: recorded learning curves of the trained model, where the `b_i` values are the budget values (e.g., epochs/batches) and the `y_i` values are the recorded metric. `X` in `train_X` is replaced by the name of the metric, such as `train_loss` or `train_accuracy`. The format is `[[b0, y0], [b1, y1], ...]`.
- `lc_valid_X`: same as `lc_train_X` but for validation data.
The `@profile` decorator should be used on all run-functions to collect the `timestamp_start` and `timestamp_end` metadata.
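As an illustration, a minimal sketch of a run-function returning an objective together with some of the standard metadata (the metric values are placeholders, `@profile` is assumed to be importable from `deephyper.evaluator`, and the `"objective"`/`"metadata"` return layout follows DeepHyper's run-function convention):

```python
from deephyper.evaluator import profile


@profile  # records timestamp_start and timestamp_end in the metadata
def run(job):
    # Placeholder values standing in for a real training loop.
    budget = 50  # e.g., number of epochs
    lc_valid_accuracy = [[1, 0.52], [25, 0.81], [budget, 0.90]]  # [[b0, y0], ...]
    valid_accuracy = lc_valid_accuracy[-1][1]

    return {
        "objective": valid_accuracy,  # maximized
        "metadata": {
            "num_parameters": 1_000_000,
            "budget": budget,
            "stopped": False,
            "valid_accuracy": valid_accuracy,
            "lc_valid_accuracy": lc_valid_accuracy,
        },
    }
```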
In the following table:

- $\mathbb{R}$ denotes real parameters.
- $\mathbb{D}$ denotes discrete parameters.
- $\mathbb{C}$ denotes categorical parameters.
| Name | Description | Variable(s) Type | Objective(s) Type | Multi-Objective | Multi-Fidelity | Evaluation Duration |
|---|---|---|---|---|---|---|
| C-BBO | Continuous Black-Box Optimization problems. | | | ❌ | ❌ | configurable |
| DTLZ | The modified DTLZ multiobjective test suite. | | | ✅ | ❌ | configurable |
| ECP-Candle | Deep Neural-Networks on multiple "biological" scales of Cancer related data. | | | ✅ | ✅ | min |
| HPOBench | Hyperparameter Optimization Benchmark. | | | ✅ | ✅ | ms to min |
| JAHSBench | A slightly modified JAHSBench 201 wrapper. | | | ✅ | ❌ | configurable |
| LCu | Learning curve hyperparameter optimization benchmark. | | | | | |
| LCbench | Multi-fidelity benchmark without hyperparameter optimization. | NA | | ❌ | ✅ | seconds |
| PINNBench | Physics Informed Neural Networks Benchmark. | | | ✅ | ✅ | ms |
The following search wrappers are also provided:

- COBYQA: `deephyper_benchmark.search.COBYQA(...)`
- PyBOBYQA: `deephyper_benchmark.search.PyBOBYQA(...)`
- TPE: `deephyper_benchmark.search.MPIDistributedOptuna(..., sampler="TPE")`
- BoTorch: `deephyper_benchmark.search.MPIDistributedOptuna(..., sampler="BOTORCH")`
- CMAES: `deephyper_benchmark.search.MPIDistributedOptuna(..., sampler="CMAES")`
- NSGAII: `deephyper_benchmark.search.MPIDistributedOptuna(..., sampler="NSGAII")`
- QMC: `deephyper_benchmark.search.MPIDistributedOptuna(..., sampler="QMC")`
- SMAC: `deephyper_benchmark.search.SMAC(...)`
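For illustration only, a sketch of how such a wrapper might be used, assuming it follows DeepHyper's usual search interface (a constructor taking the problem and run-function, and a `search(max_evals=...)` method); the actual constructor arguments of these wrappers may differ (e.g., MPI communicator, random seed, log directory):

```python
from deephyper_benchmark.lib.benchmark_101.hpo import problem, run
from deephyper_benchmark.search import MPIDistributedOptuna

# Assumption: the wrapper exposes the usual DeepHyper search interface;
# check its signature for the exact required arguments.
search = MPIDistributedOptuna(problem, run, sampler="TPE")
results = search.search(max_evals=100)
```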