diff --git a/examples/algorithm-stability-yaml/analyze-results.ipynb b/examples/algorithm-stability-yaml/analyze-results.ipynb
new file mode 100644
index 0000000..e7b9c4e
--- /dev/null
+++ b/examples/algorithm-stability-yaml/analyze-results.ipynb
@@ -0,0 +1,1476 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Hi! Let's analyze the results of the experiment you just ran. To recap:\n",
+ "\n",
+ "1. You just ran something similar to:\n",
+ "\n",
+ " `python benchmark.py --multirun ranker=\"glob(*)\" +callbacks.to_sql.url=\"sqlite:///$HOME/results.sqlite\"`\n",
+ "2. There now should exist a `.sqlite` file at this path: `$HOME/results.sqlite`:\n",
+ "\n",
+ " ```\n",
+ " $ ls -al $HOME/results.sqlite\n",
+ " -rw-r--r-- 1 vscode vscode 20480 Sep 21 08:16 /home/vscode/results.sqlite\n",
+ " ```\n",
+ "\n",
+ "Let's now analyze the results! 📈"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "First, we will install `plotly-express`, so we can make nice plots later."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Note: you may need to restart the kernel to use updated packages.\n"
+ ]
+ }
+ ],
+ "source": [
+ "%pip install plotly-express nbconvert --quiet"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Figure out the SQL connection URI."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "'sqlite:////home/vscode/results.sqlite'"
+ ]
+ },
+ "execution_count": 2,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "import os\n",
+ "\n",
+ "con: str = \"sqlite:///\" + os.environ[\"HOME\"] + \"/results.sqlite\"\n",
+ "con"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Read in the `experiments` table. This table contains metadata for all 'experiments' that have been run."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "
\n",
+ "\n",
+ "
\n",
+ " \n",
+ " \n",
+ " | \n",
+ " dataset | \n",
+ " dataset/n | \n",
+ " dataset/p | \n",
+ " dataset/task | \n",
+ " dataset/group | \n",
+ " dataset/domain | \n",
+ " ranker | \n",
+ " validator | \n",
+ " local_dir | \n",
+ " date_created | \n",
+ "
\n",
+ " \n",
+ " id | \n",
+ " | \n",
+ " | \n",
+ " | \n",
+ " | \n",
+ " | \n",
+ " | \n",
+ " | \n",
+ " | \n",
+ " | \n",
+ " | \n",
+ "
\n",
+ " \n",
+ " \n",
+ " \n",
+ " 38vqcwus | \n",
+ " Synclf hard | \n",
+ " 10000 | \n",
+ " 50 | \n",
+ " classification | \n",
+ " Synclf | \n",
+ " synthetic | \n",
+ " Boruta | \n",
+ " k-NN | \n",
+ " /workspaces/fseval/examples/algorithm-stabilit... | \n",
+ " 2022-09-21 08:22:28.965510 | \n",
+ "
\n",
+ " \n",
+ " y6bb1hcc | \n",
+ " Synclf hard | \n",
+ " 1000 | \n",
+ " 50 | \n",
+ " classification | \n",
+ " Synclf | \n",
+ " synthetic | \n",
+ " Boruta | \n",
+ " k-NN | \n",
+ " /workspaces/fseval/examples/algorithm-stabilit... | \n",
+ " 2022-09-21 08:22:53.609396 | \n",
+ "
\n",
+ " \n",
+ " 3vtr13pg | \n",
+ " Synclf hard | \n",
+ " 1000 | \n",
+ " 50 | \n",
+ " classification | \n",
+ " Synclf | \n",
+ " synthetic | \n",
+ " ReliefF | \n",
+ " k-NN | \n",
+ " /workspaces/fseval/examples/algorithm-stabilit... | \n",
+ " 2022-09-21 08:25:09.974370 | \n",
+ "
\n",
+ " \n",
+ "
\n",
+ "
"
+ ],
+ "text/plain": [
+ " dataset dataset/n dataset/p dataset/task dataset/group \\\n",
+ "id \n",
+ "38vqcwus Synclf hard 10000 50 classification Synclf \n",
+ "y6bb1hcc Synclf hard 1000 50 classification Synclf \n",
+ "3vtr13pg Synclf hard 1000 50 classification Synclf \n",
+ "\n",
+ " dataset/domain ranker validator \\\n",
+ "id \n",
+ "38vqcwus synthetic Boruta k-NN \n",
+ "y6bb1hcc synthetic Boruta k-NN \n",
+ "3vtr13pg synthetic ReliefF k-NN \n",
+ "\n",
+ " local_dir \\\n",
+ "id \n",
+ "38vqcwus /workspaces/fseval/examples/algorithm-stabilit... \n",
+ "y6bb1hcc /workspaces/fseval/examples/algorithm-stabilit... \n",
+ "3vtr13pg /workspaces/fseval/examples/algorithm-stabilit... \n",
+ "\n",
+ " date_created \n",
+ "id \n",
+ "38vqcwus 2022-09-21 08:22:28.965510 \n",
+ "y6bb1hcc 2022-09-21 08:22:53.609396 \n",
+ "3vtr13pg 2022-09-21 08:25:09.974370 "
+ ]
+ },
+ "execution_count": 3,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "import pandas as pd\n",
+ "\n",
+ "experiments: pd.DataFrame = pd.read_sql_table(\"experiments\", con=con, index_col=\"id\")\n",
+ "experiments"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "That's looking good 🙌🏻.\n",
+ "\n",
+ "Now, let's read in the `stability` table. We put data in this table by using our custom-made metric, defined in the `StabilityNogueira` class in `benchmark.py`. There, we push data to this table using `callbacks.on_table`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "
\n",
+ " \n",
+ " \n",
+ " | \n",
+ " index | \n",
+ " stability | \n",
+ "
\n",
+ " \n",
+ " id | \n",
+ " | \n",
+ " | \n",
+ "
\n",
+ " \n",
+ " \n",
+ " \n",
+ " y6bb1hcc | \n",
+ " 0 | \n",
+ " 0.933546 | \n",
+ "
\n",
+ " \n",
+ " 3vtr13pg | \n",
+ " 0 | \n",
+ " 1.000000 | \n",
+ "
\n",
+ " \n",
+ "
\n",
+ "
"
+ ],
+ "text/plain": [
+ " index stability\n",
+ "id \n",
+ "y6bb1hcc 0 0.933546\n",
+ "3vtr13pg 0 1.000000"
+ ]
+ },
+ "execution_count": 4,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "stability: pd.DataFrame = pd.read_sql_table(\"stability\", con=con, index_col=\"id\")\n",
+ "stability"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Cool. Now let's join the experiments with their actual metrics."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "
\n",
+ " \n",
+ " \n",
+ " | \n",
+ " index | \n",
+ " stability | \n",
+ " dataset | \n",
+ " dataset/n | \n",
+ " dataset/p | \n",
+ " dataset/task | \n",
+ " dataset/group | \n",
+ " dataset/domain | \n",
+ " ranker | \n",
+ " validator | \n",
+ " local_dir | \n",
+ " date_created | \n",
+ "
\n",
+ " \n",
+ " id | \n",
+ " | \n",
+ " | \n",
+ " | \n",
+ " | \n",
+ " | \n",
+ " | \n",
+ " | \n",
+ " | \n",
+ " | \n",
+ " | \n",
+ " | \n",
+ " | \n",
+ "
\n",
+ " \n",
+ " \n",
+ " \n",
+ " y6bb1hcc | \n",
+ " 0 | \n",
+ " 0.933546 | \n",
+ " Synclf hard | \n",
+ " 1000 | \n",
+ " 50 | \n",
+ " classification | \n",
+ " Synclf | \n",
+ " synthetic | \n",
+ " Boruta | \n",
+ " k-NN | \n",
+ " /workspaces/fseval/examples/algorithm-stabilit... | \n",
+ " 2022-09-21 08:22:53.609396 | \n",
+ "
\n",
+ " \n",
+ " 3vtr13pg | \n",
+ " 0 | \n",
+ " 1.000000 | \n",
+ " Synclf hard | \n",
+ " 1000 | \n",
+ " 50 | \n",
+ " classification | \n",
+ " Synclf | \n",
+ " synthetic | \n",
+ " ReliefF | \n",
+ " k-NN | \n",
+ " /workspaces/fseval/examples/algorithm-stabilit... | \n",
+ " 2022-09-21 08:25:09.974370 | \n",
+ "
\n",
+ " \n",
+ "
\n",
+ "
"
+ ],
+ "text/plain": [
+ " index stability dataset dataset/n dataset/p dataset/task \\\n",
+ "id \n",
+ "y6bb1hcc 0 0.933546 Synclf hard 1000 50 classification \n",
+ "3vtr13pg 0 1.000000 Synclf hard 1000 50 classification \n",
+ "\n",
+ " dataset/group dataset/domain ranker validator \\\n",
+ "id \n",
+ "y6bb1hcc Synclf synthetic Boruta k-NN \n",
+ "3vtr13pg Synclf synthetic ReliefF k-NN \n",
+ "\n",
+ " local_dir \\\n",
+ "id \n",
+ "y6bb1hcc /workspaces/fseval/examples/algorithm-stabilit... \n",
+ "3vtr13pg /workspaces/fseval/examples/algorithm-stabilit... \n",
+ "\n",
+ " date_created \n",
+ "id \n",
+ "y6bb1hcc 2022-09-21 08:22:53.609396 \n",
+ "3vtr13pg 2022-09-21 08:25:09.974370 "
+ ]
+ },
+ "execution_count": 5,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "stability_experiments = stability.join(experiments)\n",
+ "stability_experiments"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Finally, we can plot the results so we can get a better grasp of what's going on:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "application/vnd.plotly.v1+json": {
+ "config": {
+ "plotlyServerURL": "https://plot.ly"
+ },
+ "data": [
+ {
+ "alignmentgroup": "True",
+ "hovertemplate": "ranker=%{x}
stability=%{y}",
+ "legendgroup": "",
+ "marker": {
+ "color": "#636efa",
+ "pattern": {
+ "shape": ""
+ }
+ },
+ "name": "",
+ "offsetgroup": "",
+ "orientation": "v",
+ "showlegend": false,
+ "textposition": "auto",
+ "type": "bar",
+ "x": [
+ "Boruta",
+ "ReliefF"
+ ],
+ "xaxis": "x",
+ "y": [
+ 0.9335459861775651,
+ 1
+ ],
+ "yaxis": "y"
+ }
+ ],
+ "layout": {
+ "barmode": "relative",
+ "legend": {
+ "tracegroupgap": 0
+ },
+ "margin": {
+ "t": 60
+ },
+ "template": {
+ "data": {
+ "bar": [
+ {
+ "error_x": {
+ "color": "#2a3f5f"
+ },
+ "error_y": {
+ "color": "#2a3f5f"
+ },
+ "marker": {
+ "line": {
+ "color": "#E5ECF6",
+ "width": 0.5
+ },
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "bar"
+ }
+ ],
+ "barpolar": [
+ {
+ "marker": {
+ "line": {
+ "color": "#E5ECF6",
+ "width": 0.5
+ },
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "barpolar"
+ }
+ ],
+ "carpet": [
+ {
+ "aaxis": {
+ "endlinecolor": "#2a3f5f",
+ "gridcolor": "white",
+ "linecolor": "white",
+ "minorgridcolor": "white",
+ "startlinecolor": "#2a3f5f"
+ },
+ "baxis": {
+ "endlinecolor": "#2a3f5f",
+ "gridcolor": "white",
+ "linecolor": "white",
+ "minorgridcolor": "white",
+ "startlinecolor": "#2a3f5f"
+ },
+ "type": "carpet"
+ }
+ ],
+ "choropleth": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "choropleth"
+ }
+ ],
+ "contour": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "contour"
+ }
+ ],
+ "contourcarpet": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "contourcarpet"
+ }
+ ],
+ "heatmap": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "heatmap"
+ }
+ ],
+ "heatmapgl": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "heatmapgl"
+ }
+ ],
+ "histogram": [
+ {
+ "marker": {
+ "pattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ }
+ },
+ "type": "histogram"
+ }
+ ],
+ "histogram2d": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "histogram2d"
+ }
+ ],
+ "histogram2dcontour": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "histogram2dcontour"
+ }
+ ],
+ "mesh3d": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "type": "mesh3d"
+ }
+ ],
+ "parcoords": [
+ {
+ "line": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "parcoords"
+ }
+ ],
+ "pie": [
+ {
+ "automargin": true,
+ "type": "pie"
+ }
+ ],
+ "scatter": [
+ {
+ "fillpattern": {
+ "fillmode": "overlay",
+ "size": 10,
+ "solidity": 0.2
+ },
+ "type": "scatter"
+ }
+ ],
+ "scatter3d": [
+ {
+ "line": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatter3d"
+ }
+ ],
+ "scattercarpet": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattercarpet"
+ }
+ ],
+ "scattergeo": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattergeo"
+ }
+ ],
+ "scattergl": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattergl"
+ }
+ ],
+ "scattermapbox": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scattermapbox"
+ }
+ ],
+ "scatterpolar": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterpolar"
+ }
+ ],
+ "scatterpolargl": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterpolargl"
+ }
+ ],
+ "scatterternary": [
+ {
+ "marker": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "type": "scatterternary"
+ }
+ ],
+ "surface": [
+ {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ },
+ "colorscale": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "type": "surface"
+ }
+ ],
+ "table": [
+ {
+ "cells": {
+ "fill": {
+ "color": "#EBF0F8"
+ },
+ "line": {
+ "color": "white"
+ }
+ },
+ "header": {
+ "fill": {
+ "color": "#C8D4E3"
+ },
+ "line": {
+ "color": "white"
+ }
+ },
+ "type": "table"
+ }
+ ]
+ },
+ "layout": {
+ "annotationdefaults": {
+ "arrowcolor": "#2a3f5f",
+ "arrowhead": 0,
+ "arrowwidth": 1
+ },
+ "autotypenumbers": "strict",
+ "coloraxis": {
+ "colorbar": {
+ "outlinewidth": 0,
+ "ticks": ""
+ }
+ },
+ "colorscale": {
+ "diverging": [
+ [
+ 0,
+ "#8e0152"
+ ],
+ [
+ 0.1,
+ "#c51b7d"
+ ],
+ [
+ 0.2,
+ "#de77ae"
+ ],
+ [
+ 0.3,
+ "#f1b6da"
+ ],
+ [
+ 0.4,
+ "#fde0ef"
+ ],
+ [
+ 0.5,
+ "#f7f7f7"
+ ],
+ [
+ 0.6,
+ "#e6f5d0"
+ ],
+ [
+ 0.7,
+ "#b8e186"
+ ],
+ [
+ 0.8,
+ "#7fbc41"
+ ],
+ [
+ 0.9,
+ "#4d9221"
+ ],
+ [
+ 1,
+ "#276419"
+ ]
+ ],
+ "sequential": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ],
+ "sequentialminus": [
+ [
+ 0,
+ "#0d0887"
+ ],
+ [
+ 0.1111111111111111,
+ "#46039f"
+ ],
+ [
+ 0.2222222222222222,
+ "#7201a8"
+ ],
+ [
+ 0.3333333333333333,
+ "#9c179e"
+ ],
+ [
+ 0.4444444444444444,
+ "#bd3786"
+ ],
+ [
+ 0.5555555555555556,
+ "#d8576b"
+ ],
+ [
+ 0.6666666666666666,
+ "#ed7953"
+ ],
+ [
+ 0.7777777777777778,
+ "#fb9f3a"
+ ],
+ [
+ 0.8888888888888888,
+ "#fdca26"
+ ],
+ [
+ 1,
+ "#f0f921"
+ ]
+ ]
+ },
+ "colorway": [
+ "#636efa",
+ "#EF553B",
+ "#00cc96",
+ "#ab63fa",
+ "#FFA15A",
+ "#19d3f3",
+ "#FF6692",
+ "#B6E880",
+ "#FF97FF",
+ "#FECB52"
+ ],
+ "font": {
+ "color": "#2a3f5f"
+ },
+ "geo": {
+ "bgcolor": "white",
+ "lakecolor": "white",
+ "landcolor": "#E5ECF6",
+ "showlakes": true,
+ "showland": true,
+ "subunitcolor": "white"
+ },
+ "hoverlabel": {
+ "align": "left"
+ },
+ "hovermode": "closest",
+ "mapbox": {
+ "style": "light"
+ },
+ "paper_bgcolor": "white",
+ "plot_bgcolor": "#E5ECF6",
+ "polar": {
+ "angularaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "bgcolor": "#E5ECF6",
+ "radialaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ }
+ },
+ "scene": {
+ "xaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ },
+ "yaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ },
+ "zaxis": {
+ "backgroundcolor": "#E5ECF6",
+ "gridcolor": "white",
+ "gridwidth": 2,
+ "linecolor": "white",
+ "showbackground": true,
+ "ticks": "",
+ "zerolinecolor": "white"
+ }
+ },
+ "shapedefaults": {
+ "line": {
+ "color": "#2a3f5f"
+ }
+ },
+ "ternary": {
+ "aaxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "baxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ },
+ "bgcolor": "#E5ECF6",
+ "caxis": {
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": ""
+ }
+ },
+ "title": {
+ "x": 0.05
+ },
+ "xaxis": {
+ "automargin": true,
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": "",
+ "title": {
+ "standoff": 15
+ },
+ "zerolinecolor": "white",
+ "zerolinewidth": 2
+ },
+ "yaxis": {
+ "automargin": true,
+ "gridcolor": "white",
+ "linecolor": "white",
+ "ticks": "",
+ "title": {
+ "standoff": 15
+ },
+ "zerolinecolor": "white",
+ "zerolinewidth": 2
+ }
+ }
+ },
+ "xaxis": {
+ "anchor": "y",
+ "domain": [
+ 0,
+ 1
+ ],
+ "title": {
+ "text": "ranker"
+ }
+ },
+ "yaxis": {
+ "anchor": "x",
+ "domain": [
+ 0,
+ 1
+ ],
+ "title": {
+ "text": "stability"
+ }
+ }
+ }
+ },
+ "text/html": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "import plotly.express as px\n",
+ "\n",
+ "px.bar(stability_experiments,\n",
+ " x=\"ranker\",\n",
+ " y=\"stability\"\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "We can now observe that for Boruta and ReliefF, ReliefF is the most 'stable' given this dataset, getting 100% the same features for all 10 bootstraps that were run."
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3.9.12 64-bit",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.9.12"
+ },
+ "orig_nbformat": 4,
+ "vscode": {
+ "interpreter": {
+ "hash": "949777d72b0d2535278d3dc13498b2535136f6dfe0678499012e853ee9abcab1"
+ }
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/examples/algorithm-stability-yaml/benchmark.py b/examples/algorithm-stability-yaml/benchmark.py
new file mode 100644
index 0000000..099c905
--- /dev/null
+++ b/examples/algorithm-stability-yaml/benchmark.py
@@ -0,0 +1,111 @@
+from typing import Dict, Optional, Union
+
+import hydra
+import numpy as np
+import pandas as pd
+from skrebate import ReliefF
+
+from fseval.config import PipelineConfig
+from fseval.main import run_pipeline
+from fseval.types import AbstractEstimator, AbstractMetric, Callback
+
+"""
+The checkInputType and getStability functions come from the following paper:
+
+[1] On the Stability of Feature Selection. Sarah Nogueira, Konstantinos Sechidis, Gavin Brown.
+    Journal of Machine Learning Research (JMLR). 2017.
+You can find a full demo using this package at:
+http://htmlpreview.github.io/?https://github.com/nogueirs/JMLR2017/blob/master/python/stabilityDemo.html
+NB: This package requires the installation of the packages: numpy, scipy and math
+"""
+
+
+def checkInputType(Z):
+ """This function checks that Z is of the rigt type and dimension.
+ It raises an exception if not.
+ OUTPUT: The input Z as a numpy.ndarray
+ """
+ ### We check that Z is a list or a numpy.array
+ if isinstance(Z, list):
+ Z = np.asarray(Z)
+ elif not isinstance(Z, np.ndarray):
+ raise ValueError("The input matrix Z should be of type list or numpy.ndarray")
+ ### We check if Z is a matrix (2 dimensions)
+ if Z.ndim != 2:
+ raise ValueError("The input matrix Z should be of dimension 2")
+ return Z
+
+
+def getStability(Z):
+ """
+ Let us assume we have M>1 feature sets and d>0 features in total.
+ This function computes the stability estimate as given in Definition 4 in [1].
+
+ INPUT: A BINARY matrix Z (given as a list or as a numpy.ndarray of size M*d).
+ Each row of the binary matrix represents a feature set, where a 1 at the f^th position
+ means the f^th feature has been selected and a 0 means it has not been selected.
+
+ OUTPUT: The stability of the feature selection procedure
+ """
+ Z = checkInputType(Z)
+ M, d = Z.shape
+ hatPF = np.mean(Z, axis=0)
+ kbar = np.sum(hatPF)
+ denom = (kbar / d) * (1 - kbar / d)
+ return 1 - (M / (M - 1)) * np.mean(np.multiply(hatPF, 1 - hatPF)) / denom
+
+
+class StabilityNogueira(AbstractMetric):
+ def score_bootstrap(
+ self,
+ ranker: AbstractEstimator,
+ validator: AbstractEstimator,
+ callbacks: Callback,
+ scores: Dict,
+ **kwargs,
+ ) -> Dict:
+ # compute stability and send to table
+ Z = np.array(self.support_matrix)
+ Z = Z.astype(int)
+ stability = getStability(Z)
+ stability_df = pd.DataFrame([{"stability": stability}])
+ callbacks.on_table(stability_df, "stability")
+
+ # set in scores dict
+ scores["stability"] = stability
+
+ return scores
+
+ def score_ranking(
+ self,
+ scores: Union[Dict, pd.DataFrame],
+ ranker: AbstractEstimator,
+ bootstrap_state: int,
+ callbacks: Callback,
+ feature_importances: Optional[np.ndarray] = None,
+ ):
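+        # Called once per bootstrap: collect the ranker's boolean feature-support
+        # vector, so `score_bootstrap` can later stack them into the binary matrix Z.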
+ support_matrix = getattr(self, "support_matrix", [])
+ self.support_matrix = support_matrix
+ self.support_matrix.append(ranker.feature_support_)
+
+
+class ReliefF_FeatureSelection(ReliefF):
+ def fit(self, X, y):
+ super(ReliefF_FeatureSelection, self).fit(X, y)
+
+ # extract feature subset from ReliefF
+ feature_subset = self.top_features_[: self.n_features_to_select]
+
+ # set `support_` vector
+ _, p = np.shape(X)
+ self.support_ = np.zeros(p, dtype=bool)
+ self.support_[feature_subset] = True
+
+
+@hydra.main(config_path="conf", config_name="my_config")
+def main(cfg: PipelineConfig) -> None:
+ run_pipeline(cfg)
+
+
+if __name__ == "__main__":
+ main()
diff --git a/examples/algorithm-stability-yaml/conf/dataset/synclf_hard.yaml b/examples/algorithm-stability-yaml/conf/dataset/synclf_hard.yaml
new file mode 100644
index 0000000..325d064
--- /dev/null
+++ b/examples/algorithm-stability-yaml/conf/dataset/synclf_hard.yaml
@@ -0,0 +1,18 @@
+name: Synclf hard
+task: classification
+domain: synthetic
+group: Synclf
+adapter:
+ _target_: sklearn.datasets.make_classification
+ class_sep: 0.8
+ n_classes: 3
+ n_clusters_per_class: 3
+ n_features: 50
+ n_informative: 4
+ n_redundant: 0
+ n_repeated: 0
+ n_samples: 1000
+ random_state: 0
+ shuffle: false
+feature_importances:
+ X[:, 0:4]: 1.0
diff --git a/examples/algorithm-stability-yaml/conf/metrics/stability_nogueira.yaml b/examples/algorithm-stability-yaml/conf/metrics/stability_nogueira.yaml
new file mode 100644
index 0000000..5915374
--- /dev/null
+++ b/examples/algorithm-stability-yaml/conf/metrics/stability_nogueira.yaml
@@ -0,0 +1,3 @@
+# @package metrics
+ranking_scores:
+ _target_: benchmark.StabilityNogueira
diff --git a/examples/algorithm-stability-yaml/conf/my_config.yaml b/examples/algorithm-stability-yaml/conf/my_config.yaml
new file mode 100644
index 0000000..39d9662
--- /dev/null
+++ b/examples/algorithm-stability-yaml/conf/my_config.yaml
@@ -0,0 +1,11 @@
+defaults:
+ - base_pipeline_config
+ - _self_
+ - override dataset: synclf_hard
+ - override validator: knn
+ - override /callbacks:
+ - to_sql
+ - override /metrics:
+ - stability_nogueira
+
+n_bootstraps: 10
diff --git a/examples/algorithm-stability-yaml/conf/ranker/boruta.yaml b/examples/algorithm-stability-yaml/conf/ranker/boruta.yaml
new file mode 100644
index 0000000..ef7e0e4
--- /dev/null
+++ b/examples/algorithm-stability-yaml/conf/ranker/boruta.yaml
@@ -0,0 +1,11 @@
+name: Boruta
+estimator:
+ _target_: boruta.boruta_py.BorutaPy
+ estimator:
+ _target_: sklearn.ensemble.RandomForestClassifier
+ n_estimators: auto
+_estimator_type: classifier
+multioutput: false
+estimates_feature_importances: false
+estimates_feature_support: true
+estimates_feature_ranking: true
\ No newline at end of file
diff --git a/examples/algorithm-stability-yaml/conf/ranker/relieff.yaml b/examples/algorithm-stability-yaml/conf/ranker/relieff.yaml
new file mode 100644
index 0000000..3b36a4c
--- /dev/null
+++ b/examples/algorithm-stability-yaml/conf/ranker/relieff.yaml
@@ -0,0 +1,7 @@
+name: ReliefF
+estimator:
+ _target_: benchmark.ReliefF_FeatureSelection
+ n_features_to_select: 10 # select best 10 features in feature subset.
+_estimator_type: classifier
+estimates_feature_importances: true
+estimates_feature_support: true
diff --git a/examples/algorithm-stability-yaml/conf/validator/knn.yaml b/examples/algorithm-stability-yaml/conf/validator/knn.yaml
new file mode 100644
index 0000000..7a3b4c5
--- /dev/null
+++ b/examples/algorithm-stability-yaml/conf/validator/knn.yaml
@@ -0,0 +1,6 @@
+name: k-NN
+estimator:
+ _target_: sklearn.neighbors.KNeighborsClassifier
+_estimator_type: classifier
+multioutput: false
+estimates_target: true
diff --git a/fseval/callbacks/to_sql.py b/fseval/callbacks/to_sql.py
index 4c4f410..3d88cbe 100644
--- a/fseval/callbacks/to_sql.py
+++ b/fseval/callbacks/to_sql.py
@@ -3,11 +3,11 @@
from typing import Dict
import pandas as pd
-from fseval.config.callbacks.to_sql import ToSQLCallback
-from fseval.types import TerminalColor
from omegaconf import MISSING, DictConfig
from sqlalchemy import create_engine
-from sqlalchemy.pool import NullPool
+
+from fseval.config.callbacks.to_sql import ToSQLCallback
+from fseval.types import TerminalColor
from ._base_export_callback import BaseExportCallback
diff --git a/fseval/pipelines/_experiment.py b/fseval/pipelines/_experiment.py
index 14c7e70..06170a8 100644
--- a/fseval/pipelines/_experiment.py
+++ b/fseval/pipelines/_experiment.py
@@ -7,10 +7,10 @@
import numpy as np
import pandas as pd
+from humanfriendly import format_timespan
+
from fseval.pipeline.estimator import Estimator
from fseval.types import AbstractEstimator, Callback, TerminalColor
-from humanfriendly import format_timespan
-from sqlalchemy.engine import Engine
@dataclass
diff --git a/fseval/pipelines/rank_and_validate/_support_validator.py b/fseval/pipelines/rank_and_validate/_support_validator.py
index 9acda9f..f4db467 100644
--- a/fseval/pipelines/rank_and_validate/_support_validator.py
+++ b/fseval/pipelines/rank_and_validate/_support_validator.py
@@ -63,10 +63,11 @@ def score(self, X, y, **kwargs) -> Union[Dict, pd.DataFrame, np.generic, None]:
scores = pd.DataFrame([scores_dict])
# add custom metrics
+ X_, y_ = self._prepare_data(X, y)
+
for metric_name, metric_class in self.metrics.items():
- X, y = self._prepare_data(X, y)
scores_metric = metric_class.score_support( # type: ignore
- scores, self.validator, X, y, self.callbacks
+ scores, self.validator, X_, y_, self.callbacks
) # type: ignore
if scores_metric is not None:
diff --git a/tests/integration/test_main.py b/tests/integration/test_main.py
index 81d64c8..bbaf72d 100644
--- a/tests/integration/test_main.py
+++ b/tests/integration/test_main.py
@@ -2,9 +2,10 @@
import tempfile
import pytest
+from hydra.conf import ConfigStore
+
from fseval.config import EstimatorConfig, PipelineConfig
from fseval.main import run_pipeline
-from fseval.types import IncompatibilityError
from fseval.utils.hydra_utils import get_config
from hydra.conf import ConfigStore
from hydra.errors import InstantiationException
diff --git a/website/docs/_recipes/algorithm-stability.md b/website/docs/_recipes/algorithm-stability.md
deleted file mode 100644
index 9ac8290..0000000
--- a/website/docs/_recipes/algorithm-stability.md
+++ /dev/null
@@ -1 +0,0 @@
-# Analyze algorithm stability
\ No newline at end of file
diff --git a/website/docs/_recipes/running-on-aws.md b/website/docs/_recipes/running-on-aws.md
deleted file mode 100644
index ab76c2f..0000000
--- a/website/docs/_recipes/running-on-aws.md
+++ /dev/null
@@ -1,3 +0,0 @@
-# Running on AWS
-
-evaluate!
\ No newline at end of file
diff --git a/website/docs/_recipes/running-on-slurm.md b/website/docs/_recipes/running-on-slurm.md
deleted file mode 100644
index 7474275..0000000
--- a/website/docs/_recipes/running-on-slurm.md
+++ /dev/null
@@ -1 +0,0 @@
-# Running on a SLURM cluster
\ No newline at end of file
diff --git a/website/docs/quick-start.mdx b/website/docs/quick-start.mdx
index bdb48da..ca1b704 100644
--- a/website/docs/quick-start.mdx
+++ b/website/docs/quick-start.mdx
@@ -144,6 +144,14 @@ We can now decide how to export the results. We can upload our results to a live
sql_con=sqlite:////Users/dunnkers/Downloads/results.sqlite # any well-defined database URL
```
+:::note Relative vs absolute paths
+
+If you define a _relative_ database URL, like `sql_con=sqlite:///./results.sqlite`, the results are saved right where Hydra stores its individual run files. In other words, a separate `.sqlite` file ends up in each `./multirun` subfolder.
+
+To prevent this and store all results in a single `.sqlite` file, use an **absolute** path, like above. Preferably, though, use a proper database server - see the recipes for more instructions on this.
+
+:::
+
We are now ready to run an experiment. In a terminal, `cd` into the unzipped example directory and run the following:
```shell
python benchmark.py --multirun ranker='glob(*)' +callbacks.to_sql.url=$sql_con
diff --git a/website/docs/_recipes/_category_.json b/website/docs/recipes/_category_.json
similarity index 100%
rename from website/docs/_recipes/_category_.json
rename to website/docs/recipes/_category_.json
diff --git a/website/docs/recipes/algorithm-stability.md b/website/docs/recipes/algorithm-stability.md
new file mode 100644
index 0000000..3f9d42c
--- /dev/null
+++ b/website/docs/recipes/algorithm-stability.md
@@ -0,0 +1,320 @@
+# Analyze algorithm stability
+
+For many applications, it is important that the algorithms used are sufficiently **stable**: when a different sample of data is drawn from the same distribution, the results should turn out similar. This, combined with any inherent stochasticity of the algorithm, makes up the _stability_ of the algorithm. The same applies to Feature Selection and Feature Ranking algorithms.
+
+So, let's run such an experiment! We are going to compare the stability of [ReliefF](https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.56.4740&rep=rep1&type=pdf) with that of [Boruta](https://www.jstatsoft.org/article/view/v036i11), two popular feature selection algorithms, using the stability metric introduced in [Nogueira et al, 2018](https://www.jmlr.org/papers/volume18/17-514/17-514.pdf).
+
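+In short, every bootstrap yields one selected feature subset, and the metric measures how much these subsets agree across bootstraps, corrected for chance. As a rough sketch in Python (this mirrors the `getStability` function shipped in the example's `benchmark.py`; the variable names here are only illustrative):
+
+```python
+import numpy as np
+
+
+def stability(Z: np.ndarray) -> float:
+    """Nogueira et al. stability of M feature subsets over d features in total.
+
+    Z is a binary M x d matrix: Z[m, f] == 1 iff feature f was selected in bootstrap m.
+    Returns 1 when every bootstrap selected exactly the same subset.
+    """
+    M, d = Z.shape
+    p_hat = Z.mean(axis=0)  # per-feature selection frequency
+    k_bar = p_hat.sum()  # average subset size
+    denom = (k_bar / d) * (1 - k_bar / d)
+    return 1 - (M / (M - 1)) * np.mean(p_hat * (1 - p_hat)) / denom
+```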
+
+## The experiment
+
+We are going to run an experiment with the following [configuration](https://github.com/dunnkers/fseval/tree/master/examples/algorithm-stability-yaml/).
+
+Download the experiment config: [algorithm-stability-yaml.zip](pathname:///fseval/zipped-examples/algorithm-stability-yaml.zip)
+
+Most notable are the following configuration settings:
+
+```yaml title="my_config.yaml"
+defaults:
+ - base_pipeline_config
+ - _self_
+ - override dataset: synclf_hard
+ - override validator: knn
+ - override /callbacks:
+ - to_sql
+  # highlight-start
+ - override /metrics:
+ - stability_nogueira
+  # highlight-end
+
+# highlight-start
+n_bootstraps: 10
+# highlight-end
+```
+
+This means we are going to generate a synthetic dataset and sample 10 bootstrap subsets from it, because `n_bootstraps=10`. Then, after the feature selection algorithm has been fitted on each bootstrap, a custom metric called `stability_nogueira` is executed. Its config can be found in the `/conf/metrics` folder and in turn refers to a class in the `benchmark.py` file; a sketch of that class follows below.
+
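+Roughly, that class hooks into the pipeline in two places: `score_ranking` runs once per bootstrap and collects the fitted ranker's boolean feature-support vector, and `score_bootstrap` then stacks those vectors into a binary matrix, computes the stability, and writes it to a `stability` table via `callbacks.on_table`. Abridged from the example's `benchmark.py`:
+
+```python
+import numpy as np
+import pandas as pd
+
+from fseval.types import AbstractMetric
+
+
+class StabilityNogueira(AbstractMetric):
+    def score_ranking(self, scores, ranker, bootstrap_state, callbacks, feature_importances=None):
+        # collect one boolean support vector per fitted bootstrap
+        self.support_matrix = getattr(self, "support_matrix", [])
+        self.support_matrix.append(ranker.feature_support_)
+
+    def score_bootstrap(self, ranker, validator, callbacks, scores, **kwargs):
+        # stack the collected support vectors into a binary matrix and score it
+        Z = np.array(self.support_matrix).astype(int)
+        scores["stability"] = getStability(Z)  # `getStability` is defined in benchmark.py
+        callbacks.on_table(pd.DataFrame([{"stability": scores["stability"]}]), "stability")
+        return scores
+```
+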
+To run the experiment, execute the following command inside the `algorithm-stability-yaml` folder:
+
+```shell
+python benchmark.py --multirun ranker="glob(*)" +callbacks.to_sql.url="sqlite:///$HOME/results.sqlite"
+```
+
+## Analyzing the results
+
+### Recap
+
+Hi! Let's analyze the results of the experiment you just ran. To **recap**:
+
+1. You just ran something similar to:
+
+ `python benchmark.py --multirun ranker="glob(*)" +callbacks.to_sql.url="sqlite:///$HOME/results.sqlite"`
+2. There should now exist a `.sqlite` file at this path: `$HOME/results.sqlite`:
+
+ ```
+ $ ls -al $HOME/results.sqlite
+ -rw-r--r-- 1 vscode vscode 20480 Sep 21 08:16 /home/vscode/results.sqlite
+ ```
+
+Let's now analyze the results! 📈
+
+### Analysis
+
+> The rest of the text assumes all code was run inside a Jupyter Notebook, in chronological order. The source notebook can be found [here](https://github.com/dunnkers/fseval/tree/master/examples/algorithm-stability-yaml/analyze-results.ipynb).
+
+First, we will install `plotly-express`, so we can make nice plots later.
+
+
+```python
+%pip install plotly-express --quiet
+```
+
+
+Figure out the SQL connection URI.
+
+
+```python
+import os
+
+con: str = "sqlite:///" + os.environ["HOME"] + "/results.sqlite"
+con
+```
+
+
+
+
+ 'sqlite:////home/vscode/results.sqlite'
+
+
+
+Read in the `experiments` table. This table contains metadata for all 'experiments' that have been run.
+
+
+```python
+import pandas as pd
+
+experiments: pd.DataFrame = pd.read_sql_table("experiments", con=con, index_col="id")
+experiments
+```
+
+| id | dataset | dataset/n | dataset/p | dataset/task | dataset/group | dataset/domain | ranker | validator | local_dir | date_created |
+|---|---|---|---|---|---|---|---|---|---|---|
+| 38vqcwus | Synclf hard | 10000 | 50 | classification | Synclf | synthetic | Boruta | k-NN | /workspaces/fseval/examples/algorithm-stabilit... | 2022-09-21 08:22:28.965510 |
+| y6bb1hcc | Synclf hard | 1000 | 50 | classification | Synclf | synthetic | Boruta | k-NN | /workspaces/fseval/examples/algorithm-stabilit... | 2022-09-21 08:22:53.609396 |
+| 3vtr13pg | Synclf hard | 1000 | 50 | classification | Synclf | synthetic | ReliefF | k-NN | /workspaces/fseval/examples/algorithm-stabilit... | 2022-09-21 08:25:09.974370 |
+
+That's looking good 🙌🏻.
+
+Now, let's read in the `stability` table. This table is filled by our custom metric, defined in the `StabilityNogueira` class in `benchmark.py`, which pushes its results to the table using `callbacks.on_table`.
+
+
+```python
+stability: pd.DataFrame = pd.read_sql_table("stability", con=con, index_col="id")
+stability
+```
+
+| id | index | stability |
+|---|---|---|
+| y6bb1hcc | 0 | 0.933546 |
+| 3vtr13pg | 0 | 1.000000 |
+
+Cool. Now let's join the experiments with their actual metrics.
+
+
+```python
+stability_experiments = stability.join(experiments)
+stability_experiments
+```
+
+| id | index | stability | dataset | dataset/n | dataset/p | dataset/task | dataset/group | dataset/domain | ranker | validator | local_dir | date_created |
+|---|---|---|---|---|---|---|---|---|---|---|---|---|
+| y6bb1hcc | 0 | 0.933546 | Synclf hard | 1000 | 50 | classification | Synclf | synthetic | Boruta | k-NN | /workspaces/fseval/examples/algorithm-stabilit... | 2022-09-21 08:22:53.609396 |
+| 3vtr13pg | 0 | 1.000000 | Synclf hard | 1000 | 50 | classification | Synclf | synthetic | ReliefF | k-NN | /workspaces/fseval/examples/algorithm-stabilit... | 2022-09-21 08:25:09.974370 |
+
+Finally, let's plot the results to get a better grasp of what's going on:
+
+
+```python
+import plotly.express as px
+
+px.bar(stability_experiments,
+ x="ranker",
+ y="stability"
+)
+```
+
+
+![feature selectors algorithm stability](/img/recipes/feature-selectors-stability-barplot.png)
+
+
+Of the two rankers, ReliefF turns out to be the more 'stable' one on this dataset: it selected exactly the same features in all 10 bootstraps that were run.
diff --git a/website/static/img/recipes/feature-selectors-stability-barplot.png b/website/static/img/recipes/feature-selectors-stability-barplot.png
new file mode 100644
index 0000000..41c70bf
Binary files /dev/null and b/website/static/img/recipes/feature-selectors-stability-barplot.png differ
diff --git a/website/static/zipped-examples/algorithm-stability-yaml.zip b/website/static/zipped-examples/algorithm-stability-yaml.zip
new file mode 100644
index 0000000..9c83ba2
Binary files /dev/null and b/website/static/zipped-examples/algorithm-stability-yaml.zip differ