Add recommendation score (#4156)
* Start adding helper functions

* Finish adding helper functions

* Add working tests for helper functions

* Add get_recommendation_score to AutoMLSearch

* Add ranking score to results generation

* Only normalize metrics if not bounded like percentage

* Add check and test for imbalanced data (not working yet)

* Add argument verification and tests to automl init

* Add ability to include ranking-only metrics in recommendation score

* Test fixes

* A few more test fixes

* Codecov fixes and release notes

* Augment get_recommendation_scores to be more user-friendly

* Add recommendation score to docs

* PR comments

* Lint

* PR comments

* Add custom_objectives dict

* Refactor prioritized_objective to be a custom weight

* Just a bit of cleanup

* Remove prioritized_weight argument

* Test fixes after removing priority weight argument

* MAE should NOT be bounded_like_percentage YIKES

---------

Co-authored-by: chukarsten <64713315+chukarsten@users.noreply.github.com>
eccabay and chukarsten authored May 9, 2023
1 parent b530abd commit d777c6c
Showing 10 changed files with 902 additions and 69 deletions.
4 changes: 4 additions & 0 deletions docs/source/api_index.rst
@@ -366,13 +366,17 @@ Objective Utils
:nosignatures:

evalml.objectives.get_all_objective_names
evalml.objectives.get_default_recommendation_objectives
evalml.objectives.get_core_objectives
evalml.objectives.get_core_objective_names
evalml.objectives.get_non_core_objectives
evalml.objectives.get_objective
evalml.objectives.get_optimization_objectives
evalml.objectives.get_ranking_objectives
evalml.objectives.normalize_objectives
evalml.objectives.organize_objectives
evalml.objectives.ranking_only_objectives
evalml.objectives.recommendation_score


Problem Types
1 change: 1 addition & 0 deletions docs/source/release_notes.rst
@@ -2,6 +2,7 @@ Release Notes
-------------
**Future Releases**
* Enhancements
* Added optional ``recommendation_score`` to rank pipelines during AutoMLSearch :pr:`4156`
    * Added BytesIO support to ``PipelineBase.load()`` :pr:`4179`
* Fixes
* Capped numpy at <=1.23.5 as a temporary measure for SHAP :pr:`4172`
80 changes: 79 additions & 1 deletion docs/source/user_guide/automl.ipynb
@@ -396,6 +396,84 @@
"automl.rankings"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Recommendation Score\n",
"\n",
"If you would like a more robust evaluation of your models' performance, EvalML additionally provides a recommendation score alongside the selected objective. The recommendation score is a weighted average of several default objectives for your problem type, normalized and scaled so that the final score can be interpreted as a percentage from 0 to 100. This weighted score gives a more holistic view of model performance, prioritizing generalizability over any single objective that may not fully serve your use case."
]
},
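The weighted average described in the cell above can be sketched in plain Python. This is a hypothetical illustration of the idea only, not EvalML's actual implementation; the function name, the objective names, and the equal default weights are all assumptions:

```python
def recommendation_score_sketch(objective_scores, weights=None):
    """Combine normalized objective scores (each in [0, 1], higher is
    better) into a single 0-100 score via a weighted average."""
    if weights is None:
        # With no custom weights, every objective contributes equally.
        weights = {name: 1.0 for name in objective_scores}
    total = sum(weights[name] for name in objective_scores)
    weighted = sum(score * weights[name] for name, score in objective_scores.items())
    return 100 * weighted / total

# Three already-normalized objective scores, equally weighted:
print(recommendation_score_sketch({"AUC": 0.92, "F1": 0.81, "Log Loss": 0.77}))
```

Because each input is already normalized so that higher is better, the weighted mean stays in [0, 1] and scaling by 100 yields the percentage-style score described above.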
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"automl.get_recommendation_scores(use_pipeline_names=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"automl.get_recommendation_scores(priority=\"F1\", use_pipeline_names=True)"
]
},
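Per the commit message above ("Refactor prioritized_objective to be a custom weight"), passing `priority` presumably amounts to giving one objective extra weight in the average. A sketch of that idea, with hypothetical helper names, weight value, and scores:

```python
def weighted_score(objective_scores, weights):
    """Weighted average of normalized scores, scaled to 0-100."""
    total = sum(weights[name] for name in objective_scores)
    return 100 * sum(s * weights[name] for name, s in objective_scores.items()) / total

def score_with_priority(objective_scores, priority, priority_weight=2.0):
    """Upweight a single prioritized objective before averaging."""
    weights = {name: 1.0 for name in objective_scores}
    weights[priority] = priority_weight  # the prioritized objective counts extra
    return weighted_score(objective_scores, weights)

# Prioritizing a weak F1 pulls the score below the plain average:
print(score_with_priority({"AUC": 0.9, "F1": 0.6, "Balanced Accuracy": 0.8}, "F1"))
```

With equal weights the three scores average to about 76.7; doubling the weight on the weak F1 drops the result to 72.5, which is how a priority objective can reorder otherwise similar pipelines.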
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see what objectives are included in the recommendation score, you can use:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"evalml.objectives.get_default_recommendation_objectives(\"binary\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you would like to automatically rank your pipelines by this recommendation score, you can set `use_recommendation=True` when initializing `AutoMLSearch`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"automl_recommendation = evalml.automl.AutoMLSearch(\n",
" X_train=X_train,\n",
" y_train=y_train,\n",
" X_holdout=X_holdout,\n",
" y_holdout=y_holdout,\n",
" problem_type=\"binary\",\n",
" use_recommendation=True,\n",
")\n",
"automl_recommendation.search(interactive_plot=False)\n",
"\n",
"automl_recommendation.rankings[\n",
" [\n",
" \"id\",\n",
" \"pipeline_name\",\n",
" \"search_order\",\n",
" \"recommendation_score\",\n",
" \"holdout_score\",\n",
" \"mean_cv_score\",\n",
" ]\n",
"]"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -1092,7 +1170,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.8.5 ('evalml')",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},