remove model saving and header
lisa-sousa committed Sep 11, 2024
1 parent 1c05740 commit d6fd9eb
Showing 2 changed files with 8 additions and 22 deletions.
20 changes: 0 additions & 20 deletions xai-for-random-forest/Bio-1-Tutorial_RandomForest_Models.ipynb
@@ -3480,26 +3480,6 @@
"source": [
"Using the balanced accuracy shows us that the model performs very well on the training set but very poorly on the test set, i.e. fails to generalize to unseen data. This behaviour is a sign of model **overfitting**, where the model learns the training data too well, to the point that it captures not only the underlying patterns but also the noise and random fluctuations in the data. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's now save the model in a ``pickle`` file, such that we can load the trained model into other notebooks later on."
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [],
"source": [
"# Save the model with joblib\n",
"data_and_model = [X_train, X_test, y_train, y_test, rf]\n",
"\n",
"with open('../models/model_rf_cervicalcancer.pickle', 'wb') as handle:\n",
" pickle.dump(data_and_model, handle, protocol=pickle.HIGHEST_PROTOCOL)"
]
}
],
"metadata": {
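The notebook text kept above diagnoses overfitting by comparing the balanced accuracy on the training and test splits. A minimal sketch of that check (illustrative, not part of this commit), assuming the fitted classifier `rf` and the `X_train`/`X_test`/`y_train`/`y_test` splits defined earlier in the notebook:

```python
from sklearn.metrics import balanced_accuracy_score

# Balanced accuracy on the training split: close to 1.0 when the model overfits
train_score = balanced_accuracy_score(y_train, rf.predict(X_train))

# Balanced accuracy on the held-out test split: much lower if the model
# fails to generalize to unseen data
test_score = balanced_accuracy_score(y_test, rf.predict(X_test))

print(f"Balanced accuracy - train: {train_score:.3f}, test: {test_score:.3f}")
```

A large gap between the two scores is the overfitting signal the notebook describes.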
@@ -287,8 +287,6 @@
"id": "029ea700",
"metadata": {},
"source": [
"### Permutation Feature Importance\n",
"\n",
"Now, let's use Permutation Feature Importance to get insights into the Random Forest Classification model we loaded above. We can use the scikit-learn implementation called `permutation_importance` to get the importance values for the features in our model. For measuring the performance drop when permuting a feature, we use the standard metric of our trained model, which is, in our case, the accuracy score. Using the same score enables us to evaluate the performance drop in relation to the baseline performance. We do 40 repetitions of permutation for each feature to get more reliable results.\n",
"\n",
"*Note: this method is a **global** method, which means that it only provides explanations for the full dataset but not for individual examples.*\n",
@@ -395,6 +393,14 @@
"As mentioned before, permutation feature importance assumes feature independence. High correlation among features breaks this assumption and hence, can have an impact on the feature importance analysis."
]
},
{
"cell_type": "markdown",
"id": "3f21ac5c",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"id": "92f910a6",
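The surrounding notebook text describes computing Permutation Feature Importance with scikit-learn's `permutation_importance`, scoring with the model's standard metric (accuracy) and 40 repetitions per feature. A minimal sketch of that call (illustrative, not part of this commit), assuming the fitted model `rf` and a pandas DataFrame test split `X_test`/`y_test` loaded as in the notebook; the random seed is a hypothetical choice for reproducibility:

```python
from sklearn.inspection import permutation_importance

# Score drop when each feature is shuffled; the default scorer of a
# classifier is accuracy, and n_repeats=40 matches the notebook text.
result = permutation_importance(
    rf, X_test, y_test,
    n_repeats=40,
    random_state=0,   # hypothetical seed for reproducible shuffles
    n_jobs=-1,
)

# Report features sorted by mean importance (mean accuracy drop over 40 shuffles)
for idx in result.importances_mean.argsort()[::-1]:
    print(f"{X_test.columns[idx]}: "
          f"{result.importances_mean[idx]:.4f} +/- {result.importances_std[idx]:.4f}")
```

Because the importance is measured with the same score as the baseline, the drop can be read directly relative to the model's test accuracy; as the notebook notes, strongly correlated features violate the independence assumption and can distort these values.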
