Commit
(feat) Final model
Final model with all improvements was added, along with some fixes.
pabramber01 committed Apr 12, 2024
1 parent 8d9e50a commit 62d07f4
Showing 29 changed files with 16,484 additions and 16 deletions.
19 changes: 19 additions & 0 deletions _toc.yml
@@ -110,3 +110,22 @@ parts:
sections:
- file: f1_prediction/other_models/advanced_tuning_posvar
- file: f1_prediction/other_models/advanced_tuning_pdm

- caption: Final model
chapters:
- file: f1_prediction/final_model/model_validation
sections:
- file: f1_prediction/final_model/model_validation_posvar
- file: f1_prediction/final_model/model_validation_pdm
- file: f1_prediction/final_model/simple_tuning
sections:
- file: f1_prediction/final_model/simple_tuning_posvar
- file: f1_prediction/final_model/simple_tuning_pdm
- file: f1_prediction/final_model/feature_selection
sections:
- file: f1_prediction/final_model/feature_selection_posvar
- file: f1_prediction/final_model/feature_selection_pdm
- file: f1_prediction/final_model/advanced_tuning
sections:
- file: f1_prediction/final_model/advanced_tuning_posvar
- file: f1_prediction/final_model/advanced_tuning_pdm
2 changes: 1 addition & 1 deletion f1_prediction/adding_data/feature_selection_posvar.ipynb
@@ -62,7 +62,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"First we will do the tuning of the model that predicts the final position of each driver at a ±1 interval.\n"
+"First we will do the selection of the model that predicts the final position of each driver at a ±1 interval.\n"
]
},
{
@@ -65,7 +65,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"First we will do the tuning of the model that predicts the final position of each driver at a ±1 interval.\n"
+"First we will do the validation of the model that predicts the final position of each driver at a ±1 interval.\n"
]
},
{
@@ -65,7 +65,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"First we will do the tuning of the model that predicts the final position of each driver at a ±1 interval.\n"
+"First we will do the validation of the model that predicts the final position of each driver at a ±1 interval.\n"
]
},
{
@@ -65,7 +65,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"First we will do the tuning of the model that predicts the final position of each driver at a ±1 interval.\n"
+"First we will do the validation of the model that predicts the final position of each driver at a ±1 interval.\n"
]
},
{
@@ -65,7 +65,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"First we will do the tuning of the model that predicts the final position of each driver at a ±1 interval.\n"
+"First we will do the validation of the model that predicts the final position of each driver at a ±1 interval.\n"
]
},
{
@@ -65,7 +65,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"First we will do the tuning of the model that predicts the final position of each driver at a ±1 interval.\n"
+"First we will do the validation of the model that predicts the final position of each driver at a ±1 interval.\n"
]
},
{
5,740 changes: 5,740 additions & 0 deletions f1_prediction/assets/data/processed/final_model.csv

Large diffs are not rendered by default.

5,740 changes: 5,740 additions & 0 deletions f1_prediction/assets/data/processed/final_model_X.csv

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion f1_prediction/base_model/feature_selection_posvar.ipynb
@@ -62,7 +62,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"First we will do the tuning of the model that predicts the final position of each driver at a ±1 interval.\n"
+"First we will do the selection of the model that predicts the final position of each driver at a ±1 interval.\n"
]
},
{
32 changes: 32 additions & 0 deletions f1_prediction/final_model/advanced_tuning.ipynb
@@ -0,0 +1,32 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Advanced hyperparameter tuning\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Tuning methods\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"After feature selection, we proceed to advanced hyperparameter tuning. For this we will use optuna, an automatic hyperparameter optimization framework designed for machine learning.\n"
]
}
],
"metadata": {
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
621 changes: 621 additions & 0 deletions f1_prediction/final_model/advanced_tuning_pdm.ipynb

Large diffs are not rendered by default.

727 changes: 727 additions & 0 deletions f1_prediction/final_model/advanced_tuning_posvar.ipynb

Large diffs are not rendered by default.

112 changes: 112 additions & 0 deletions f1_prediction/final_model/feature_selection.ipynb
@@ -0,0 +1,112 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Feature selection\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Dependencies\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The dependencies used are as follows:\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.model_selection import cross_val_score"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Selection methods\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"After hyperparameter tuning, we proceed to attribute selection. Three methods will be used:\n",
"\n",
"- PermutationImportance, to see the most important attributes\n",
"- SequentialForwardSelector, for a more exhaustive search\n",
"- GeneticAlgorithms, for a more stochastic search\n",
"\n",
"The main method will be sequential forward selection (SFS), in which features are added one at a time to an initially empty candidate set until adding more features no longer improves the criterion.\n",
"\n",
"We will use PermutationImportance to corroborate the results and to see which attributes contribute the most to the performance of the model. Its results may differ from those of SFS because an attribute that is of little relevance by itself can still improve the model significantly when combined with others.\n",
"\n",
"Finally, we will use genetic algorithms to run a small stochastic search for other combinations that improve performance. This is needed because SequentialForwardSelector builds its set greedily, adding one attribute at a time, so it does not check all combinations and may miss a better one. In the genetic algorithm, each individual is a binary vector in which a 1 in position i means that attribute i is used in the evaluation and a 0 means that it is not; the fitness of an individual is its cross-validation score.\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"def fitness_func(ga_instance, individual, individual_idx):\n",
"    # Relies on notebook-level globals: X, y, estimator, tscv, scor and rank.\n",
"    res = []\n",
"\n",
"    # When rank is set, always keep the \"qid\" column.\n",
"    if rank:\n",
"        individual[X.columns.get_loc(\"qid\")] = 1\n",
"\n",
"    # Column positions switched on by the binary individual.\n",
"    idx = [i for i in range(len(individual)) if individual[i] == 1]\n",
"\n",
"    attributes = X.iloc[:, idx]\n",
"    objective = y\n",
"\n",
"    if not attributes.empty:\n",
"        res.extend(\n",
"            cross_val_score(\n",
"                estimator=estimator,\n",
"                X=attributes,\n",
"                y=objective,\n",
"                cv=tscv,\n",
"                scoring=scor,\n",
"                n_jobs=-1,\n",
"            )\n",
"        )\n",
"\n",
"    # An individual that selects no attributes gets a large penalty.\n",
"    avg_cross_val_score = sum(res) / len(res) if not attributes.empty else -10000\n",
"    return avg_cross_val_score"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.11"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
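
The greedy behaviour that feature_selection.ipynb attributes to sequential forward selection can be sketched in plain Python. This is an illustration under stated assumptions, not the selector the notebooks call: `forward_select`, `toy_score`, the feature names and their gains are all hypothetical, and `toy_score` stands in for the mean cross-validation score.

```python
from typing import Callable, Sequence, Tuple


def forward_select(
    features: Sequence[str],
    score_fn: Callable[[Tuple[str, ...]], float],
) -> Tuple[Tuple[str, ...], float]:
    """Greedy sequential forward selection (hypothetical helper).

    Starts from an empty set and repeatedly adds the single feature that
    raises score_fn the most; stops when no addition improves the score.
    """
    selected: Tuple[str, ...] = ()
    best_score = float("-inf")
    remaining = list(features)

    while remaining:
        # Score every candidate set obtained by adding one more feature.
        scored = [(score_fn(selected + (f,)), f) for f in remaining]
        cand_score, cand_feature = max(scored)
        if cand_score <= best_score:
            break  # no single addition helps any more
        selected += (cand_feature,)
        best_score = cand_score
        remaining.remove(cand_feature)

    return selected, best_score


# Made-up per-feature gains; every selected feature also costs a small penalty.
GAINS = {"grid": 0.50, "driver_form": 0.30, "weather": 0.02, "noise": -0.10}


def toy_score(subset: Tuple[str, ...]) -> float:
    return sum(GAINS[f] for f in subset) - 0.05 * len(subset)


selected, score = forward_select(list(GAINS), toy_score)
print(selected, round(score, 2))
```

Because each step only looks one feature ahead, a pair of features that helps only in combination would never be added, which is the gap the genetic-algorithm search is meant to probe.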
566 changes: 566 additions & 0 deletions f1_prediction/final_model/feature_selection_pdm.ipynb

Large diffs are not rendered by default.

598 changes: 598 additions & 0 deletions f1_prediction/final_model/feature_selection_posvar.ipynb

Large diffs are not rendered by default.
