Commit

(fix) Future attributes
Some attributes, such as driverWins, constructorWins and driverPosI, were
adapted so they don't leak information about the future.
pabramber01 committed Apr 2, 2024
1 parent edb2b9a commit f00c713
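
The leak described in the commit message comes from running totals that already include the result of the race being predicted. A minimal sketch of the idea, assuming a toy frame in race order (the `raceRound` column and the sample values are hypothetical; `raceYear`, `driverRef`, `driverIsPodium` and `driverPodiums` are the names used in the diff): the cumulative sum is shifted by one race inside each (season, driver) group, so every row only sees earlier races. This uses `groupby(...).shift`, which is not what the commit does; the commit reaches the same shift with explicit loops.

```python
import pandas as pd

# Toy data (hypothetical): two drivers, three races in one season, in race order.
df = pd.DataFrame(
    {
        "raceYear": [2023] * 6,
        "raceRound": [1, 1, 2, 2, 3, 3],
        "driverRef": ["alonso", "perez"] * 3,
        "driverIsPodium": [1, 1, 0, 1, 1, 0],
    }
)

keys = ["raceYear", "driverRef"]

# Running total that still includes the current race (this leaks the outcome).
df["driverPodiums"] = df.groupby(keys)["driverIsPodium"].cumsum()

# Shift by one race within each (season, driver) group; the first race gets 0.
df["driverPodiums"] = (
    df.groupby(keys)["driverPodiums"].shift(1).fillna(0).astype(int)
)

print(df[["raceRound", "driverRef", "driverPodiums"]])
```

After the shift, the value in each row counts only podiums from earlier rounds of the same season.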
Showing 42 changed files with 73,868 additions and 73,759 deletions.
51 changes: 49 additions & 2 deletions f1_prediction/adding_data/model_validation_add.ipynb
@@ -29,6 +29,7 @@
"source": [
"from sklearn.preprocessing import LabelEncoder, RobustScaler\n",
"\n",
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"pd.set_option(\"display.max_columns\", None)\n",
@@ -119,7 +120,20 @@
" lambda x: x[\"driverRef\"] in x[\"driverRefPodium\"], axis=1\n",
")\n",
"df[\"driverPodiums\"] = df.groupby([\"raceYear\", \"driverRef\"])[\"driverIsPodium\"].cumsum()\n",
"df = df.drop([\"driverRefPodium\", \"driverIsPodium\"], axis=1)"
"df = df.drop([\"driverRefPodium\", \"driverIsPodium\"], axis=1)\n",
"\n",
"years = df[\"raceYear\"].drop_duplicates().to_numpy()\n",
"drivers = df[\"driverRef\"].drop_duplicates().to_numpy()\n",
"\n",
"for year in years:\n",
" for driver in drivers:\n",
" mask = (df[\"raceYear\"] == year) & (df[\"driverRef\"] == driver)\n",
" races = df.loc[mask, \"driverPodiums\"].iloc[:-1]\n",
" races.loc[-1] = 0\n",
" races.index += 1\n",
" races.sort_index(inplace=True)\n",
" races = races.to_numpy()\n",
" df.loc[mask, \"driverPodiums\"] = races"
]
},
{
@@ -143,7 +157,40 @@
" df[f\"driverPos{i}\"] = df.groupby([\"raceYear\", \"driverRef\"])[\n",
" f\"driverIsPos{i}\"\n",
" ].cumsum()\n",
" df = df.drop([f\"driverRefPos{i}\", f\"driverIsPos{i}\"], axis=1)"
" df = df.drop([f\"driverRefPos{i}\", f\"driverIsPos{i}\"], axis=1)\n",
"\n",
"years = df[\"raceYear\"].drop_duplicates().to_numpy()\n",
"drivers = df[\"driverRef\"].drop_duplicates().to_numpy()\n",
"\n",
"features = [\n",
" \"driverPos2\",\n",
" \"driverPos3\",\n",
" \"driverPos4\",\n",
" \"driverPos5\",\n",
" \"driverPos6\",\n",
" \"driverPos7\",\n",
" \"driverPos8\",\n",
" \"driverPos9\",\n",
" \"driverPos10\",\n",
" \"driverPos11\",\n",
" \"driverPos12\",\n",
" \"driverPos13\",\n",
" \"driverPos14\",\n",
" \"driverPos15\",\n",
" \"driverPos16\",\n",
" \"driverPos17\",\n",
" \"driverPos18\",\n",
"]\n",
"\n",
"for year in years:\n",
" for driver in drivers:\n",
" mask = (df[\"raceYear\"] == year) & (df[\"driverRef\"] == driver)\n",
" races = df.loc[mask, features].iloc[:-1]\n",
" races.loc[-1] = np.zeros(17, dtype=int)\n",
" races.index += 1\n",
" races.sort_index(inplace=True)\n",
" races = races.to_numpy()\n",
" df.loc[mask, features] = races"
]
},
{
Expand Down
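
The same one-race shift that the hunk above applies to the `driverPos2` to `driverPos18` counters can be sketched without the nested loops. This is an illustration under assumptions, not the commit's implementation: the data is random toy data, the rows are assumed to be in race order, and `shifted` and `df_fixed` are hypothetical names. One practical difference is that `groupby` only visits (season, driver) pairs that actually occur in the frame.

```python
import numpy as np
import pandas as pd

# Column names as listed in the diff: driverPos2 .. driverPos18.
features = [f"driverPos{i}" for i in range(2, 19)]
keys = ["raceYear", "driverRef"]

# Toy data (hypothetical): two seasons, two drivers, four races each.
rng = np.random.default_rng(0)
base = pd.DataFrame(
    {
        "raceYear": np.repeat([2022, 2023], 8),
        "driverRef": np.tile(["alonso", "perez"], 8),
    }
)
flags = pd.DataFrame(
    rng.integers(0, 2, size=(len(base), len(features))), columns=features
)

# Running totals per (season, driver); these still include the current race.
totals = flags.groupby([base["raceYear"], base["driverRef"]]).cumsum()
df = pd.concat([base, totals], axis=1)

# Shift all counters by one race within each group; first races become 0.
shifted = df.groupby(keys)[features].shift(1).fillna(0).astype(int)

df_fixed = df.copy()
df_fixed[features] = shifted
```

Each row of `df_fixed` then holds the counters as they stood before that race started.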
214 changes: 113 additions & 101 deletions f1_prediction/adding_data/model_validation_add_pdm.ipynb

Large diffs are not rendered by default.

510 changes: 259 additions & 251 deletions f1_prediction/adding_data/model_validation_add_posvar.ipynb

Large diffs are not rendered by default.

236 changes: 114 additions & 122 deletions f1_prediction/adding_data/model_validation_all_pdm.ipynb

Large diffs are not rendered by default.

510 changes: 259 additions & 251 deletions f1_prediction/adding_data/model_validation_all_posvar.ipynb

Large diffs are not rendered by default.

196 changes: 90 additions & 106 deletions f1_prediction/adding_data/model_validation_cir_pdm.ipynb

Large diffs are not rendered by default.

490 changes: 249 additions & 241 deletions f1_prediction/adding_data/model_validation_cir_posvar.ipynb

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion f1_prediction/adding_data/model_validation_drv.ipynb
@@ -342,7 +342,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Encoding and normalization"
"## Encoding and normalization\n"
]
},
{
Expand Down
206 changes: 100 additions & 106 deletions f1_prediction/adding_data/model_validation_drv_pdm.ipynb

Large diffs are not rendered by default.

474 changes: 241 additions & 233 deletions f1_prediction/adding_data/model_validation_drv_posvar.ipynb

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion f1_prediction/adding_data/model_validation_wea.ipynb
@@ -211,7 +211,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Encoding and normalization"
"## Encoding and normalization\n"
]
},
{
Expand Down
190 changes: 92 additions & 98 deletions f1_prediction/adding_data/model_validation_wea_pdm.ipynb

Large diffs are not rendered by default.

487 changes: 247 additions & 240 deletions f1_prediction/adding_data/model_validation_wea_posvar.ipynb

Large diffs are not rendered by default.

13,102 changes: 6,551 additions & 6,551 deletions f1_prediction/assets/data/processed/adding_data.csv

Large diffs are not rendered by default.

14,470 changes: 7,235 additions & 7,235 deletions f1_prediction/assets/data/processed/adding_data_X.csv

Large diffs are not rendered by default.

13,102 changes: 6,551 additions & 6,551 deletions f1_prediction/assets/data/processed/additional.csv

Large diffs are not rendered by default.

14,470 changes: 7,235 additions & 7,235 deletions f1_prediction/assets/data/processed/additional_X.csv

Large diffs are not rendered by default.

4,104 changes: 2,052 additions & 2,052 deletions f1_prediction/assets/data/processed/base_model.csv

Large diffs are not rendered by default.

14,470 changes: 7,235 additions & 7,235 deletions f1_prediction/assets/data/processed/base_model_X.csv

Large diffs are not rendered by default.

4,104 changes: 2,052 additions & 2,052 deletions f1_prediction/assets/data/processed/circuit.csv

Large diffs are not rendered by default.

14,470 changes: 7,235 additions & 7,235 deletions f1_prediction/assets/data/processed/circuit_X.csv

Large diffs are not rendered by default.

13,102 changes: 6,551 additions & 6,551 deletions f1_prediction/assets/data/processed/driver_ratings.csv

Large diffs are not rendered by default.

14,470 changes: 7,235 additions & 7,235 deletions f1_prediction/assets/data/processed/driver_ratings_X.csv

Large diffs are not rendered by default.

4,104 changes: 2,052 additions & 2,052 deletions f1_prediction/assets/data/processed/missing_values.csv

Large diffs are not rendered by default.

4,104 changes: 2,052 additions & 2,052 deletions f1_prediction/assets/data/processed/weather.csv

Large diffs are not rendered by default.

14,470 changes: 7,235 additions & 7,235 deletions f1_prediction/assets/data/processed/weather_X.csv

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions f1_prediction/base_model/advanced_tuning.ipynb
@@ -4,14 +4,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Advanced hyperparamenter tuning"
"# Advanced hyperparamenter tuning\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Tuning methods"
"## Tuning methods\n"
]
},
{
Expand Down
59 changes: 26 additions & 33 deletions f1_prediction/base_model/advanced_tuning_pdm.ipynb

Large diffs are not rendered by default.

54 changes: 28 additions & 26 deletions f1_prediction/base_model/advanced_tuning_posvar.ipynb

Large diffs are not rendered by default.

76 changes: 40 additions & 36 deletions f1_prediction/base_model/feature_selection_pdm.ipynb

Large diffs are not rendered by default.

85 changes: 42 additions & 43 deletions f1_prediction/base_model/feature_selection_posvar.ipynb

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion f1_prediction/base_model/model_validation.ipynb

Large diffs are not rendered by default.

94 changes: 47 additions & 47 deletions f1_prediction/base_model/model_validation_pdm.ipynb

Large diffs are not rendered by default.

338 changes: 169 additions & 169 deletions f1_prediction/base_model/model_validation_pos.ipynb

Large diffs are not rendered by default.

162 changes: 81 additions & 81 deletions f1_prediction/base_model/model_validation_posfxd.ipynb

Large diffs are not rendered by default.

394 changes: 197 additions & 197 deletions f1_prediction/base_model/model_validation_posvar.ipynb

Large diffs are not rendered by default.

94 changes: 47 additions & 47 deletions f1_prediction/base_model/model_validation_wnr.ipynb

Large diffs are not rendered by default.

20 changes: 10 additions & 10 deletions f1_prediction/base_model/simple_tuning_pdm.ipynb
@@ -117,7 +117,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"KNeighborsClassifier: 0.726096099171753 with {'metric': 'cosine', 'n_neighbors': 5,\n",
"KNeighborsClassifier: 0.7215047203351547 with {'metric': 'cosine', 'n_neighbors': 5,\n",
"\t'weights': 'distance'}\n"
]
}
@@ -155,8 +155,8 @@
"name": "stdout",
"output_type": "stream",
"text": [
"DecisionTreeClassifier: 0.784020180449896 with {'criterion': 'gini', 'max_depth': 3,\n",
"\t'splitter': 'best'}\n"
"DecisionTreeClassifier: 0.7919333040678084 with {'criterion': 'log_loss', 'max_depth':\n",
"\t4, 'splitter': 'best'}\n"
]
}
],
@@ -193,8 +193,8 @@
"name": "stdout",
"output_type": "stream",
"text": [
"RandomForestClassifier: 0.7735981616295589 with {'criterion': 'gini', 'max_depth': 10,\n",
"\t'n_estimators': 50}\n"
"RandomForestClassifier: 0.7647004284290003 with {'criterion': 'gini', 'max_depth': 10,\n",
"\t'n_estimators': 200}\n"
]
}
],
@@ -231,7 +231,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"MLPClassifier: 0.8008305374733512 with {'activation': 'logistic', 'hidden_layer_sizes':\n",
"MLPClassifier: 0.797863262800811 with {'activation': 'logistic', 'hidden_layer_sizes':\n",
"\t(50, 20, 5)}\n"
]
}
@@ -265,10 +265,10 @@
"source": [
"After viewing several runs, the hyperparameters for each algorithm are as follows\n",
"\n",
"- KNeighborsClassifier: 0.7263922435898924 with {'metric': 'cosine', 'n_neighbors': 5, 'weights': 'distance'}\n",
"- DecisionTreeClassifier: 0.7892538259292045 with {'criterion': 'gini', 'max_depth': 3, 'splitter': 'best'}\n",
"- RandomForestClassifier: 0.7778548164328474 with {'criterion': 'gini', 'max_depth': 10, 'n_estimators': 200}\n",
"- MLPClassifier: 0.799283305036914 with {'activation': 'logistic', 'hidden_layer_sizes': (50, 20, 5)}\n"
"- KNeighborsClassifier: 0.7215047203351547 with {'metric': 'cosine', 'n_neighbors': 5, 'weights': 'distance'}\n",
"- DecisionTreeClassifier: 0.7919333040678084 with {'criterion': 'log_loss', 'max_depth': 4, 'splitter': 'best'}\n",
"- RandomForestClassifier: 0.7647004284290003 with {'criterion': 'gini', 'max_depth': 10, 'n_estimators': 200}\n",
"- MLPClassifier: 0.797863262800811 with {'activation': 'logistic', 'hidden_layer_sizes': (50, 20, 5)}\n"
]
}
],
Expand Down
27 changes: 12 additions & 15 deletions f1_prediction/base_model/simple_tuning_posvar.ipynb
@@ -39,10 +39,7 @@
"sys.path.append(\"..\")\n",
"\n",
"from utils.custom_cvs import VariableTimeSeriesSplit\n",
"from utils.custom_scorers import (\n",
" balanced_accuracy_1interval_score,\n",
" mean_absolute_1interval_error,\n",
")\n",
"from utils.custom_scorers import balanced_accuracy_1interval_score\n",
"\n",
"import textwrap\n",
"import numpy as np\n",
@@ -114,8 +111,8 @@
"name": "stdout",
"output_type": "stream",
"text": [
"KNeighborsClassifier: 0.28718245979609613 with {'metric': 'manhattan', 'n_neighbors':\n",
"\t101, 'weights': 'distance'}\n"
"KNeighborsClassifier: 0.286227602363966 with {'metric': 'manhattan', 'n_neighbors': 101,\n",
"\t'weights': 'uniform'}\n"
]
}
],
@@ -152,8 +149,8 @@
"name": "stdout",
"output_type": "stream",
"text": [
"DecisionTreeClassifier: 0.3389971896790078 with {'criterion': 'entropy', 'max_depth': 5,\n",
"\t'splitter': 'random'}\n"
"DecisionTreeClassifier: 0.3423765333992606 with {'criterion': 'entropy', 'max_depth': 4,\n",
"\t'splitter': 'best'}\n"
]
}
],
@@ -190,8 +187,8 @@
"name": "stdout",
"output_type": "stream",
"text": [
"RandomForestClassifier: 0.34932197852652397 with {'criterion': 'gini', 'max_depth': 5,\n",
"\t'n_estimators': 50}\n"
"RandomForestClassifier: 0.3478434171047807 with {'criterion': 'log_loss', 'max_depth':\n",
"\t5, 'n_estimators': 200}\n"
]
}
],
@@ -228,7 +225,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"MLPClassifier: 0.35292989628216903 with {'activation': 'logistic', 'hidden_layer_sizes':\n",
"MLPClassifier: 0.3517574344847072 with {'activation': 'logistic', 'hidden_layer_sizes':\n",
"\t(50, 25)}\n"
]
}
@@ -262,10 +259,10 @@
"source": [
"After viewing several runs, the hyperparameters for each algorithm are as follows\n",
"\n",
"- KNeighborsClassifier: 0.2863537094218912 with {'metric': 'manhattan', 'n_neighbors': 101, 'weights': 'uniform'}\n",
"- DecisionTreeClassifier: 0.34112213333804237 with {'criterion': 'gini', 'max_depth': 4, 'splitter': 'best'}\n",
"- RandomForestClassifier: 0.3479073809187445 with {'criterion': 'gini', 'max_depth': 5, 'n_estimators': 200}\n",
"- MLPClassifier: 0.3586966547762002 with {'activation': 'logistic', 'hidden_layer_sizes': (50, 25)}\n"
"- KNeighborsClassifier: 0.286227602363966 with {'metric': 'manhattan', 'n_neighbors': 101, 'weights': 'uniform'}\n",
"- DecisionTreeClassifier: 0.3413275823503096 with {'criterion': 'entropy', 'max_depth': 4, 'splitter': 'best'}\n",
"- RandomForestClassifier: 0.34519032104259373 with {'criterion': 'gini', 'max_depth': 5, 'n_estimators': 200}\n",
"- MLPClassifier: 0.3515771497021497 with {'activation': 'logistic', 'hidden_layer_sizes': (50, 25)}\n"
]
}
],
Expand Down
8 changes: 4 additions & 4 deletions f1_prediction/preprocessing/data_selection.ipynb
@@ -4,14 +4,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Data selection"
"# Data selection\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Dependencies"
"## Dependencies\n"
]
},
{
@@ -41,7 +41,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Merging data"
"## Merging data\n"
]
},
{
@@ -202,7 +202,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Renaming data"
"## Renaming data\n"
]
},
{
Expand Down
6 changes: 3 additions & 3 deletions f1_prediction/preprocessing/data_transformation.ipynb
@@ -4,14 +4,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Data transformation"
"# Data transformation\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Dependencies"
"## Dependencies\n"
]
},
{
@@ -43,7 +43,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Datatype conversion"
"## Datatype conversion\n"
]
},
{
Expand Down
0 comments on commit f00c713