Implement optuna hyperparameter optimization #278
Closed
Commits (changes shown from 9 of 11):
- 1bbd7f1 (remtav): config/hydra/default.yaml: implement optuna hyperparameter optimizati…
- 9105692 (remtav): critical log when missing key for metrics.py
- d200aac (remtav): Merge branch 'develop' into 243-hyperparam-opt
- 7112c4a (remtav): remove models: (#289)
- 8831708 (remtav): Refactor and encapsulate deeplabv3 with nir injection from model_choi…
- 10ddc7a (remtav): - add link to optuna-with-hydra documentation
- af72927 (remtav): bugfix for model name with optuna
- 1c5732c (remtav): another bugfix for model name with optuna
- ec70859 (remtav): Merge branch 'develop' into 243-hyperparam-opt
- c882445 (remtav): bugfix for github actions
- 72bdd6b (remtav): model_choice.py: remove if __name__ == '__main__'. Already in unit tests
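Taken together, the commits replace the standalone hyperopt template (the deleted file below) with Optuna-driven hyperparameter search configured through Hydra. As a rough orientation only, here is a hedged sketch of what the old search space looks like when expressed with the plain Optuna API; it is not the PR's actual wiring (which, per the commit messages, goes through config/hydra/default.yaml and the optuna-with-hydra documentation), and `run_training` is a made-up placeholder for an entry point that returns the test IoU.

```python
# Hypothetical sketch only: the removed hyperopt search space re-expressed with plain Optuna.
# `run_training` is a placeholder, not a function from this repository.
import optuna


def run_training(hparams: dict) -> float:
    """Stand-in for a real training run; would return the metric to maximize (e.g. tst_iou)."""
    return 0.0


def objective(trial: optuna.Trial) -> float:
    hparams = {
        'model_name': trial.suggest_categorical('model_name', ['unet_pretrained', 'deeplabv3_resnet101']),
        'loss_fn': trial.suggest_categorical('loss_fn', ['CrossEntropy', 'Lovasz', 'Duo']),
        'optimizer': trial.suggest_categorical('optimizer', ['adam', 'adabound']),
        'learning_rate': trial.suggest_float('learning_rate', 1e-7, 0.1, log=True),
    }
    return run_training(hparams)


study = optuna.create_study(direction='maximize', sampler=optuna.samplers.TPESampler())
study.optimize(objective, n_trials=20)
print(study.best_params)
```

The `return tst_iou` change in the training script (last hunk below) is what lets a sweeper consume that metric directly instead of scraping it back out of MLflow, as the deleted hyperopt template did.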
New file (AdaBound optimizer config):
@@ -0,0 +1,6 @@
# @package _global_
optimizer:
  optimizer_name: 'adabound'
  class_name: utils.adabound.AdaBound
  params:
    lr: ${training.lr}
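The `class_name`/`params` pair above is a dotted-path-plus-kwargs pattern. Purely as an illustration of how such an entry is commonly consumed (this is not the project's actual factory code; the `build_optimizer` helper is hypothetical), the class can be imported from its dotted path and instantiated with the interpolated params:

```python
# Illustrative sketch: resolve a 'package.module.Class' string and instantiate it with kwargs.
# Mirrors the YAML keys above; `build_optimizer` is a hypothetical helper, not project code.
import importlib
from typing import Any, Dict, Iterable


def build_optimizer(optimizer_cfg: Dict[str, Any], model_params: Iterable) -> Any:
    module_path, class_name = optimizer_cfg['class_name'].rsplit('.', 1)  # 'utils.adabound', 'AdaBound'
    optimizer_cls = getattr(importlib.import_module(module_path), class_name)
    # 'params' holds keyword arguments; lr would already be resolved from ${training.lr} by Hydra.
    return optimizer_cls(model_params, **optimizer_cfg.get('params', {}))
```

Hydra also offers this natively via `hydra.utils.instantiate` and a `_target_` key; the config shown here keeps explicit `class_name`/`params` keys instead.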
Deleted file (hyperopt template for GDL):
@@ -1,166 +0,0 @@
"""Hyperparamater optimization for GDL using hyperopt | ||
|
||
This is a template for using hyperopt with GDL. The my_space variable currently needs to | ||
be modified here, as well as GDL config modification logic within the objective_with_args | ||
function. | ||
|
||
""" | ||
|
||
import argparse | ||
from pathlib import Path | ||
import pickle | ||
from functools import partial | ||
import pprint | ||
import numpy as np | ||
|
||
from ruamel_yaml import YAML | ||
import mlflow | ||
import torch | ||
# ToDo: Add hyperopt to GDL requirements | ||
from hyperopt import fmin, tpe, hp, Trials, STATUS_OK | ||
|
||
from train_segmentation import main as train_main | ||
|
||
# This is the hyperparameter space to explore | ||
my_space = {'model_name': hp.choice('model_name', ['unet_pretrained', 'deeplabv3_resnet101']), | ||
'loss_fn': hp.choice('loss_fn', ['CrossEntropy', 'Lovasz', 'Duo']), | ||
'optimizer': hp.choice('optimizer', ['adam', 'adabound']), | ||
'learning_rate': hp.loguniform('learning_rate', np.log(1e-7), np.log(0.1))} | ||
|
||
|
||
def read_parameters(param_file): | ||
"""Read and return parameters in .yaml file | ||
Args: | ||
param_file: Full file path of the parameters file | ||
Returns: | ||
YAML (Ruamel) CommentedMap dict-like object | ||
""" | ||
yaml = YAML() | ||
with open(param_file) as yamlfile: | ||
params = yaml.load(yamlfile) | ||
return params | ||
|
||
|
||
def get_latest_mlrun(params): | ||
"""Get latest mlflow run | ||
|
||
:param params: gdl parameters dictionary | ||
:return: mlflow run object | ||
""" | ||
|
||
tracking_uri = params['global']['mlflow_uri'] | ||
mlflow.set_tracking_uri(tracking_uri) | ||
mlexp = mlflow.get_experiment_by_name(params['global']['mlflow_experiment_name']) | ||
exp_id = mlexp.experiment_id | ||
try: | ||
run_ids = ([x.run_id for x in mlflow.list_run_infos( | ||
exp_id, max_results=1, order_by=["tag.release DESC"])]) | ||
except AttributeError: | ||
mlflow_client = mlflow.tracking.MlflowClient(tracking_uri=tracking_uri) | ||
run_ids = [x.run_id for x in mlflow_client.list_run_infos(exp_id, run_view_type=3)[0:1]] | ||
mlrun = mlflow.get_run(run_ids[0]) | ||
return mlrun | ||
|
||
|
||
def objective_with_args(hparams, params, config_path): | ||
"""Objective function for hyperopt | ||
|
||
This function edits the GDL parameters and runs a training. | ||
|
||
:param hparams: arguments provided by hyperopt selection from hyperparameter space | ||
:param params: gdl parameters dictionary | ||
:param config_path: path to gdl configuration file | ||
:return: loss dictionary for hyperopt | ||
""" | ||
|
||
# ToDo: This is dependent on the specific structure of the GDL config file | ||
params['global']['model_name'] = hparams['model_name'] | ||
# params['training']['target_size'] = hparams['target_size'] | ||
params['training']['loss_fn '] = hparams['loss_fn'] | ||
params['training']['optimizer'] = hparams['optimizer'] | ||
params['training']['learning_rate'] = hparams['learning_rate'] | ||
|
||
try: | ||
mlrun = get_latest_mlrun(params) | ||
run_name_split = mlrun.data.tags['mlflow.runName'].split('_') | ||
params['global']['mlflow_run_name'] = run_name_split[0] + f'_{int(run_name_split[1]) + 1}' | ||
except: | ||
pass | ||
train_main(params, config_path) | ||
torch.cuda.empty_cache() | ||
|
||
mlflow.end_run() | ||
mlrun = get_latest_mlrun(params) | ||
|
||
# ToDo: Probably need some cleanup to avoid accumulating results on disk | ||
|
||
# ToDo: This loss should be configurable | ||
return {'loss': -mlrun.data.metrics['tst_iou'], 'status': STATUS_OK} | ||
|
||
|
||
def trials_to_csv(trials, csv_pth): | ||
"""hyperopt trials to CSV | ||
|
||
:param trials: hyperopt trials object | ||
""" | ||
|
||
params = sorted(list(trials.vals.keys())) | ||
csv_str = '' | ||
for param in params: | ||
csv_str += f'{param}, ' | ||
csv_str = csv_str + 'loss' + '\n' | ||
|
||
for i in range(len(trials.trials)): | ||
for param in params: | ||
if my_space[param].name == 'switch': | ||
csv_str += f'{my_space[param].pos_args[trials.vals[param][i] + 1].obj}, ' | ||
else: | ||
csv_str += f'{trials.vals[param][i]}, ' | ||
csv_str = csv_str + f'{trials.results[i]["loss"]}' + '\n' | ||
|
||
# ToDo: Customize where the csv output is | ||
with open(csv_pth, 'w') as csv_obj: | ||
csv_obj.write(csv_str) | ||
|
||
|
||
def main(params, config_path): | ||
# ToDo: Customize where the trials file is | ||
# ToDo: Customize where the trials file is | ||
root_path = Path(params['global']['assets_path']) | ||
pkl_file = root_path.joinpath('hyperopt_trials.pkl') | ||
csv_file = root_path.joinpath('hyperopt_results.csv') | ||
if pkl_file.is_file(): | ||
trials = pickle.load(open(pkl_file, "rb")) | ||
else: | ||
trials = Trials() | ||
|
||
objective = partial(objective_with_args, params=params, config_path=config_path) | ||
|
||
n = 0 | ||
while n < params['global']['hyperopt_runs']: | ||
best = fmin(objective, | ||
space=my_space, | ||
algo=tpe.suggest, | ||
trials=trials, | ||
max_evals=n + params['global']['hyperopt_delta']) | ||
n += params['global']['hyperopt_delta'] | ||
pickle.dump(trials, open(pkl_file, "wb")) | ||
|
||
# ToDo: Cleanup the output | ||
pprint.pprint(trials.vals) | ||
pprint.pprint(trials.results) | ||
for key, val in best.items(): | ||
if my_space[key].name == 'switch': | ||
best[key] = my_space[key].pos_args[val + 1].obj | ||
pprint.pprint(best) | ||
print(trials.best_trial['result']) | ||
trials_to_csv(trials, csv_file) | ||
|
||
|
||
if __name__ == '__main__': | ||
parser = argparse.ArgumentParser(description='Geo Deep Learning hyperopt') | ||
parser.add_argument('param_file', type=str, help='Path of gdl config file') | ||
args = parser.parse_args() | ||
gdl_params = read_parameters(args.param_file) | ||
gdl_params['self'] = {'config_file': args.param_file} | ||
main(gdl_params, Path(args.param_file)) | ||
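The deleted template persisted progress by pickling the hyperopt `Trials` object and calling `fmin` repeatedly with a growing `max_evals`. For comparison only (this is not part of the PR's diff, and the study name and SQLite path below are made up), the usual Optuna counterpart is a storage-backed study that resumes automatically:

```python
# Hypothetical comparison: a resumable Optuna study, the rough counterpart of pickling
# hyperopt Trials and re-running fmin with a larger max_evals.
import optuna


def objective(trial: optuna.Trial) -> float:
    lr = trial.suggest_float('learning_rate', 1e-7, 0.1, log=True)
    # A real objective would train with `lr` and return e.g. the test IoU.
    return lr  # dummy value so the sketch is runnable end to end


study = optuna.create_study(
    study_name='gdl_hpo_example',             # made-up name
    storage='sqlite:///gdl_hpo_example.db',   # made-up path; trials persist on disk
    direction='maximize',
    load_if_exists=True,                      # re-running resumes instead of starting over
)
study.optimize(objective, n_trials=10)        # each invocation adds 10 more trials
```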
Modified file (training and evaluation script):
@@ -360,6 +360,8 @@ def evaluation(eval_loader,
     :param debug: if True, debug functions will be performed
     :return: (dict) eval_metrics
     """
+    dontcare = criterion.ignore_index if hasattr(criterion, 'ignore_index') else -1
+
     eval_metrics = create_metrics_dict(num_classes)
     model.eval()
@@ -409,12 +411,13 @@ def evaluation(eval_loader,
             a, segmentation = torch.max(outputs_flatten, dim=1)
             eval_metrics = iou(segmentation, labels_flatten, batch_size, num_classes, eval_metrics)
             eval_metrics = report_classification(segmentation, labels_flatten, batch_size, eval_metrics,
-                                                 ignore_index=eval_loader.dataset.dontcare)
-        elif (dataset == 'tst') and (batch_metrics is not None):
+                                                 ignore_index=dontcare)
+        elif dataset == 'tst':
+            batch_metrics = True
             a, segmentation = torch.max(outputs_flatten, dim=1)
             eval_metrics = iou(segmentation, labels_flatten, batch_size, num_classes, eval_metrics)
             eval_metrics = report_classification(segmentation, labels_flatten, batch_size, eval_metrics,
-                                                 ignore_index=eval_loader.dataset.dontcare)
+                                                 ignore_index=dontcare)

             logging.debug(OrderedDict(dataset=dataset, loss=f'{eval_metrics["loss"].avg:.4f}'))

Review comment (on the added `batch_metrics = True` line): force metrics at test time. I don't see any reason why we shouldn't systematically output metrics at test time.
@@ -428,10 +431,11 @@ def evaluation(eval_loader,
     if eval_metrics['loss'].avg:
         logging.info(f"\n{dataset} Loss: {eval_metrics['loss'].avg:.4f}")
     if batch_metrics is not None:
-        logging.info(f"\n{dataset} precision: {eval_metrics['precision'].avg}")
-        logging.info(f"\n{dataset} recall: {eval_metrics['recall'].avg}")
-        logging.info(f"\n{dataset} fscore: {eval_metrics['fscore'].avg}")
-        logging.info(f"\n{dataset} iou: {eval_metrics['iou'].avg}")
+        logging.info(f"\n{dataset} precision: {eval_metrics['precision'].avg}"
+                     f"\n{dataset} recall: {eval_metrics['recall'].avg}"
+                     f"\n{dataset} fscore: {eval_metrics['fscore'].avg}"
+                     f"\n{dataset} iou: {eval_metrics['iou'].avg}"
+                     f"\n{dataset} iou (non background): {eval_metrics['iou_nonbg'].avg}")

     return eval_metrics
@@ -742,6 +746,7 @@ def train(cfg: DictConfig) -> None:
     checkpoint = load_checkpoint(filename)
     model, _ = load_from_checkpoint(checkpoint, model)

+    return_metric = None
     if tst_dataloader:
         tst_report = evaluation(eval_loader=tst_dataloader,
                                 model=model,
@@ -763,9 +768,13 @@ def train(cfg: DictConfig) -> None:
             bucket.upload_file("output.txt", bucket_output_path.joinpath(f"Logs/{now}_output.txt"))
             bucket.upload_file(filename, bucket_filename)

+        return_metric = tst_report['iou'].avg
+
     # log_artifact(logfile)
     # log_artifact(logfile_debug)

+    return return_metric
+

 def main(cfg: DictConfig) -> None:
     """
@@ -790,4 +799,5 @@ def main(cfg: DictConfig) -> None:
     # HERE the code to do for the preprocessing for the segmentation

     # execute the name mode (need to be in this file for now)
-    train(cfg)
+    tst_iou = train(cfg)
+    return tst_iou
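The `tst_iou = train(cfg)` / `return tst_iou` change means the Hydra task function now returns a scalar, which is the convention Hydra sweepers (including the Optuna sweeper plugin) rely on: the returned value is the objective they optimize during a multirun sweep. A minimal, hypothetical illustration of that convention follows; the config path, config name, and stub `train` are assumptions, not this repo's actual layout.

```python
# Minimal sketch of the Hydra convention enabled by this change: the value returned by the
# @hydra.main task function is what a sweeper optimizes during a --multirun sweep.
# config_path/config_name and the stub train() are assumptions for illustration.
import hydra
from omegaconf import DictConfig


def train(cfg: DictConfig) -> float:
    """Stand-in for the real training/evaluation loop; returns the metric to optimize."""
    return 0.0


@hydra.main(config_path="config", config_name="default")
def main(cfg: DictConfig) -> float:
    tst_iou = train(cfg)
    return tst_iou  # a sweeper maximizes or minimizes this value


if __name__ == "__main__":
    main()
```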
Review comment (on the added `dontcare` line in evaluation()): fixes a missing-attribute bug. The dontcare value was previously stored in the dataloader object, but isn't anymore (since when?).
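For readers unfamiliar with the pattern in the hunk above: the fix reads the ignore value from the criterion when it exposes one and falls back to -1, instead of reaching into the dataloader. A tiny, self-contained illustration (the custom loss class is hypothetical):

```python
# Illustration of the hasattr fallback from the diff: built-in losses such as
# nn.CrossEntropyLoss expose `ignore_index`; a custom criterion may not.
import torch
import torch.nn as nn


class CustomLossStub(nn.Module):
    """Hypothetical custom loss with no ignore_index attribute."""
    def forward(self, logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        return logits.mean()  # placeholder computation


for criterion in (nn.CrossEntropyLoss(ignore_index=255), CustomLossStub()):
    dontcare = criterion.ignore_index if hasattr(criterion, 'ignore_index') else -1
    print(type(criterion).__name__, dontcare)  # prints 255, then -1
```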