
default argparser #916

Closed · williamFalcon opened this issue Feb 23, 2020 · 8 comments

Labels: feature (Is an improvement or enhancement), good first issue (Good for newcomers)

@williamFalcon (Contributor) commented Feb 23, 2020

🚀 Feature

Create a default argparser with all the properties that can go into a Trainer.

Motivation

People already do this themselves pretty often, so we might as well make it easy for them.

williamFalcon added the feature (Is an improvement or enhancement) and help wanted (Open to be worked on) labels on Feb 23, 2020
Borda added the good first issue (Good for newcomers) label on Feb 23, 2020
@mtnwni (Contributor) commented Feb 24, 2020

    from argparse import ArgumentParser


    def str2bool(value):
        # argparse calls type() on the raw string, and bool('False') is True,
        # so boolean flags need an explicit string-to-bool conversion
        return str(value).lower() in ('1', 'true', 'yes', 'y')


    def add_default_args(parent_parser):
        parser = ArgumentParser(parents=[parent_parser])

        # training, test, val check intervals
        parser.add_argument('--max_epochs', default=1000, type=int, help='maximum number of epochs')
        parser.add_argument('--min_epochs', default=1, type=int, help='minimum number of epochs')
        parser.add_argument('--max_steps', default=None, type=int,
                            help='stop training after this number of steps')
        parser.add_argument('--min_steps', default=None, type=int,
                            help='force training for at least this number of steps')
        parser.add_argument('--check_val_every_n_epoch', default=1, type=int, help='check val every n epochs')
        parser.add_argument('--accumulate_grad_batches', default=1, type=int,
                            help='accumulates gradients k times before applying update.'
                                 ' Simulates huge batch size. (The Trainer also accepts a'
                                 ' Dict[int, int] schedule, which cannot be expressed on the CLI.)')
        parser.add_argument('--train_percent_check', default=1.0, type=float,
                            help='how much of training set to check')
        parser.add_argument('--val_percent_check', default=1.0, type=float,
                            help='how much of val set to check')
        parser.add_argument('--test_percent_check', default=1.0, type=float,
                            help='how much of test set to check')

        parser.add_argument('--val_check_interval', default=1.0, type=float,
                            help='how often within 1 epoch to check val')
        parser.add_argument('--log_save_interval', default=100, type=int,
                            help='how many batches between log saves')

        # early stopping (the Trainer also accepts an EarlyStopping instance,
        # but only the bool form is expressible on the CLI)
        parser.add_argument('--early_stop_callback', default=None, type=str2bool)

        # gradient handling
        parser.add_argument('--gradient_clip_val', default=0, type=float)
        parser.add_argument('--track_grad_norm', default=-1, type=int,
                            help='if > 0, will track this grad norm')
        parser.add_argument('--print_nan_grads', default=False, type=str2bool,
                            help='prints gradients with nan values')

        # model (checkpoint_callback also accepts a ModelCheckpoint instance and
        # profiler a BaseProfiler instance; only the bool forms work on the CLI)
        parser.add_argument('--resume_from_checkpoint', default=None, type=str,
                            help='resumes training from a checkpoint')
        parser.add_argument('--checkpoint_callback', default=True, type=str2bool,
                            help='callback for checkpointing')
        parser.add_argument('--truncated_bptt_steps', default=None, type=int,
                            help='truncated backprop through time: performs backprop every k steps')
        parser.add_argument('--num_sanity_val_steps', default=5, type=int,
                            help='runs n batches of val as a sanity check before starting training')
        parser.add_argument('--process_position', default=0, type=int,
                            help='orders the tqdm bar')
        parser.add_argument('--show_progress_bar', default=True, type=str2bool,
                            help='if true shows tqdm progress bar')
        parser.add_argument('--distributed_backend', default=None, type=str,
                            help='the distributed backend to use')
        parser.add_argument('--weights_summary', default='full', type=str,
                            help='prints a summary of the weights when training begins')
        parser.add_argument('--profiler', default=None, type=str2bool,
                            help='to profile individual steps during training and assist in'
                                 ' identifying bottlenecks')

        # model path
        parser.add_argument('--default_save_path', default=None, type=str,
                            help='default path for logs and weights')
        parser.add_argument('--weights_save_path', default=None, type=str,
                            help='where to save weights if specified')

        # GPU ('--gpus' is passed as a string, e.g. '2' or '0,1';
        # the Trainer accepts int, str, or list forms)
        parser.add_argument('--gpus', default=None, type=str)
        parser.add_argument('--num_nodes', default=1, type=int)
        parser.add_argument('--num_tpu_cores', default=None, type=int)
        parser.add_argument('--use_amp', default=False, type=str2bool)
        parser.add_argument('--check_grad_nans', action='store_true')

        # fast training
        parser.add_argument('--fast_dev_run', default=False, type=str2bool,
                            help='runs validation after 1 training step')
        parser.add_argument('--overfit_pct', default=0.0, type=float,
                            help='%% of dataset to use with this option. float, or -1 for none')

        # log (logger also accepts a LightningLoggerBase instance;
        # only the bool form works on the CLI)
        parser.add_argument('--logger', default=True, type=str2bool)
        parser.add_argument('--log_gpu_memory', default=None, type=str)
        parser.add_argument('--row_log_interval', default=10, type=int,
                            help='add log every k batches')

        return parser

Would something like this work as a static method of Trainer, or as a util?
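
For reference, a minimal usage sketch under the assumption that add_default_args is exposed as a util; the model_parser name and its --learning_rate flag are hypothetical:

    from argparse import ArgumentParser

    # model-specific flags live on a parent parser created with add_help=False,
    # so the child parser built by add_default_args can add its own -h/--help
    model_parser = ArgumentParser(add_help=False)
    model_parser.add_argument('--learning_rate', default=1e-3, type=float)

    parser = add_default_args(model_parser)
    args = parser.parse_args()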

@Borda (Member) commented Feb 24, 2020

@skepticleo looks good, could you send a PR? 🤖

@XDynames (Contributor) commented Feb 24, 2020

This would go really nicely with a classmethod/overload that constructs the Trainer from the passed arguments.

@classmethod
def from_default_args(cls, args):
    # unpack the parsed Namespace into constructor keyword arguments
    return cls(**vars(args))

Then the user's pattern becomes:

args = parser.parse_args()
trainer = Trainer.from_default_args(args)

Borda removed the help wanted (Open to be worked on) label on Feb 24, 2020
@XDynames (Contributor) commented Mar 2, 2020

Some of the default values in the arguments are set to None to comply with the Trainer's constructor. In my case I often pass and save my args into the LightningModule as self.hparams. When an argument in the Namespace object has a value of None, TensorBoard will raise a ValueError, since None is not an int, float, str, bool, or torch.Tensor.

By modifying the defaults to acceptable alternatives (e.g. default=0 for --gpus) this issue can be avoided, assuming a workaround currently exists for each Trainer parameter.
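
As a sketch of that defaults change (only --gpus shown; type=int here assumes the single-integer form is sufficient):

    # TensorBoard-safe default: 0 instead of None, i.e. CPU unless overridden
    parser.add_argument('--gpus', default=0, type=int,
                        help='number of GPUs to train on')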

@Borda (Member) commented Mar 3, 2020

@XDynames could you check #1023 and send a PR with adjustments?

@XDynames (Contributor) commented Mar 4, 2020

@Borda Honestly, I wasn't sure which way would be best to address it. Locally I have just added a dictionary to catch the cases that cause the issue, but that is hard to maintain.

Ultimately it would be better to adjust the default arguments in the constructor from None to some other value, but I don't have enough scope to fully understand the implications of that for legacy support or the constructor code.

I also had a pending pull request on @skepticleo's fork that parsed the constructor docstring to extract one-line help messages for the arguments. Should I look to include this as well?

@Borda (Member) commented Mar 4, 2020

You can convert the params to a dictionary (temporarily) and filter out the items that are None...
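
A minimal sketch of that filter, assuming args is the Namespace returned by parser.parse_args():

    from argparse import Namespace

    # go through a dict temporarily, drop the None entries, rebuild the Namespace
    hparams = Namespace(**{k: v for k, v in vars(args).items() if v is not None})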

@XDynames (Contributor) commented Mar 4, 2020

Sure, but then the user can't use some of the arguments, like --gpus.
If we actually deal with them case by case, we end up having to store a mapping between None arguments and their None-equivalent defaults (sketched below). Some examples of this would be:
gpus = 0
profiler = False
default_save_path = os.getcwd()

And then in some cases there are explicit checks for a NoneType in the constructor, like for distributed_backend, so there is no good default mapping for them other than None.

So you end up either filtering them before saving as hparams but after instantiating the Trainer (which causes you to lose some information about your settings), or not including them in the default argparser, which removes a large amount of common use from it: multi-GPU settings, save paths, etc.

I can't see a good band-aid here; it requires removing NoneType defaults from the constructor's signature and repairing whatever that impacts.
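
For illustration, the kind of case-by-case mapping described above (hypothetical, and hard to maintain for exactly the reasons given):

    import os

    # stand-ins for Trainer arguments whose default is None;
    # distributed_backend has no safe equivalent, so it stays unmapped
    NONE_EQUIVALENTS = {
        'gpus': 0,
        'profiler': False,
        'default_save_path': os.getcwd(),
    }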
