🚀 Feature
Currently, a lot of training functionality is tied to the concept of "Epoch".
IMHO, training based on epochs is bad practice, and I always train using the raw number of iterations.
PL makes this approach very difficult.
Motivation
Currently, it is very difficult to use PL with iteration-based training, e.g. the Trainer currently resumes from a checkpoint at the start of the next epoch, rather than at the iteration at which the checkpoint was made.
In the real world, the composition of the training set changes pretty frequently, and training based on epochs can give misleading results. E.g., if I create a new dataset by duplicating every item in my original one and train for the same number of epochs, I will probably get a better result, which might make me think the new data improved things, when in reality I just trained for longer.
Training on very large datasets can take days or weeks. When resources are constrained I need to schedule time for each experiment, and having to manually calculate the number of epochs that fit in a time period is very annoying.
Pitch
Add proper support for iteration-based training. The easiest approach would be to keep track of the total number of iterations across epochs and allow the various parts/callbacks to use it.
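To make the pitch concrete, here is a minimal sketch of a callback that keeps its own iteration counter across epochs and fires every N steps. The class name `EveryNIterations`, the `every_n_steps` argument, and the exact hook signature are assumptions (hook signatures differ between PL versions); it is only meant to illustrate the kind of hook access this feature would need.

```python
import pytorch_lightning as pl


class EveryNIterations(pl.Callback):
    """Hypothetical callback that acts on a global iteration count."""

    def __init__(self, every_n_steps: int = 1000):
        self.every_n_steps = every_n_steps
        self.total_steps = 0  # persists across epoch boundaries

    def on_train_batch_end(self, trainer, pl_module, outputs, batch, batch_idx):
        # Count every training batch, regardless of which epoch it belongs to.
        self.total_steps += 1
        if self.total_steps % self.every_n_steps == 0:
            # Anything iteration-based could hook in here: logging,
            # checkpointing, or triggering a validation run.
            pl_module.log("total_steps", float(self.total_steps))
```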
Alternatives
You could also switch completely to iteration-based training and calculate epochs from the length of the dataset, but that would probably require quite a large rewrite for roughly the same result.
Additional context
Current problems:
- Not possible to run validation at an interval longer than one epoch
- If the dataset is very large and training runs for only a few epochs, resuming can significantly worsen the results by starting from the next epoch
- When `max_steps` is passed to `pl.Trainer`, it would be nice if it was used in `tqdm`.
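For reference, a purely iteration-based run might look roughly like the sketch below. `max_steps` and `val_check_interval` are existing `pl.Trainer` arguments, but whether they cover these cases (validation interval longer than one epoch, step-exact resume) is exactly what this issue is about; `resume_from_checkpoint` refers to the constructor argument available in older PL versions, and all values are illustrative.

```python
import pytorch_lightning as pl

# Sketch of the desired, epoch-free workflow (values are illustrative):
trainer = pl.Trainer(
    max_steps=200_000,          # total budget in optimizer steps, not epochs
    val_check_interval=5_000,   # ideally: validate every 5k steps, even if
                                # that is longer than one epoch
    resume_from_checkpoint="last.ckpt",  # ideally: resume at the exact step,
                                         # not at the start of the next epoch
)
# trainer.fit(model)  # model defined elsewhere
```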