Hyperopt loss #1726
Conversation
Description of what happens with hyperoptimization

For the benefit of my future self and my colleagues @goord and @Cmurilochem I describe here exactly what happens during hyperoptimization. It mainly describes how it works in master; so far this branch only changes it on two points, as indicated.

Single Trial

A "trial" is the entire evaluation of one set of hyperparameters. One trial in the code is one execution of the function …

Folds

Trials are scored by their … For each fold, the datasets in that fold are left out completely, and the model is trained as normal on the remaining k-1 folds (masked with the training mask).

Early exit 1

Then the …

Early exit 2

…

Loss

We get …

Managing trials

Results are saved in … The trials start from the function … Inside …

Possible issue

On a test with 2 trials varying only the nodes per layer, it tested (27, 12, 8) and then (30, 36, 8), but the best parameters printed were the default (25, 20, 8). I'm not sure what exactly happens in this final call to …

Some thoughts on improvements

Parallelizing trials, checkpointing

Trials can be run in parallel using MongoDB, as described here.

K-folding

When using K-folding, we now have 3 classes of data at every fold:
1. the training data within the k-1 folds;
2. the validation data within those k-1 folds;
3. the held-out fold itself.

Both 2 and 3 are not seen during training. The difference in how they are used is that 2 is used for the "early exit 1" above, while 3 is used to compute the hyper loss. If that were all, I'd say 2 is wasted, as this condition is very unlikely to trigger even with one replica.

We also discussed a long time ago that perhaps K-folding won't be necessary anymore. The point of it is to have a more accurate validation loss, but if we run with, say, 100 replicas, it is already more accurate. We would still need 3 groups of data, to avoid the bias of joining 2 and 3; however, they could be split randomly per dataset (per replica). The advantage would be that it'd require only a single run per trial (slightly bigger, but still nearly a 4x speedup), and it would simplify the code. Not sure though what the effect on the learning would be.
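To tie the above together, here is a minimal sketch of the per-trial flow; all names are illustrative, not the actual n3fit API:

```python
from statistics import mean
from typing import Callable, Sequence

def run_trial(
    hyperparams: dict,
    folds: Sequence,
    train_model: Callable,  # trains on the k-1 remaining folds
    hyper_loss: Callable,   # scores a model on the held-out fold
) -> float:
    """One trial = one full k-fold evaluation of one hyperparameter set."""
    fold_losses = []
    for k, held_out in enumerate(folds):
        # The datasets in fold k are left out completely; the model is
        # trained as normal on the remaining k-1 folds.
        remaining = [f for i, f in enumerate(folds) if i != k]
        model = train_model(hyperparams, remaining)
        # The held-out fold only enters through the hyper loss.
        fold_losses.append(hyper_loss(model, held_out))
    # Aggregate the per-fold losses into the number hyperopt minimises.
    return mean(fold_losses)
```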
Code improvements
Implementation of a new metric

As a continuation of the work by @APJansen, I intend to add a new hyperoptimization metric to the … where the first term represents our usual averaged-over-replicas hyper loss, …
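For reference, the quantity behind this proposal is the usual φ estimator; with the validphys computation described further down in this thread, for a fold $k$ with $N_{\mathrm{data}}$ points it reads:

$$\varphi^2_{k} = \frac{\langle \chi^2 \rangle_{\mathrm{rep}} - \chi^2_{\mathrm{central}}}{N_{\mathrm{data}}},$$

where $\langle \chi^2 \rangle_{\mathrm{rep}}$ is the per-replica χ² averaged over replicas and $\chi^2_{\mathrm{central}}$ is computed from the average of the replica predictions.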
Refactoring
We talked again about refactoring this, and we decided to keep the changes minimal for now, just to allow implementation of the phi2 loss. We can refactor the code around hyperoptimization inside model_trainer later, once we have things running. We made some assumptions, listed below; are these correct @scarlehoff?

Assumptions

Previously … So now we assume instead that the losses are first computed per fold, so there is already some kind of aggregation over the replicas. This is done in the training loop, since training is quit early if the loss is too high.

And then there are also penalties, which currently are all computed per replica. We assumed for now that this is general enough, and that we always want to take an average over the per-replica penalties and add this to the loss (see the sketch below).
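A small sketch of the assumed combination (hypothetical names, not the model_trainer code):

```python
import numpy as np

def fold_total_loss(fold_loss: float, replica_penalties: np.ndarray) -> float:
    """Assumed scheme: the loss is already aggregated per fold, while the
    penalties arrive per replica and are averaged before being added."""
    return fold_loss + float(np.mean(replica_penalties))
```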
Ah, since there were no questions I thought that the example I put there was enough. Not sure what to add (i.e., not sure what the showstoppers are?)
The thing is that, while …

Regarding the assumptions:
No. But at the end of the fold (when you have an ensemble of models) you can easily create another model which is the average of all previous models (ideally discarding outliers) over the axis of the replicas. With that you can compute the phi (see the sketch below).
Yes. Let's not care about the penalties for the time being. I don't think they are needed.
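At the level of predictions, the averaged model suggested above could look like this (a numpy sketch under the assumption that replica predictions come as an array; the trimming rule is only an example):

```python
import numpy as np

def replica_average(preds: np.ndarray, discard: int = 0) -> np.ndarray:
    """preds has shape (n_replicas, n_data). Optionally discard the
    `discard` most outlying replicas before averaging over the replica
    axis, giving one central prediction per data point."""
    if discard:
        distance = np.linalg.norm(preds - preds.mean(axis=0), axis=1)
        preds = preds[np.argsort(distance)[: len(preds) - discard]]
    return preds.mean(axis=0)
```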
Start implementation of …

1. `results` function

```python
res = results(groupdataset, n3pdfs, covmat, sqrt_covmat(covmat))
```

The `results` function will output a tuple composed of:

- an instance of `DataResult`, which holds the relevant information for a given group exp dataset
- an instance of `ThPredictionsResult`, which holds the corresponding theory predictions from the multi-replica PDF. The attribute `ThPredictionsResult.rawdata` stores these theory predictions for each replica.
2. `abs_chi2_data` function

This function uses the output of `results` (the above tuple) to calculate:

- `all_chi2`: the usual chi2, calculated as the sum of squared differences between the predicted `ThPredictionsResult.rawdata` and the central experimental observables, for each replica and group exp dataset. `all_chi2` is then used to instantiate `ThPredictionsResult.stats_class`, which in our case is always `N3Stats`; see also here.
- `central_chi2`: differently from `all_chi2`, this corresponds to the chi2 calculated as the sum of squared differences between `ThPredictionsResult.central_value`, i.e., the average of the predictions of each replica PDF, and the central experimental observables for each group exp dataset.

These calculated quantities are returned, together with the number of exp data points, in the form of a namedtuple named `Chi2Data`.
3. `phi_data` function

Finally, `phi_data` just uses the output data of `abs_chi2_data`. It calculates phi by:

- averaging `all_chi2` over replicas using `N3Stats.error_members().mean()`;
- subtracting `central_chi2` from `<all_chi2>` and dividing the result by the number of exp data points within the group exp dataset;
- taking the sqrt of [(`<all_chi2>` - `central_chi2`)/number of exp data points] (see the sketch below).
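Numerically, the three steps boil down to something like the following plain-numpy sketch (not the actual validphys code):

```python
import numpy as np

def phi_from_chi2(all_chi2: np.ndarray, central_chi2: float, ndata: int) -> float:
    """all_chi2 holds one chi2 per replica for the group exp dataset."""
    mean_chi2 = all_chi2.mean()          # step 1: <all_chi2> over replicas
    diff = mean_chi2 - central_chi2      # step 2: subtract central_chi2
    return float(np.sqrt(diff / ndata))  # step 3: sqrt of the ratio
```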
Question regarding added tests

@scarlehoff, I added new tests into test_hyperopt.py. To test the calculation of …
Make sure the seeds are also fixed for numpy. Note that when the fit is set to …
However, while the tests should be robust on linux (they empirically seem to be), I'm not so sure about mac m1 (i.e., if you are locally running on a mac and then trying to test in the CI... not sure what will happen).
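For example, a test can fix the numpy seed and avoid bit-exact comparisons (a generic sketch, not the tests added in this PR):

```python
import numpy as np

def test_loss_is_reproducible():
    def compute():
        rng = np.random.default_rng(42)  # numpy seed fixed explicitly
        chi2 = rng.normal(loc=2.0, scale=0.1, size=(4, 100))  # folds x replicas
        return chi2.mean(axis=1)

    # Compare with tolerances rather than exact equality, since bit-level
    # results may differ between platforms (linux CI vs a local mac m1).
    np.testing.assert_allclose(compute(), compute(), rtol=1e-10)
```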
Thanks @scarlehoff. I have added a custom …
Can you add documentation for the different losses/statistics that can be optimised and how the runcard can be changed to use them?
Done. Just added docs in f88fbb5
Thanks, the docs look good
Improving hyperoptimization, experimenting with different hyperoptimization loss functions
Tasks done in this PR
- `HyperLoss` class with built-in methods that can automatically perform statistics over replicas and then folds. The user can select statistics via `replica_statistic` and `fold_statistic` in the runcard `kfold` section.
- `loss_type` option in the runcard `kfold` section.
- Saving into the `Hyperopt` trials file `tries.json` (specifically within the `kfold_meta` entry) a matrix (folds x replicas) of calculated `hyper_losses_chi2` and a vector (folds) of `hyper_losses_phi2`, as illustrated below.
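Schematically, the stored entry could look like this (illustrative values for 4 folds and 3 replicas; not the exact file layout):

```python
# Shape of the quantities saved under the kfold_meta entry of tries.json.
kfold_meta = {
    "hyper_losses_chi2": [   # matrix: folds x replicas
        [2.1, 2.3, 2.2],
        [2.0, 2.4, 2.1],
        [2.2, 2.2, 2.3],
        [2.1, 2.5, 2.0],
    ],
    "hyper_losses_phi2": [0.31, 0.28, 0.35, 0.30],  # vector: one per fold
}
```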
Description
The implemented `HyperLoss` is instantiated within `ModelTrainer` and later on used in `ModelTrainer.hyperparametrizable`. The user must pass three parameters that are set in the runcard (a sketch of how they combine is given below):

- `loss_type`: the type of loss to be used. Options are `chi2` or `phi2`.
- `replica_statistic`: the statistic over replicas to be used within each fold. For `loss_type = chi2`, it can assume the usual statistics: `average`, `best_worst` and `std`. Note: `replica_statistic` is inactive if `loss_type = phi2`, as $\varphi^2$ is already an aggregate over replicas.
- `fold_statistic`: the statistic over folds. Options are: `average`, `best_worst` and `std`.

For `loss_type: phi2` and `fold_statistic: average`, the figure of merit to be minimised is actually $L_{\mathrm{hyperopt}} = \left( \frac{1}{n_{\mathrm{folds}}} \sum_{k=1}^{n_{\mathrm{folds}}} \varphi^2_{k} \right)^{-1}$.

The current implementation of $\varphi^2_{k}$ is based on `validphys` functions. It is evaluated using only experimental data within the held-out fold (as expected).
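As a rough standalone sketch of how these three parameters could combine (hypothetical code, not the actual `HyperLoss` implementation; in particular the `best_worst` mapping is a guess):

```python
import numpy as np

STATS = {"average": np.mean, "best_worst": np.max, "std": np.std}

def hyper_loss(chi2: np.ndarray, phi2: np.ndarray, loss_type: str,
               replica_statistic: str, fold_statistic: str) -> float:
    """chi2 has shape (n_folds, n_replicas); phi2 has shape (n_folds,)."""
    if loss_type == "chi2":
        per_fold = STATS[replica_statistic](chi2, axis=1)  # reduce replicas
        return float(STATS[fold_statistic](per_fold))      # reduce folds
    # phi2 already aggregates over replicas; larger phi2 is better, so the
    # minimised figure of merit is the reciprocal of the fold statistic.
    return float(1.0 / STATS[fold_statistic](phi2))
```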
Runcard examples

- `hyper_loss` is set as the …
- `hyper_loss` as the inverse of the max value of …

Notes
It must be merged after #1788, as the current `hyperopt_loss` branch has been created from `trvl-mask-layers`.