Currently, we have no serious stress test for our record IO in place - the examples all write structs of standard data types, which are really simple to deal with.
This potentially changes with the merge of #103, which adds a parameter struct to the records. That is absolutely necessary for keeping track of what actually went into the benchmarks, but it implicitly sets us up for a serious problem: how do we deal with results for benchmarks that take non-standard data types like models, functions, algorithms, etc.?
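Consider the following example (a minimal sketch of the situation; `MyModel`, the `accuracy` benchmark, and the record layout are hypothetical stand-ins, not actual nnbench code):

```python
import json
from dataclasses import dataclass


@dataclass
class MyModel:
    """Stand-in for a trained model object."""
    weights: list[float]


def accuracy(model: MyModel) -> float:
    """A benchmark that takes a non-standard data type as a parameter."""
    ...  # evaluation logic omitted


# A record as it might come out of a benchmark run: the parameter struct
# now contains the model object itself, not just standard data types.
record = {
    "name": "accuracy",
    "value": 0.93,
    "parameters": {"model": MyModel(weights=[0.1, 0.2])},
}

json.dumps(record)  # TypeError: Object of type MyModel is not JSON serializable
```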
Serializing a record coming out of a benchmark run including `accuracy` can potentially be really challenging, since it is unclear how to represent `MyModel` in a written record.
There are multiple ways around this. First, there is the option of requiring the user to make their records conform (i.e. contain only standard data types), but this means extra work for them, and can break reproducibility if the chosen representation is not itself reproducible. Adding serializer hooks for custom data types is another option, but it is convoluted and a lot of work.
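A serializer hook along those lines could look roughly like this (continuing the sketch above; the `default=` hook of `json.dumps` is used purely for illustration, and the digest representation is made up):

```python
import json


def encode_custom(obj):
    """Hypothetical serializer hook: map custom types onto standard ones."""
    if isinstance(obj, MyModel):
        # One possible convention: type name plus a digest of the contents.
        return {"__type__": "MyModel", "digest": hash(tuple(obj.weights))}
    raise TypeError(f"cannot serialize object of type {type(obj).__name__}")


serialized = json.dumps(record, default=encode_custom)
```

Every such hook has to be written, registered, and kept consistent between writers and readers of the records, which is where the extra work comes from.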
Then there is the option of making the benchmarks take a unique identifier for the model instead, which is a standard data type (e.g. a hash, a remote URI, etc.), and loading the artifact just in time for the user to access it. This should mean easier reading/writing of records, but requires more code for the benchmark setup. It also requires us to change our story for the model artifact benchmarks, where we would need to come up with a way to efficiently instantiate models based on such an identifier.
We should be able to use a setup task for the benchmarks to accomplish this: a cache lookup for the proper artifact, loaded before the start of all benchmarks, seems like a good approach, as sketched below.
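A rough sketch of what that could look like (continuing the sketch above; the cache, the `load_model` loader, and the setup/benchmark function names are all hypothetical, not existing nnbench APIs):

```python
# Hypothetical artifact cache, populated once by a setup task before any
# benchmark runs: identifier (e.g. a hash or remote URI) -> loaded model.
_ARTIFACT_CACHE: dict[str, MyModel] = {}


def load_model(identifier: str) -> MyModel:
    """Hypothetical loader: fetch and instantiate the model behind an identifier."""
    ...


def setup_artifacts(identifiers: list[str]) -> None:
    """Setup task: resolve and load all required artifacts up front."""
    for ident in identifiers:
        if ident not in _ARTIFACT_CACHE:
            _ARTIFACT_CACHE[ident] = load_model(ident)


def accuracy(model_id: str) -> float:
    """The benchmark now takes a standard data type (a string identifier)."""
    model = _ARTIFACT_CACHE[model_id]  # just-in-time access to the artifact
    ...  # evaluate the model as before
```

The record's parameter struct then only contains the identifier, which is trivially serializable.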
Summary of developments that have since happened to address this issue:
Regarding serialization of parameters, a `nnbench.transform` submodule has been checked in with "Add transform submodule, parameter compression transform" (#124), which contains a barebones parameter serialization transform. If it is not sufficient for advanced use cases or custom types, users can follow the spirit of the transform class to implement their own, as sketched below.
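In that spirit, a custom transform for the hypothetical `MyModel` from above might look roughly like this (a sketch only; the actual interface of the transform class in `nnbench.transform` may differ, and the `apply` method name and record layout are assumptions):

```python
class ModelDigestTransform:
    """Sketch of a record transform: replace model objects in the parameter
    struct with a serializable digest before the record is written."""

    def apply(self, record: dict) -> dict:
        params = dict(record.get("parameters", {}))
        for key, value in params.items():
            if isinstance(value, MyModel):
                # Same made-up convention as in the serializer hook above.
                params[key] = {"__type__": "MyModel", "digest": hash(tuple(value.weights))}
        return {**record, "parameters": params}
```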