Set up best practice for easy record serialization + parametrization + reproducibility #105

Closed
nicholasjng opened this issue Mar 11, 2024 · 1 comment · Fixed by #137
Labels: urgent (Needs to be worked on ASAP.)

nicholasjng (Collaborator) commented Mar 11, 2024
Currently, we have no serious stress test for our record IO in place; the examples all write structs of standard data types, which are simple to deal with.

This potentially changes with the merge of #103, which adds a parameter struct to the records. That is necessary for keeping track of what actually went into the benchmarks, but it implicitly sets us up for a serious problem: how do we deal with results for benchmarks that take non-standard data types like models, functions, algorithms, etc.?

Consider the following example:

from typing import Any

import nnbench

class MyModel:
    ...

@nnbench.benchmark
def accuracy(m: MyModel, data: Any) -> float:
    ...

Serializing a record coming out of a benchmark run that includes accuracy can be genuinely challenging, since it is unclear how to represent MyModel in a written record.

There are multiple ways around this. First, there is the option of requiring users to make their records conform, but this means extra work for them, and it can break reproducibility if the chosen representation is not itself reproducible. Adding serializer hooks for custom data types is another option, but it is convoluted and a lot of work.
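
For illustration, a minimal sketch of what such a serializer hook could look like, here using json.dumps and its default parameter; the MyModel representation chosen below is a placeholder, and picking it well is exactly the part that is hard to get right:

import json
from typing import Any

class MyModel:
    ...

def serialize_custom(obj: Any) -> Any:
    # Hypothetical hook: map custom types to something JSON can represent.
    # Choosing a faithful, reproducible representation is the hard part.
    if isinstance(obj, MyModel):
        return {"type": "MyModel", "repr": repr(obj)}
    raise TypeError(f"cannot serialize object of type {type(obj).__name__}")

record = {"benchmark": "accuracy", "value": 0.93, "params": {"m": MyModel()}}
print(json.dumps(record, default=serialize_custom))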

Then there is the option of making the benchmarks take a unique identifier for the model instead, which is a standard data type (e.g. a hash, a remote URI, etc.), and loading the artifact just in time for the user to access. This should make reading and writing records easier, but requires more code for the benchmark setup. It also requires us to change our story for the model artifact benchmarks, where we would need to come up with a way to efficiently instantiate models from such an identifier.
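
A minimal sketch of that approach, assuming a hypothetical load_model helper that resolves the identifier to an in-memory model; only the serializable string key ends up in the record:

from typing import Any

import nnbench

class MyModel:
    ...

def load_model(model_key: str) -> MyModel:
    # Hypothetical resolver: turn a content hash or remote URI into an
    # in-memory model, e.g. by downloading and deserializing the artifact.
    raise NotImplementedError

@nnbench.benchmark
def accuracy(model_key: str, data: Any) -> float:
    # The benchmark receives only a plain string key; the actual model
    # is materialized just in time inside the benchmark.
    model = load_model(model_key)
    ...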

We should be able to use a setup task for the benchmarks to accomplish this: a cache lookup for the proper artifact, loaded once before the start of all benchmarks, seems like a good approach.
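
A rough sketch of such a setup task, assuming a plain interpreter-local dict as the cache and a setup function that runs before the benchmarks; the actual hook mechanics in nnbench may look different:

from typing import Any

import nnbench

# Assumption: a plain module-level dict stands in for whatever
# interpreter-local artifact cache nnbench eventually provides.
_ARTIFACT_CACHE: dict[str, Any] = {}

def load_model(model_key: str) -> Any:
    # Hypothetical loader resolving a hash or remote URI to a model object.
    raise NotImplementedError

def setup_artifacts(model_key: str) -> None:
    # Intended to run once before all benchmarks: resolve the identifier
    # and keep the loaded artifact around for the duration of the run.
    if model_key not in _ARTIFACT_CACHE:
        _ARTIFACT_CACHE[model_key] = load_model(model_key)

@nnbench.benchmark
def accuracy(model_key: str, data: Any) -> float:
    # Benchmarks only perform a cheap dictionary lookup instead of
    # reloading the artifact for every benchmark function.
    model = _ARTIFACT_CACHE[model_key]
    ...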

nicholasjng added the urgent (Needs to be worked on ASAP.) label on Mar 11, 2024
nicholasjng (Collaborator, Author) commented Mar 26, 2024

Summary of developments that have since happened to address this issue:

  1. Regarding serialization of parameters, an nnbench.transform submodule was added in Add transform submodule, parameter compression transform #124, containing a barebones parameter serialization transform. If it is not sufficient for advanced uses or custom types, users can follow the spirit of the transform class to implement their own.
  2. Just-in-time artifact loading has been implemented using Memos in Feature: Thunks as an alternative to artifacts #120 (a conceptual sketch follows below). What is left is caching the artifacts in an interpreter-local cache, and controlling the lifetimes of big cached artifacts via garbage collection in setup and teardown tasks (see State injection into setup and teardown #127 and Add global memo cache and integrate with the setUp and teardown injection #130).
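
As a rough illustration of the Memo idea, not nnbench's actual Memo API (the names and signatures below are assumptions): the memo carries a cheap, serializable key and defers the expensive load until the value is first requested, caching it afterwards.

from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")

class LazyArtifact(Generic[T]):
    # Conceptual stand-in for a Memo: the key (e.g. a hash or URI) is what
    # gets recorded, while the heavyweight value is loaded lazily and cached.

    def __init__(self, key: str, loader: Callable[[str], T]) -> None:
        self.key = key
        self._loader = loader
        self._value: Optional[T] = None

    def __call__(self) -> T:
        if self._value is None:
            self._value = self._loader(self.key)
        return self._value

An interpreter-local cache plus explicit eviction in setup/teardown, as tracked in #127 and #130, would then bound the memory held by such cached values.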
