feat: vault #5

Merged
merged 34 commits into from
Feb 5, 2024

@callumtilbury callumtilbury commented Dec 9, 2023

What?

Vault is an efficient mechanism for saving flashbax buffers to persistent data storage.

How?

Vault uses tensorstore, which is also used by Google's orbax checkpointing library. Tensorstore can read and write slices of an array without touching the rest of it. For example, I could have a 100TB file saved on disk and still access its first element easily and efficiently. Building on this feature, flashbax buffers are saved in the form (Batch, Time, Experience), and we slice the data along the time axis.
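
To make the slicing idea concrete, here is a minimal sketch that uses only the Python standard library rather than tensorstore itself. It assumes a row-major float32 file laid out as (Batch, Time, Experience) and reads a time slice by seeking to the right byte offset, so the rest of the file is never loaded. The helper names `write_flat` and `read_time_slice` are hypothetical and are not part of vault or tensorstore.

```python
import os
import struct
import tempfile

ITEM = struct.calcsize("<f")  # bytes per float32 element (4)


def write_flat(path, values):
    """Write a flat list of floats to disk as little-endian float32."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(values)}f", *values))


def read_time_slice(path, shape, b, t_start, t_stop):
    """Read timesteps [t_start, t_stop) of batch row b from a (B, T, E) file.

    Only the requested bytes are read; the rest of the file is skipped via seek.
    """
    _, T, E = shape
    offset = (b * T + t_start) * E * ITEM
    count = (t_stop - t_start) * E
    with open(path, "rb") as f:
        f.seek(offset)
        flat = struct.unpack(f"<{count}f", f.read(count * ITEM))
    return [list(flat[i * E:(i + 1) * E]) for i in range(t_stop - t_start)]


# A tiny (Batch=2, Time=4, Experience=3) array filled with 0..23.
shape = (2, 4, 3)
values = [float(i) for i in range(2 * 4 * 3)]
path = os.path.join(tempfile.mkdtemp(), "buffer.bin")
write_flat(path, values)

first = read_time_slice(path, shape, b=0, t_start=0, t_stop=1)
later = read_time_slice(path, shape, b=1, t_start=2, t_stop=4)
print(first)
print(later)
```

Because the time axis is contiguous within each batch row in this layout, a time slice maps to one contiguous byte range, which is what makes the partial read cheap in this sketch.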

Vault consumes flashbax buffer states and writes to disk from the last received time index to the buffer's current time index. This must be done before the ring buffer overwrites any data that has not yet been written to the vault. All other bookkeeping is handled by vault itself, so using it is as simple as: v.write(buffer_state)
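
The bookkeeping described above can be sketched roughly as follows. This is a toy model under stated assumptions, not the code in flashbax/vault/vault.py: `ToyBufferState`, `ToyVault`, and `add` are hypothetical names, the buffer has a single batch row, and the "disk" is just a Python list. The sketch only tracks the last buffer index it has seen and, on each write, copies everything between that index and the buffer's current index, handling wrap-around of the ring buffer.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBufferState:
    """Hypothetical stand-in for a flashbax buffer state (one batch row)."""
    data: list
    current_index: int = 0  # next slot to be written; wraps around


def add(state, value):
    """Write one timestep into the ring buffer, overwriting the oldest slot."""
    state.data[state.current_index] = value
    state.current_index = (state.current_index + 1) % len(state.data)


@dataclass
class ToyVault:
    """Hypothetical stand-in for Vault: append-only storage along time."""
    stored: list = field(default_factory=list)
    last_index: int = 0  # buffer index up to which data has been persisted

    def write(self, state):
        start, stop = self.last_index, state.current_index
        if stop > start:
            new = state.data[start:stop]
        elif stop < start:
            # The ring buffer wrapped: take the tail, then the head.
            new = state.data[start:] + state.data[:stop]
        else:
            # Nothing new (this toy cannot tell a full lap from no change).
            new = []
        self.stored.extend(new)
        self.last_index = stop
        return len(new)


state = ToyBufferState(data=[0.0] * 5)
v = ToyVault()
for t in range(1, 9):
    add(state, float(t))
    v.write(state)

print(state.data)
print(v.stored)
```

In this toy, writing only after the buffer has advanced by a full lap or more would silently drop timesteps, which is the reason the write has to happen before unwritten data is overwritten.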

The demonstration notebook illustrates this: it adds timesteps to the buffer one at a time, writing to the vault after each one, and yields the following output:

------------------
Buffer state:
[[[0.]
  [0.]
  [0.]
  [0.]
  [0.]]]

Vault state:
[]
------------------
------------------
Buffer state:
[[[1.]
  [0.]
  [0.]
  [0.]
  [0.]]]

Vault state:
[[[1.]]]
------------------
------------------
Buffer state:
[[[1.]
  [2.]
  [0.]
  [0.]
  [0.]]]

Vault state:
[[[1.]
  [2.]]]
------------------
------------------
Buffer state:
[[[1.]
  [2.]
  [3.]
  [0.]
  [0.]]]

Vault state:
[[[1.]
  [2.]
  [3.]]]
------------------
------------------
Buffer state:
[[[1.]
  [2.]
  [3.]
  [4.]
  [0.]]]

Vault state:
[[[1.]
  [2.]
  [3.]
  [4.]]]
------------------
------------------
Buffer state:
[[[1.]
  [2.]
  [3.]
  [4.]
  [5.]]]

Vault state:
[[[1.]
  [2.]
  [3.]
  [4.]
  [5.]]]
------------------
------------------
Buffer state:
[[[6.]
  [2.]
  [3.]
  [4.]
  [5.]]]

Vault state:
[[[1.]
  [2.]
  [3.]
  [4.]
  [5.]
  [6.]]]
------------------
------------------
Buffer state:
[[[6.]
  [7.]
  [3.]
  [4.]
  [5.]]]

Vault state:
[[[1.]
  [2.]
  [3.]
  [4.]
  [5.]
  [6.]
  [7.]]]
------------------
------------------
Buffer state:
[[[6.]
  [7.]
  [8.]
  [4.]
  [5.]]]

Vault state:
[[[1.]
  [2.]
  [3.]
  [4.]
  [5.]
  [6.]
  [7.]
  [8.]]]
------------------

Why?

There are a variety of reasons, but vault is mainly useful for offline RL methods and for saving/loading buffers from checkpoints.

@callumtilbury callumtilbury marked this pull request as ready for review January 17, 2024 15:06
sash-a previously requested changes Jan 24, 2024

@sash-a sash-a left a comment

Minor nitpicks and a suggestion for the list issue. Review still in progress, though.

@EdanToledo EdanToledo left a comment

Looks great to me! Thank you so much, Callum, for your hard work and the time you have dedicated to supporting the offline RL community. They will be very appreciative and thankful.

@callumtilbury callumtilbury dismissed sash-a’s stale review February 5, 2024 15:58

Stale & now Sasha is on leave :')

@callumtilbury callumtilbury merged commit 21ce0b0 into main Feb 5, 2024
3 checks passed
@callumtilbury callumtilbury deleted the feat/vault branch February 5, 2024 15:59
3 participants