ETS backend write-ahead log #554

Merged
merged 12 commits into from
Feb 7, 2024


scohen
Collaborator

@scohen scohen commented Jan 16, 2024

Our store was using :ets.tab2file to write its entries to the filesystem, which would require writing the entire index every time it was called. This is problematic, as updates to the index come in whenever code is added or removed. This would also cause a lot of contention during startup because of the amount of data being read off the filesystem.

This branch introduces a write-ahead log (WAL), using Erlang's built-in disk_log module. The idea is that the WAL writes operations to disk before they're applied to ETS, so that when we start up, we first read a checkpoint, then apply the logged operations one by one to get back to the prior state.
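
For readers who haven't worked with a WAL, the checkpoint-plus-replay idea described above can be sketched roughly as follows. This is a minimal, language-neutral illustration in Python, not the code in this branch: the `WalStore` class, the JSON file formats, and the file names are all assumptions made for the example, and a plain dict stands in for the ETS table.

```python
import json
import os
import tempfile


class WalStore:
    """Toy key/value store backed by a checkpoint file and a write-ahead log.

    Hypothetical sketch: names and on-disk formats are illustrative only.
    """

    def __init__(self, directory):
        self.checkpoint_path = os.path.join(directory, "checkpoint.json")
        self.wal_path = os.path.join(directory, "updates.wal")
        self.table = {}  # stands in for the in-memory (ETS) table
        self._recover()
        self.wal = open(self.wal_path, "a", encoding="utf-8")

    def _recover(self):
        # 1. Load the last checkpoint, if one exists.
        if os.path.exists(self.checkpoint_path):
            with open(self.checkpoint_path, encoding="utf-8") as f:
                self.table = json.load(f)
        # 2. Replay the logged operations one by one, in write order.
        if os.path.exists(self.wal_path):
            with open(self.wal_path, encoding="utf-8") as f:
                for line in f:
                    self._apply(json.loads(line))

    def _apply(self, op):
        if op["type"] == "put":
            self.table[op["key"]] = op["value"]
        elif op["type"] == "delete":
            self.table.pop(op["key"], None)

    def _log_then_apply(self, op):
        # Append the operation to disk first, then mutate memory.
        self.wal.write(json.dumps(op) + "\n")
        self.wal.flush()
        os.fsync(self.wal.fileno())
        self._apply(op)

    def put(self, key, value):
        self._log_then_apply({"type": "put", "key": key, "value": value})

    def delete(self, key):
        self._log_then_apply({"type": "delete", "key": key})

    def checkpoint(self):
        # Write the whole table to a temp file, swap it in atomically,
        # then truncate the log, since its operations are now in the checkpoint.
        tmp = self.checkpoint_path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(self.table, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.checkpoint_path)
        self.wal.close()
        self.wal = open(self.wal_path, "w", encoding="utf-8")

    def close(self):
        self.wal.close()


if __name__ == "__main__":
    directory = tempfile.mkdtemp()
    store = WalStore(directory)
    store.put("a", 1)
    store.checkpoint()
    store.put("b", 2)   # only in the log, not in the checkpoint
    store.close()
    print(WalStore(directory).table)
```

The point of the sketch is the cost profile: each update appends one small record instead of rewriting the whole table, and the full write only happens at checkpoint time.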

Note that this approach is an alternative to using Khepri. Khepri is very nice, but it's significantly slower than this approach and uses a lot more memory. It's also worth noting that the WAL-based approach's memory efficiency can be improved by removing the need to shuffle all of the entries around between processes, but that's future work.

@scohen
Collaborator Author

scohen commented Jan 16, 2024

Implements #538

@scohen scohen changed the title from "WAL backed store" to "ETS backend write-ahead log" Jan 17, 2024
Collaborator

@zachallaun zachallaun left a comment

I like this! I was actually just thinking we could use a WAL for this the other day.

Base automatically changed from namespace-config to main January 17, 2024 19:16
@scohen scohen requested a review from zachallaun January 17, 2024 21:34
Collaborator

@zachallaun zachallaun left a comment

Can't look into the failing tests right now but the changes look good.

@scohen scohen force-pushed the disk_log branch 7 times, most recently from 4a5c5fe to f90d2d5 Compare January 19, 2024 16:49
@scohen scohen mentioned this pull request Jan 20, 2024
scohen added 10 commits February 7, 2024 10:57
We were calling start_supervised, which meant the store wasn't around
to do cleanup. Instead, I added a destroy_all function which blows
away all of a backend's indexes, but doesn't rely on the store or
backend being up. This seems to help a bit.
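
The cleanup approach in the commit message above, removing index files directly rather than asking a running store to do it, could look roughly like this. It is a hedged Python sketch only: the real `destroy_all` lives in the project's own codebase and its signature and directory layout are not shown here, so the `index_root` parameter and the "one entry per index" layout are assumptions.

```python
import shutil
import tempfile
from pathlib import Path


def destroy_all(index_root):
    """Delete every on-disk index under index_root.

    Hypothetical helper: it touches only the filesystem, so it works
    even when no store or backend process is running.
    Returns the number of top-level entries removed.
    """
    root = Path(index_root)
    if not root.exists():
        return 0  # nothing to clean up

    removed = 0
    for child in root.iterdir():
        if child.is_dir():
            shutil.rmtree(child)  # an index stored as a directory
        else:
            child.unlink()        # an index stored as a single file
        removed += 1
    return removed


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    (root / "project_a").mkdir()
    (root / "project_a" / "updates.wal").write_text("")
    (root / "stale.checkpoint").write_text("")
    print(destroy_all(root))
```

Because the function depends on nothing but a path, a test teardown can call it regardless of whether the store was ever started.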
@scohen scohen merged commit 5cae0eb into main Feb 7, 2024
9 checks passed
@scohen scohen deleted the disk_log branch February 7, 2024 19:24