core: Provide read access to commit/message log and odb #265

kim · 2023-09-01T13:24:45Z

Description of Changes

This is a first approximation to provide data access for Kafka-style replication. It punts on notifications of segment rotation (which could also be solved via filesystem events).

ObjectDB access is fairly crude, and will likely require its own replication subsystem.

~~Note that this is against cloud-next.~~

API and ABI

This is a breaking change to the module ABI
This is a breaking change to the module API
This is a breaking change to the ClientAPI
This is a breaking change to the SDK API

If the API is breaking, please state below what will break

kim · 2023-09-05T06:26:02Z

This will probably do, so I rebased it onto master.

cloutiertyler

More or less seems okay, although I have some important questions inline.

cloutiertyler · 2023-09-09T07:30:10Z

crates/core/src/db/message_log.rs

+    /// The iterator represents a _snapshot_ of the log at the time this method
+    /// is called. That is, segments created after the method returns will not
+    /// appear in the iteration. The last segment yielded by the iterator may be
+    /// incomplete (i.e. still be appended to).


These doc comments are A+ 🙏
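
For readers skimming the thread, here is a minimal sketch of the snapshot semantics that doc comment describes. It does not use the real `MessageLog`; `ToyLog`, `rotate` and `segments_snapshot` are hypothetical names made up for illustration, and the actual implementation may track segments differently.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the message log: it only tracks the start
/// offsets of its segments.
#[derive(Default)]
struct ToyLog {
    segment_offsets: Vec<u64>,
}

impl ToyLog {
    /// Start a new segment at `offset`.
    fn rotate(&mut self, offset: u64) {
        self.segment_offsets.push(offset);
    }
}

/// Copy the list of segments while holding the lock, then release it.
/// Segments created after this function returns are not part of the
/// returned iterator.
fn segments_snapshot(log: &Arc<Mutex<ToyLog>>) -> std::vec::IntoIter<u64> {
    let snapshot = log.lock().unwrap().segment_offsets.clone();
    snapshot.into_iter()
}
```

The copy is what makes it a snapshot: the iterator never touches the lock again, so later rotations are invisible to it.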

cloutiertyler · 2023-09-09T07:37:58Z

crates/core/src/db/commit_log.rs

+
+/// A read-only view of a [`CommitLog`].
+pub struct CommitLogView {
+    mlog: Option<Arc<Mutex<MessageLog>>>,


I might suggest implementing the CommitLog by giving it a CommitLogView internally, since it does already share the same fields. Hmm that doesn't seem like the right solution either, but I wonder if we can have a clearer abstraction here.

It seems like we are almost trying to create the concept of a "File" (i.e. CommitLogView) and a "BufferedFile" (i.e. CommitLog), where the buffered part is the unwritten commit, which is only flushed periodically. I wonder if something like that might fit better down the road.

kim · 2023-09-11

The purpose of CommitLogView is for RelationalDB to hand out a view over its internal state in order to serve up log segments. It must hide append, but changing the visibility of append isn't helpful.

CommitLogView could be thought of as a newtype around CommitLog, but internally that'd be copying more data than strictly required. Perhaps if there was an Arc around the whole CommitLog, but I'm always wary about contention.
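
To make the trade-off concrete, here is a rough sketch of a read-only handle that shares the underlying log through an `Arc<Mutex<..>>` and simply has no `append`. All `Toy*` types and their methods are hypothetical; the real `CommitLogView` holds an `Option<Arc<Mutex<MessageLog>>>` as shown in the diff above, and the rest of its fields are not reproduced here.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the real message log.
#[derive(Default)]
struct ToyMessageLog {
    messages: Vec<Vec<u8>>,
}

/// Owner of the log: the only type that can append.
struct ToyCommitLog {
    mlog: Arc<Mutex<ToyMessageLog>>,
}

/// Read-only handle: shares the same log but exposes no `append`.
#[derive(Clone)]
struct ToyCommitLogView {
    mlog: Arc<Mutex<ToyMessageLog>>,
}

impl ToyCommitLog {
    fn new() -> Self {
        Self { mlog: Arc::default() }
    }

    fn append(&self, msg: &[u8]) {
        self.mlog.lock().unwrap().messages.push(msg.to_vec());
    }

    /// Hand out a view. This only clones the `Arc`, not the log data.
    fn view(&self) -> ToyCommitLogView {
        ToyCommitLogView { mlog: Arc::clone(&self.mlog) }
    }
}

impl ToyCommitLogView {
    /// Number of messages currently in the shared log.
    fn len(&self) -> usize {
        self.mlog.lock().unwrap().messages.len()
    }
}
```

Every reader goes through the same mutex as the writer, which is where the contention concern comes from.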

cloutiertyler · 2023-09-09T07:41:57Z

crates/core/src/db/message_log.rs

+/// The underlying file is opened lazily when calling [`SegmentView::try_into_iter`]
+/// or [`SegmentView::try_into_file`].
+#[derive(Clone, Debug)]
+pub struct SegmentView {


Why is this required instead of MessageLogIter? Also presumably we should implement the internals of MessageLogIter with SegmentView because we're duplicating some code that really needs to stay in sync.

kim · 2023-09-11

I was going to ask about MessageLogIter, and perhaps submit a separate patch: the issue is, it has very different semantics. For one, it panics on I/O errors, which isn't great when it's not a matter of database consistency. More importantly, though, segments_for_offset returns the first segment if offset does not fall into the last segment -- the significance of which I don't understand.

So imho this would require writing a number of tests at least, which I wouldn't want to conflate with this patch (which does not change any database internals).

kim · 2023-09-11

> Why is this required instead of MessageLogIter?

It is also not helpful to borrow MessageLog when that value is behind a mutex.

kim · 2023-09-11

Oh and also note that it is useful to yield whole segments (in addition to a stream of messages) to allow consumers which are very far behind to fetch all of them in parallel.
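
A sketch of the two points made in this thread, under stated assumptions: a segment handle that opens its file lazily and returns I/O errors to the caller instead of panicking, and a consumer that fetches whole segments in parallel. `ToySegment` and `fetch_all` are hypothetical; only the method name `try_into_file` is borrowed from the `SegmentView` doc comment above, and the real signature may differ.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::PathBuf;
use std::thread;

/// Hypothetical handle to one segment; the file is opened only on demand.
#[derive(Clone, Debug)]
struct ToySegment {
    path: PathBuf,
}

impl ToySegment {
    /// Open the segment file, returning I/O errors to the caller
    /// instead of panicking.
    fn try_into_file(self) -> io::Result<File> {
        File::open(self.path)
    }
}

/// Read every segment to the end, each on its own thread.
/// One failing segment does not abort the others.
fn fetch_all(segments: Vec<ToySegment>) -> Vec<io::Result<Vec<u8>>> {
    let handles: Vec<_> = segments
        .into_iter()
        .map(|segment| {
            thread::spawn(move || -> io::Result<Vec<u8>> {
                let mut file = segment.try_into_file()?;
                let mut buf = Vec::new();
                file.read_to_end(&mut buf)?;
                Ok(buf)
            })
        })
        .collect();
    handles
        .into_iter()
        .map(|handle| handle.join().expect("reader thread panicked"))
        .collect()
}
```

Because each handle owns its path rather than borrowing the log, it can be moved to another thread without holding the log's mutex.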

Also make max segment size configurable, so tests don't have to write >= 1GiB worth of data.

cloutiertyler

LGTM ultimately, although I'd like to think about a different philosophy on this

kim requested review from kulakowski and cloutiertyler September 1, 2023 13:24

kim force-pushed the kim/log-access branch from dec9b93 to 6b27425 Compare September 5, 2023 06:24

kim changed the base branch from kim/cloud-next to master September 5, 2023 06:25

cloutiertyler reviewed Sep 9, 2023

kim added 5 commits September 11, 2023 08:31

Merge remote-tracking branch 'origin/master' into kim/log-access

bd30d66

fixup! Merge remote-tracking branch 'origin/master' into kim/log-access

fa71fbc

Add some tests

89b65ce

Also make max segment size configurable, so tests don't have to write >= 1GiB worth of data.
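
Roughly what a configurable segment size buys the tests, as a sketch only: `ToyLogOptions`, `max_segment_size`, `with_options` and the 1 GiB default are assumptions for illustration, not the names or values used in this commit.

```rust
/// Hypothetical options struct; field name and default are assumptions.
#[derive(Clone, Copy, Debug)]
struct ToyLogOptions {
    /// Size in bytes beyond which a new segment is started.
    max_segment_size: u64,
}

impl Default for ToyLogOptions {
    fn default() -> Self {
        // Assumed default of 1 GiB.
        Self { max_segment_size: 1024 * 1024 * 1024 }
    }
}

/// Hypothetical log that only tracks the size of each segment.
struct ToyLog {
    opts: ToyLogOptions,
    segment_sizes: Vec<u64>,
}

impl ToyLog {
    fn with_options(opts: ToyLogOptions) -> Self {
        Self { opts, segment_sizes: vec![0] }
    }

    /// Account for a message of `len` bytes, rotating to a new segment
    /// if the current non-empty one would exceed the configured maximum.
    fn append(&mut self, len: u64) {
        let last = *self.segment_sizes.last().expect("at least one segment");
        if last > 0 && last + len > self.opts.max_segment_size {
            self.segment_sizes.push(0);
        }
        *self.segment_sizes.last_mut().expect("at least one segment") += len;
    }
}
```

With a maximum of a few bytes, a test can force several rotations by appending a handful of small messages.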

Abide to the Rust naming conventions

f76bd7c

Add test for commit iter

5d5d6aa

kim mentioned this pull request Sep 11, 2023

Remove MessageLogIter, use commit_log::Iter instead #272

Merged

Remove MessageLogIter, use commit_log::Iter instead (#272)

ba98666

cloutiertyler approved these changes Oct 3, 2023

cloutiertyler enabled auto-merge (squash) October 3, 2023 22:00

cloutiertyler merged commit e8aed85 into master Oct 3, 2023
5 checks passed
