Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add Window and SharedPointerWindow data structures #461

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

dnut
Copy link
Contributor

@dnut dnut commented Dec 19, 2024

These changes are needed by #459, where they are used to track epoch states.

Previously I used an Lru to track epoch states, but the Lru requires exclusive locks on read operations, which introduces too much contention in a data structure that will be read often by many threads. The data structures in this PR were implemented to handle this use case a bit better. They can be read without an exclusive lock, and are intended to handle the predictable temporal nature of epoch transitions.

Window is the basic data structure that supports the idea of tracking a moving window of values.

SharedPointerWindow is a wrapper for Window that adds two features:

  • thread safety via RwLock.
  • manages the lifetimes of the contained data with reference counting.

SharedPointerWindow could be implemented totally generically, enabling the user to specify any arbitrary internal container type, such as an Lru or a HashMap. But this makes the type a little bit more complex/opaque so I haven't implemented it generically. This could be done in the future if the behavior is actually needed to wrap multiple different underlying data containers.

@dnut dnut requested a review from yewman December 19, 2024 12:32
@dnut dnut marked this pull request as ready for review December 19, 2024 12:32
Base automatically changed from dnut/refactor/rc to main December 20, 2024 21:02
Copy link
Contributor

@yewman yewman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, a couple of minor comments. Feel free to resolve them if you disagree as I do not feel strongly about either comment, just thought I'd share my view

allocator: Allocator,
window: Window(Rc(T)),
center: std.atomic.Value(usize),
lock: std.Thread.RwLock = .{},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless explicitly required I prefer the use of a RwLock wrapper around the necessary data which in this instance is window as far as I understand. I understand this is a matter of taste to some extent though so am ok with leaving it as is if you feel more strongly about it.

Copy link
Contributor Author

@dnut dnut Dec 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree and I would argue that I am already using the pattern that you are advocating. SharedWindowPointer is the locking wrapper struct which wraps the Window type. This struct exists to serve as an alternative to RwMux that provides stronger guarantees about synchronization correctness.

The PR description discusses this somewhat. I feel that this struct implements the general pattern of read-write-locking a type that has a map-style api.

Layering on the RwMux struct feels redundant to me in this case. It add an extra layer, when this struct is supposed to serve the same role. RwMux doesn't actually abstract away any complexity from this struct.

/// Inserts the item into the Window, as long as its index
/// is within the current allowed bounds of the Window.
pub fn put(self: *Self, index: usize, item: T) error{OutOfBounds}!?T {
if (!self.isInRange(index)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: tend to prefer single line if it fits within out col limits but not that fussed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typically I'll agree with collapsing if-statements when they can fit on a single line, but if it's a return or a break, I give it a second thought. This was a conscious decision to break it into multiple lines.

I feel that the exit points of a function are critical to understand how the function works, and they should be given extra emphasis compared to other code. Putting it at the end of the line makes it less obvious. I find it easier to comprehend a function's overall flow when all the return statements are isolated on their own lines.

I think the benefit of reducing lines of code is not worth the clarity tradeoff, unless there are many instances of the same pattern, and the repetition makes the pattern obvious. For example, if the function starts with five lines of code that all end in orelse return null, then I think it's overkill to allocate extra lines of code for that. But if it's just a single return, I try to highlight it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 👀 In review
Development

Successfully merging this pull request may close these issues.

2 participants