feat(Gossip KV): Merge Gossip KV into main. #1535

Merged
merged 36 commits into from
Nov 5, 2024

Conversation

rohitkulshreshtha
Contributor

Some TODOs remain in the code; I have created issues for them under the 0.10 milestone, prioritized as follows:

P0 - Must Have for 0.10
P1 - Good to have / stretch goals
P2 - Low Priority

Closes #1530.

rohitkulshreshtha and others added 30 commits June 20, 2024 12:51
Included in this commit:
* Elementary Kubernetes YAMLs
* Instructions in `README.md` to install, build images, and deploy to minikube
* An overall `README.md` (extremely rough draft) describing the architecture of the system
… store. (#1246)

The data model supported by the layer 0 key-value store is analogous to
the following JSON:

```json
{
   "sys": {
      "members": {
         "key": "value"
      }
   },
   "usr": {
      "table_a": {
         ...
      },
      "table_b": {
         ...
      }
   }
}
```

A single `TableMap` can model either `sys` or `usr`. We're only planning for
these two namespaces for now, so there is no higher-level lattice that would
allow for more namespaces.

I initially supported deleting an entire table efficiently, but I have
decided to drop that until it is needed. If it turns out not to be needed
in practice, the absence of a sharp operation that can delete an entire
table makes for a safer system. Dropping it also simplified the code a
fair bit.
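
To make the shape of the data model concrete, here is a minimal sketch using plain `HashMap`s. It is not the `TableMap` lattice from this PR: the names `TableMapSketch` and `Namespaces` are hypothetical, and lattice merge behavior is left out entirely. It only illustrates the namespace / table / row nesting and the fact that rows, not whole tables, are the unit of deletion.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one namespace (`sys` or `usr`):
/// table name -> row key -> value. Not the PR's `TableMap` type.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct TableMapSketch {
    tables: HashMap<String, HashMap<String, String>>,
}

impl TableMapSketch {
    /// Insert or overwrite a single row, creating the table on first use.
    pub fn set(&mut self, table: &str, row: &str, value: &str) {
        self.tables
            .entry(table.to_owned())
            .or_default()
            .insert(row.to_owned(), value.to_owned());
    }

    /// Look up a single row.
    pub fn get(&self, table: &str, row: &str) -> Option<&str> {
        self.tables.get(table)?.get(row).map(String::as_str)
    }

    /// Delete a single row and return its old value, if any.
    /// There is deliberately no "delete whole table" operation here.
    pub fn delete(&mut self, table: &str, row: &str) -> Option<String> {
        self.tables.get_mut(table)?.remove(row)
    }
}

/// The two namespaces from the JSON analogy (assumed layout).
#[derive(Debug, Default)]
pub struct Namespaces {
    pub sys: TableMapSketch,
    pub usr: TableMapSketch,
}
```

With this layout, `ns.sys.set("members", "key", "value")` followed by `ns.sys.get("members", "key")` yields `Some("value")`.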

The tests may look like they only exercise lattice merging (and are
therefore redundant), but I wrote them to convince myself that the type
signatures of the data model work. I may remove them in the
future (or not!).
With this change, the water flows through the pipes: Gets, Sets, and
Deletes mostly work. Some minor issues remain, but I'll address those
separately.

Next steps:
#1253
#1257
#1256
#1255
#1254
It was not the most critical thing to get in, but it bothered me a bit.
This code change allows '/' to be used in table and row names. This
doesn't affect memory layout in any way.

The most critical part of this change is the additional tests to check
for empty table names and row keys.
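
The rule described above (any character, including '/', is allowed, but empty table names and empty row keys are rejected) can be sketched as follows. `RowAddress` and `NameError` are hypothetical names for illustration; the PR's actual key type and error handling may differ.

```rust
/// Hypothetical error type for rejected names.
#[derive(Debug, PartialEq, Eq)]
pub enum NameError {
    EmptyTableName,
    EmptyRowKey,
}

/// Hypothetical address of one row: a table name plus a row key.
#[derive(Debug, PartialEq, Eq)]
pub struct RowAddress {
    pub table: String,
    pub row: String,
}

impl RowAddress {
    /// Accepts any non-empty names, including ones containing '/'.
    pub fn new(table: &str, row: &str) -> Result<Self, NameError> {
        if table.is_empty() {
            return Err(NameError::EmptyTableName);
        }
        if row.is_empty() {
            return Err(NameError::EmptyRowKey);
        }
        Ok(Self {
            table: table.to_owned(),
            row: row.to_owned(),
        })
    }
}
```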
…tiple values for concurrent writes) (#1275)

These changes separate the server logic from the networking logic,
allowing the server logic to be tested with high fidelity in a unit
test. The changes allow for an abstract replacement of `SocketAddr`,
which can be used for deterministic testing of more complex scenarios in
unit tests.
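
One way to read "an abstract replacement of `SocketAddr`" is to make the server logic generic over its address type. The sketch below assumes that approach; the `Address` trait and `EchoServer` are hypothetical and much simpler than the real server.

```rust
use std::collections::VecDeque;
use std::fmt::Debug;
use std::hash::Hash;

/// Anything usable as a peer address. `std::net::SocketAddr` satisfies
/// these bounds, and so does a plain integer in a unit test.
pub trait Address: Clone + Debug + Eq + Hash {}
impl<T: Clone + Debug + Eq + Hash> Address for T {}

/// Hypothetical server logic that never touches a socket: it takes a
/// (sender, request) pair and queues a (recipient, response) pair.
pub struct EchoServer<A: Address> {
    pub outbox: VecDeque<(A, String)>,
}

impl<A: Address> EchoServer<A> {
    pub fn new() -> Self {
        Self {
            outbox: VecDeque::new(),
        }
    }

    /// Handle one request; the reply is addressed back to the sender.
    pub fn handle(&mut self, from: A, request: &str) {
        self.outbox.push_back((from, format!("ack:{}", request)));
    }
}
```

A unit test can then instantiate `EchoServer<u32>` and inspect `outbox` directly, while the networked binary would use `EchoServer<std::net::SocketAddr>`.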

The first feature to use these testing capabilities is concurrent writes
to the same key at the same tick, which should result in multiple values
being returned from a `Get`.
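
The multi-value behavior can be illustrated with a small merge rule: a newer tick replaces the stored values, an equal tick unions them, and an older tick is ignored. This is a self-contained sketch under that assumption; `RowValue` is a hypothetical type, not the lattice used in the PR.

```rust
use std::cmp::Ordering;
use std::collections::BTreeSet;

/// Hypothetical row value: the tick of the latest write and every value
/// written at that tick.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RowValue {
    pub tick: u64,
    pub values: BTreeSet<String>,
}

impl RowValue {
    pub fn new(tick: u64, value: &str) -> Self {
        let mut values = BTreeSet::new();
        values.insert(value.to_owned());
        Self { tick, values }
    }

    /// Merge another write into this one.
    pub fn merge(&mut self, other: RowValue) {
        match other.tick.cmp(&self.tick) {
            // A strictly newer write wins outright.
            Ordering::Greater => *self = other,
            // Concurrent writes at the same tick are all kept.
            Ordering::Equal => self.values.extend(other.values),
            // A stale write is ignored.
            Ordering::Less => {}
        }
    }
}
```

Under this rule, merging writes `"a"` and `"b"` made at the same tick leaves both values in the row, so a `Get` would return both.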
* Added support for hierarchical configuration using the `config` crate and
used it to set seed node information
* Some cleanup of types and interfaces
This change implements a blind gossip protocol where the gossipers don't
receive (or care to receive) responses from the systems they gossip
with.

It also adds a deterministic testing framework for testing gossip.

Once request-reply is added, it can stop being blind.
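
A blind round can be sketched as "send what you know and never wait for an answer". The example below is an assumption-laden illustration, not the PR's protocol: `BlindNode`, `GossipMsg`, and `blind_round` are hypothetical, and peers are chosen in a fixed ring instead of at random so that the outcome is deterministic.

```rust
use std::collections::BTreeSet;

/// A fire-and-forget message: the sender never expects a reply.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct GossipMsg {
    pub to: usize,
    pub keys: BTreeSet<String>,
}

/// Hypothetical node state: just the set of keys the node knows about.
#[derive(Debug, Default, Clone)]
pub struct BlindNode {
    pub known: BTreeSet<String>,
}

/// One blind round: every node sends everything it knows to the next node
/// in a ring (a deterministic stand-in for random peer choice), then all
/// messages are delivered. No responses flow back to the senders.
pub fn blind_round(nodes: &mut [BlindNode]) {
    let n = nodes.len();
    if n < 2 {
        return; // nobody to gossip with
    }
    // Build all outgoing messages from a snapshot of the current state.
    let outgoing: Vec<GossipMsg> = nodes
        .iter()
        .enumerate()
        .map(|(i, node)| GossipMsg {
            to: (i + 1) % n,
            keys: node.known.clone(),
        })
        .collect();
    // Deliver them; receivers merge and send nothing back.
    for msg in outgoing {
        nodes[msg.to].known.extend(msg.keys);
    }
}
```

With three nodes and a key known only to node 0, the key reaches node 1 after one round and node 2 after two.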

Co-authored-by: Mingwei Samuel <mingwei.samuel@gmail.com>
This ensures that an infected node continues to infect other nodes until
it loses "interest" (because the nodes it contacted are already
infected).
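
The "loss of interest" rule might look like the sketch below. The commit message does not say how a sender learns that a peer was already infected, so this example simply assumes the sender can observe it as the return value of `receive`. `RumorNode` and `push_rumors` are hypothetical names.

```rust
use std::collections::BTreeSet;

#[derive(Debug, Default)]
pub struct RumorNode {
    /// Rumors this node has seen.
    pub seen: BTreeSet<String>,
    /// Rumors this node is still actively spreading.
    pub hot: BTreeSet<String>,
}

impl RumorNode {
    /// Receive a rumor. Returns true if it was new to this node, in which
    /// case the node becomes infected and starts spreading it.
    pub fn receive(&mut self, rumor: &str) -> bool {
        let is_new = self.seen.insert(rumor.to_owned());
        if is_new {
            self.hot.insert(rumor.to_owned());
        }
        is_new
    }
}

/// `from` pushes every hot rumor to `to`. Any rumor that `to` had already
/// seen goes cold on `from` (assumed feedback, see the note above).
pub fn push_rumors(nodes: &mut [RumorNode], from: usize, to: usize) {
    let hot: Vec<String> = nodes[from].hot.iter().cloned().collect();
    for rumor in hot {
        let was_new = nodes[to].receive(&rumor);
        if !was_new {
            nodes[from].hot.remove(&rumor);
        }
    }
}
```

Here node 0 keeps the rumor hot after infecting node 1, and drops it only once it contacts a node that already has it.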
# Conflicts:
#	hydroflow_lang/src/graph/ops/state.rs
When I started, I was using the published Hydroflow. I have since
switched to using local hydroflow, which means the Dockerfile needs to
be updated for the build to work correctly. This posed a more
significant challenge than expected, primarily because of disk space
issues, which have been sorted out.
Summary of Changes:
1. Some cleanup/bug fixes with docker files.
2. Install `dig` on CLI to test DNS
3. Fixed gossip ports everywhere - using the wrong port was annoying.
4. Cluster tear-down is much faster - reduced pod termination wait from
30s (default) to 5s.
5. New static config for three servers - not planning to automate this
now.
6. Gossip trigger timer wired in - this was earlier separated out for
deterministic testing.
7. Not randomizing host-names at the moment - this was making testing
difficult, although I think this does help in production with restarts
and such.
8. Gossip assumed > 0 available peers - fixed that issue.
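
For item 8, the general fix pattern is to make peer selection return an `Option` so that an empty peer list is handled explicitly. This is a sketch of that pattern only; `pick_peer` is a hypothetical helper, and round-robin selection stands in for whatever selection the real code uses.

```rust
/// Pick the peer to gossip with in a given round, or `None` when no peers
/// are available (the case that was previously assumed never to happen).
pub fn pick_peer(peers: &[String], round: usize) -> Option<&String> {
    if peers.is_empty() {
        return None; // avoids `round % 0`, which would panic
    }
    peers.get(round % peers.len())
}
```

A caller can then skip the gossip round when `pick_peer` returns `None` instead of panicking.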
Remove extra target which exists by two names.
I created a load-testing binary with an in-process network. It was deployed
and tested with five threads at one request per second per thread.
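
The load pattern described above (N threads, each issuing requests at a fixed interval over an in-process channel) can be sketched with the standard library alone. `run_load` is a hypothetical function, and an `mpsc` channel stands in for the in-process network; the real binary drives the actual server.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Spawn `threads` workers; each sends `requests_per_thread` requests over
/// an in-process channel, pausing `interval` after each one. Returns the
/// total number of requests received.
pub fn run_load(threads: usize, requests_per_thread: usize, interval: Duration) -> usize {
    let (tx, rx) = mpsc::channel::<(usize, usize)>();
    let handles: Vec<_> = (0..threads)
        .map(|id| {
            let tx = tx.clone();
            thread::spawn(move || {
                for seq in 0..requests_per_thread {
                    tx.send((id, seq)).expect("receiver is alive");
                    thread::sleep(interval);
                }
            })
        })
        .collect();
    // Drop the original sender so the receive loop ends once all workers finish.
    drop(tx);
    let received = rx.iter().count();
    for handle in handles {
        handle.join().expect("worker panicked");
    }
    received
}
```

The setup from the commit message would correspond to `run_load(5, n, Duration::from_secs(1))` for some request count `n`.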
@rohitkulshreshtha added the Datastores/Gossip-KV label (Pertaining to Gossip KV datastore) on Nov 5, 2024
@rohitkulshreshtha added this to the 0.10 milestone on Nov 5, 2024

cloudflare-workers-and-pages bot commented Nov 5, 2024

Deploying hydroflow with Cloudflare Pages

Latest commit: 8a5890e
Status: ✅  Deploy successful!
Preview URL: https://cdb26d63.hydroflow.pages.dev
Branch Preview URL: https://1533-gossip-kv-remove-code-d.hydroflow.pages.dev

Member

@MingweiSamuel left a comment

🎉🎉🎉

@rohitkulshreshtha merged commit 3143bf5 into main on Nov 5, 2024
15 checks passed
@rohitkulshreshtha deleted the anna_v2 branch on November 5, 2024 19:11
Labels
Datastores/Gossip-KV Pertaining to Gossip KV datastore
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Gossip KV: Merge into main
2 participants