Skip to content

Solutions for the Distributed System Challenges from Fly.io and Kyle Kingsbury

Notifications You must be signed in to change notification settings

fersilva16/gossip-glomers

Repository files navigation

Gossip Glomers

Solutions for the Distributed System Challenges from Fly.io and Kyle Kingsbury

Solutions

Hello world! Just echoes back what you send it.

solution / tests

Generates a unique ID for each node. Inspired by MongoDB's ObjectId.

Each ID is a concatenation of:

  • Node ID: to avoid collisions with other nodes
  • Current time: to avoid collisions with messages sent at the same time
  • Count: to avoid collisions with messages sent at the same time and node

solution / tests

A basic node that receives messages, saves them to a slice, and returns on the read message.

solution / tests

Same as the previous challenge, but now it broadcast the message to all neighbors.

solution / tests

Node spawns a new thread for each message sent to a neighbor, and exponentially backoff if the neighbor doesn't respond.

This solution is pretty dumb and not the most efficient one, here's other ideas of how it could be done:

  • Use a map to keep track of which messages are not yet acknowledged and retry them on the next messages
  • Use a separated thread with a ticker to periodically distribute the messages to neighbors

solution / tests

The node receiving the message broadcasts to all the other nodes in network with a send-and-forget approach with the gossip message type.

Results from a run:

All checks passed
  Messages per op: 11.789474/30
  Median latency: 84/400
  Maximum latency: 105/600

solution / tests

The previous solution also works for this challenge, but I've taken a step further and tried to make it send as less messages per operation as possible.

It's the same idea as #3d, but it spawns a new thread that broadcasts the messages every 1.5s with a buffered channel and only broadcasts if there are new messages.

Results:

All checks passed
  Messages per op: 4.728111/20
  Median latency: 791/1000
  Maximum latency: 1584/2000

solution / tests

I've made 2 solutions for this challenge, one that uses a OT approach and another with a CRDT solution.

OT (Operational Transformation) solution uses the Sequential KV store from Maelstrom and tries to Compare and Swap the value of the counter.

solution / tests

CRDT (Conflict-free Replicated Data Type) solution uses a map to store the counter of the other nodes and propagates its own when receives an add message.

solution / tests

Links:

solution / tests

Project Structure

This repo is a Go workspace where each solution is in a separate module.

Each solution is in a folder with the main.go and main_test.go for tests.

Tests

All solutions have automated tests using custom utils as a way to verify and play with the solutions, and to learn how to write tests in Go.

Running

This project uses Nix for packages and Taskfile to run and build the solutions.

To run the solutions, use the run-* tasks:

task run-all
task run-uniqueids
task run-broadcast-efficient-ii1 # Will run and check the result output

To run the tests, use the test-* tasks:

task test-all
task test-uniqueids
task test-broadcast-efficient-ii

About

Solutions for the Distributed System Challenges from Fly.io and Kyle Kingsbury

Topics

Resources

Stars

Watchers

Forks