diff --git a/README.md b/README.md
index b06d2238ab6..0b7b9ac739e 100644
--- a/README.md
+++ b/README.md
@@ -6,9 +6,9 @@
-Hydro is a novel distributed programming library for standard Rust. Hydro allows developers to build distributed systems that are efficient, scalable, and correct.
+Hydro is a high-level distributed programming framework for Rust. Hydro can help you quickly write scalable distributed services that are correct by construction. Much like Rust helps with memory safety, Hydro helps with [**distributed safety**](https://hydro.run/docs/hydro/correctness.md).
-Hydro integrates naturally into standard Rust constructs and IDEs, providing types and programming constructs for ensuring distributed correctness. Under the covers it provides a metaprogrammed compiler that optimizes for cross-node issues of scaling and data movement while leveraging Rust and LLVM for per-node performance.
+Hydro integrates naturally into standard Rust constructs and IDEs, providing types and programming constructs for ensuring distributed safety. Under the covers it provides a metaprogrammed compiler that optimizes for cross-node issues of scaling and data movement while leveraging Rust and LLVM for per-node performance.
We often describe Hydro via a metaphor: *LLVM for the cloud*. Like LLVM, Hydro is a layered compilation framework with a low-level Internal Representation language. In contrast to LLVM, Hydro focuses on distributed aspects of modern software.
@@ -17,10 +17,10 @@ We often describe Hydro via a metaphor: *LLVM for the cloud*. Like LLVM, Hydro i
-## The Language (and the Low-Level IR)
-Hydro provides a [high-level language](https://hydro.run/docs/hydro) that allows you to program an entire fleet of processes from a single program, and then launch your fleet locally or in the cloud via [Hydro Deploy](https://hydro.run/docs/deploy). Get started with Hydro via the language [documentation](https://hydro.run/docs/hydro) and [examples](https://github.com/hydro-project/hydro/tree/main/hydro_test/examples).
+## The Hydro API (and the Low-Level IR)
+Hydro provides a [high-level API](https://hydro.run/docs/hydro) that allows you to program an entire fleet of processes from a single program, and then launch your fleet locally or in the cloud via [Hydro Deploy](https://hydro.run/docs/deploy). Get started with Hydro via the API [documentation](https://hydro.run/docs/hydro) and [examples](https://github.com/hydro-project/hydro/tree/main/hydro_test/examples).
-> Internally, the Hydro stack compiles Hydro programs into a low-level Dataflow Internal Representation (IR) language called [DFIR](https://hydro.run/docs/dfir); each process corresponds to a separate DFIR program. In rare cases you may want to compose one or more processes in DFIR by hand; see the DFIR [documentation](https://hydro.run/docs/dfir) or [examples](https://github.com/hydro-project/hydro/tree/main/dfir_rs/examples) for details.
+> Internally, the Hydro stack compiles Hydro programs into a low-level single-threaded DataFlow Internal Representation (IR) language called [DFIR](https://hydro.run/docs/dfir); each Hydro process corresponds to a separate DFIR program. In rare cases you may want to hand-author one or more processes in DFIR; see the DFIR [documentation](https://hydro.run/docs/dfir) or [examples](https://github.com/hydro-project/hydro/tree/main/dfir_rs/examples) for details.
## Development Setup
@@ -32,7 +32,7 @@ There have been many frameworks and platforms for distributed programming over t
**Higher level frameworks** have been designed to serve specialized distributed use cases. These including *Client-Server (Monolith)* frameworks (e.g. Ruby on Rails + DBMS), parallel *Bulk Dataflow* frameworks (e.g. Spark, Flink, etc.), and step-wise *Workflows / Pipelines / Serverless / μservice Orchestration* frameworks (e.g. Kafka, Airflow). All of these frameworks offer limited expressibility and are inefficient outside their sweet spot. Each one ties developers' hands in different ways.
**Lower level asynchronous APIs** provide general-purpose distributed interfaces for sequential programming, including
- *RPCs*, *Async/Await* frameworks and *Actor* frameworks (e.g. Akka, Ray, Unison, Orleans, gRPC). These interfaces allow developers to build distributed systems *one async sequential process* at a time. While they offer low-level control of individual processes, they provide minimal help for global correctness of the fleet.
+ *RPCs*, *Async/Await* frameworks and *Actor* frameworks (e.g. Akka, Ray, Unison, Orleans, gRPC). These interfaces allow developers to build distributed systems *one async sequential process* at a time. While they offer low-level control of individual processes, they provide minimal help with ensuring the global correctness of the fleet.
## Towards a more comprehensive approach
What's wanted, we believe, is a proper language stack addressing distributed concerns:
diff --git a/docs/docs/hydro/correctness.md b/docs/docs/hydro/correctness.md
index 69a9981a2a6..e09d4bae807 100644
--- a/docs/docs/hydro/correctness.md
+++ b/docs/docs/hydro/correctness.md
@@ -3,8 +3,8 @@ sidebar_position: 3
---
# Safety and Correctness
-Just like Rust's type system helps you avoid memory safety bugs, Hydro helps you ensure **distributed safety**. Hydro's type systems helps you avoid many kinds of distributed systems bugs, including:
-- Non-determinism due to message delays (which reorder arrival) or retries (which result in duplicates)
+Much like Rust's type system helps ensure memory safety, Hydro helps ensure **distributed safety**. Hydro's type system helps you avoid many kinds of distributed systems bugs, including:
+- Non-determinism due to message delays (which affect arrival order), interleaving across streams (which affect order of handling) or retries (which result in duplicates)
- See [Live Collections / Eventual Determinism](./live-collections/determinism.md)
- Using mismatched serialization and deserialization formats across services
- See [Locations and Networking](./locations/index.md)
diff --git a/docs/docs/hydro/dataflow-programming.mdx b/docs/docs/hydro/dataflow-programming.mdx
index 4fd51a2665a..c787cd61fdd 100644
--- a/docs/docs/hydro/dataflow-programming.mdx
+++ b/docs/docs/hydro/dataflow-programming.mdx
@@ -3,7 +3,7 @@ sidebar_position: 1
---
# Dataflow Programming
-Hydro uses a dataflow programming model, which will be familiar if you have used APIs like Rust iterators. Instead of using RPCs or async/await to describe distributed computation, Hydro instead uses **asynchronous streams**, which represent data arriving over time. Streams can represent a series of asynchronous events (e.g. inbound network requests) or a sequence of data items.
+Hydro uses a dataflow programming model, which will be familiar if you have used APIs like Rust iterators. Instead of using RPCs, async/await or actors to describe distributed computation, Hydro instead uses **asynchronous streams**, which represent data arriving over time. Streams can represent a series of asynchronous events (e.g. inbound network requests) or a sequence of data items.
Programs in Hydro describe how to **transform** streams and other collections of data using operators such as `map` (transforming elements one by one), `fold` (aggregating elements into a single value), or `join` (combining elements from multiple streams on matching keys).
diff --git a/docs/docs/hydro/index.mdx b/docs/docs/hydro/index.mdx
index 5cb770efceb..8b571ffa7c7 100644
--- a/docs/docs/hydro/index.mdx
+++ b/docs/docs/hydro/index.mdx
@@ -3,9 +3,11 @@ sidebar_position: 0
---
# Introduction
-Hydro is a high-level distributed programming framework for Rust powered by the [DFIR runtime](../dfir/index.mdx). Unlike traditional architectures such as actors or RPCs, Hydro offers _choreographic_ APIs, where expressions and functions can describe computation that takes place across many locations. It also integrates with [Hydro Deploy](../deploy/index.md) to make it easy to deploy and run Hydro programs to the cloud.
+Hydro is a high-level distributed programming framework for Rust. Hydro can help you quickly write scalable distributed services that are correct by construction. Much like Rust helps with memory safety, Hydro helps with [**distributed safety**](./correctness.md). Hydro also makes it easy to get started by running your distributed programs in either testing or deployment modes.
-Hydro uses a two-stage compilation approach. Hydro programs are standard Rust programs, which first run on the developer's laptop to generate a _deployment plan_. This plan is then compiled to individual binaries for each machine in the distributed system (enabling zero-overhead abstractions), which are then deployed to the cloud using the generated plan along with specifications of cloud resources.
+Hydro is a distributed dataflow language, powered by the high-performance single-threaded [DFIR runtime](../dfir/index.mdx). Unlike traditional architectures such as actors or RPCs, Hydro offers _choreographic_ APIs, where expressions and functions can describe computation that takes place across many locations. It also integrates with [Hydro Deploy](../deploy/index.md) to make it easy to deploy and run distributed Hydro programs either locally or in the cloud.
+
+Hydro uses a two-stage compilation approach. Hydro programs are standard Rust programs, which first run on the developer's laptop to generate a _deployment plan_. This plan is then compiled to DFIR to generate individual binaries for each machine in the distributed system (enabling zero-overhead abstractions), which are then deployed to the cloud using the generated plan along with specifications of cloud resources.
Hydro has been used to write a variety of high-performance distributed systems, including implementations of classic distributed protocols such as two-phase commit and Paxos. Work is ongoing to develop a distributed systems standard library that will offer these protocols and more as reusable components.
diff --git a/docs/docs/hydro/live-collections/bounded-unbounded.md b/docs/docs/hydro/live-collections/bounded-unbounded.md
index 6e62674e785..51da1afa78c 100644
--- a/docs/docs/hydro/live-collections/bounded-unbounded.md
+++ b/docs/docs/hydro/live-collections/bounded-unbounded.md
@@ -3,7 +3,7 @@ sidebar_position: 0
---
# Bounded and Unbounded Types
-Although live collections can be continually updated, some collection types also support **termination**, after which no additional changes can be made. For example, a live collection created by reading integers from an in-memory `Vec` will become terminated once all the elements of the `Vec` have been loaded. But other live collections, such as one being updated by the network, may never become terminated.
+Although live collections can be continually updated, some collection types also support **termination**, after which no additional changes can be made. For example, a live collection created by reading integers from an in-memory `Vec` will become terminated once all the elements of the `Vec` have been loaded. But other live collections, such as events streaming into a service from a network, may never become terminated.
In Hydro, certain APIs are restricted to only work on collections that are **guaranteed to terminate** (**bounded** collections). All live collections in Hydro have a type parameter (typically named `B`), which tracks whether the collection is bounded (has the type `Bounded`) or unbounded (has the type `Unbounded`). These types are used in the signature of many Hydro APIs to ensure that the API is only called on the appropriate type of collection.
@@ -32,7 +32,7 @@ let input: Singleton<_, _, Bounded> = tick.singleton(q!(0));
let unbounded: Singleton<_, _, Unbounded> = input.into();
```
-Converting from an unbounded collection **to a bounded collection**, however is more complex. This requires cutting off the unbounded collection at a specific point in time, which may not be possible to do deterministically. For example, the most common way to convert an unbounded `Stream` to a bounded one is to batch its elements non-deterministically using `.tick_batch()`.
+Converting from an unbounded collection **to a bounded collection**, however is more complex. This requires cutting off the unbounded collection at a specific point in time, which may not be possible to do deterministically. For example, the most common way to convert an unbounded `Stream` to a bounded one is to batch its elements non-deterministically using `.tick_batch()`. Because this is non-deterministic, we require that it be placed in an `unsafe` block.
```rust,no_run
# use hydro_lang::*;
diff --git a/docs/docs/hydro/live-collections/determinism.md b/docs/docs/hydro/live-collections/determinism.md
index 6f7d51e1bc6..4d02c9c4bcb 100644
--- a/docs/docs/hydro/live-collections/determinism.md
+++ b/docs/docs/hydro/live-collections/determinism.md
@@ -3,7 +3,7 @@ sidebar_position: 1
---
# Eventual Determinism
-Most programs have strong guarantees on **determinism**, the property that when provided the same inputs, the outputs of the program are always the same. Even when the inputs and outputs are live collections, we can focus on the _eventual_ state of the collection (as if we froze the input and waited until the output stops changing).
+Many programs benefit from strong guarantees on **determinism**, the property that when provided the same inputs, the outputs of the program are always the same. This is critical for consistency across *replicated* services. It is also extremely helpful for predictability, including consistency across identical runs (e.g. for testing or reproducibility.) Determinism is particularly tricky to reason about in distributed systems due to the inherently non-deterministic nature of asynchronous event arrivals. However, even when the inputs and outputs of a program are live collections, we can focus on the _eventual_ state of the collection — as if we froze the input and waited until the output stopped changing.
:::info
@@ -11,18 +11,18 @@ Our consistency and safety model is based on the POPL'25 paper [Flo: A Semantic
:::
-Hydro thus guarantees **eventual determinism**: given a set of streaming inputs which have arrived, the outputs of the program will **eventually** have the same _final_ value. This makes it easy to build composable blocks of code without having to worry about runtime behavior such as batching or network delays.
+Hydro thus guarantees **eventual determinism**: given a set of specific live collections as inputs, the outputs of the program will **eventually** have the same _final_ value. Eventual determinism makes it easy to build composable blocks of code without having to worry about non-deterministic runtime behavior such as batching or network delays.
:::note
-Much existing literature in distributed systems focuses on consistency levels such as "eventual consistency" which typically correspond to guarantees when reading the state of a _replicated_ object (or set of objects) at a _specific point_ in time. Hydro does not use such a consistency model internally, instead focusing on the values local to each distributed location _over time_. Concepts such as replication, however, can be layered on top of this model.
+Much existing literature in distributed systems focuses on data consistency levels such as "eventual consistency". These typically correspond to guarantees when reading the state of a _replicated_ object (or set of objects) at a _specific point_ in time. Hydro does not use such a consistency model internally, instead focusing on the values local to each distributed location _over time_. Concepts such as replication, however, can be layered on top of this model.
:::
## Unsafe Operations in Hydro
All **safe** APIs in Hydro (the ones you can call regularly in Rust) guarantee determinism. But often it is necessary to do something non-deterministic, like generate events at a fixed wall-clock-time interval, or split an input into arbitrarily sized batches.
-Hydro offers APIs for such concepts behind an **`unsafe`** guard. This keyword is typically used to mark Rust functions that may not be memory-safe, but we reuse this in Hydro to mark non-deterministic APIs.
+Hydro offers APIs for such concepts behind an **`unsafe`** guard. This keyword is typically used to mark Rust functions that may not be memory-safe, but we reuse this in Hydro to mark APIs with non-deterministic constructs.
To call such an API, the Rust compiler will ask you to wrap the call in an `unsafe` block. It is typically good practice to also include a `// SAFETY: ...` comment to explain why the non-determinism is there.
@@ -39,9 +39,9 @@ unsafe {
}.for_each(q!(|v| println!("Sample: {:?}", v)))
```
-When writing a function with Hydro that involves `unsafe` code, it is important to be extra careful about whether the non-determinism is exposed externally. In some applications, a utility function may involve local non-determinism (such as sending retries), but not expose it outside the function (via deduplication).
+When writing a function with Hydro that involves `unsafe` code, it is important to be extra careful about whether the non-determinism is exposed externally. In some applications, a utility function may involve local non-determinism (such as sending retries), but not expose it outside the function (e.g., via deduplicating received responses).
-But other utilities may expose the non-determinism, in which case they should be marked `unsafe` as well. If the function is public, Rust will require you to put a `# Safety` section in its documentation to explain the non-determinism.
+But other functions may expose the non-determinism, in which case they should be marked `unsafe` as well. If the function is public, Rust will require you to put a `# Safety` section in its documentation to explain the non-determinism.
```rust
# use hydro_lang::*;
@@ -65,7 +65,7 @@ unsafe fn print_samples(
```
## User-Defined Functions
-Another source of potential non-determinism is user-defined functions, such as those provided to `map` or `filter`. Hydro allows for arbitrary Rust functions to be called inside these closures, so it is possible to introduce non-determinism that will not be checked by the compiler.
+Another source of potential non-determinism comes from user-defined functions or closures, such as those provided to `map` or `filter`. Hydro allows for arbitrary Rust functions to be called inside these closures, so it is possible to introduce non-determinism that will not be checked by the compiler.
In general, avoid using APIs like random number generators inside transformation functions unless that non-determinism is explicitly documented somewhere.
diff --git a/docs/docs/hydro/locations/clusters.md b/docs/docs/hydro/locations/clusters.md
index bf1b5224bdc..a644effd20e 100644
--- a/docs/docs/hydro/locations/clusters.md
+++ b/docs/docs/hydro/locations/clusters.md
@@ -3,7 +3,7 @@ sidebar_position: 1
---
# Clusters
-When building scalable distributed systems in Hydro, you'll often need to use **clusters**, which represent groups of threads all running the _same_ piece of your program (SPMD, Single-Program-Multiple-Data). They can be used to implement scale-out systems using techniques such as sharding or replication. Unlike processes, the number of threads in a cluster does not need to be static, and can be chosen during deployment.
+When building scalable distributed systems in Hydro, you'll often need to use **clusters**, which represent groups of threads all running the _same_ piece of your program (Single-Program-Multiple-Data, or "SPMD"). Hydro clusters can be used to implement scale-out systems using techniques such as sharding or replication. Unlike processes, the number of threads in a cluster does not need to be static, and can be chosen during deployment.
Like when creating a process, you can pass in a type parameter to a cluster to distinguish it from other clusters. For example, you can create a cluster with a marker of `Worker` to represent a pool of workers in a distributed system:
@@ -15,7 +15,7 @@ let flow = FlowBuilder::new();
let workers: Cluster = flow.cluster::();
```
-We can then instantiate a live collection on the cluster using the same APIs as for processes. For example, we can create a stream of integers on the worker cluster. If we launch this program, **each** member of the cluster will create a stream containing the elements 1, 2, 3, and 4:
+You can then instantiate a live collection on the cluster using the same APIs as for processes. For example, you can create a stream of integers on the worker cluster. If you launch this program, **each** member of the cluster will create a stream containing the elements 1, 2, 3, and 4:
```rust,no_run
# use hydro_lang::*;
@@ -26,7 +26,7 @@ let numbers = workers.source_iter(q!(vec![1, 2, 3, 4]));
```
## Networking
-When sending a live collection from a cluster to another location, **each** member of the cluster will send its local collection. On the receiver side, these collections will be joined together into a single stream of `(ID, Data)` tuples where the ID uniquely identifies which member of the cluster the data came from. For example, we can send a stream from the worker cluster to another process using the `send_bincode` method:
+When sending a live collection from a cluster to another location, **each** member of the cluster will send its local collection. On the receiver side, these collections will be joined together into a single stream of `(ID, Data)` tuples where the ID uniquely identifies which member of the cluster the data came from. For example, you can send a stream from the worker cluster to another process using the `send_bincode` method:
```rust
# use hydro_lang::*;
diff --git a/docs/docs/hydro/locations/index.md b/docs/docs/hydro/locations/index.md
index 7bfcaadcd65..e73929a01ba 100644
--- a/docs/docs/hydro/locations/index.md
+++ b/docs/docs/hydro/locations/index.md
@@ -1,13 +1,15 @@
# Locations and Networking
-Hydro is a **global**, **distributed** programming model. This means that the data and computation in a Hydro program can be spread across multiple machines, data centers, and even continents. To achieve this, Hydro uses the concept of **locations** to keep track of _where_ data is stored and computation is executed.
+Hydro is a **global**, **distributed** programming model. This means that the data and computation in a Hydro program can be spread across multiple machines, data centers, and even continents. To achieve this, Hydro uses the concept of **locations** to keep track of _where_ data is located and computation is executed.
-Each live collection type (`Stream`, `Singleton`, etc.) has a type parameter `L` which will always be a type that implements the `Location` trait (e.g. `Process` and `Cluster`, documented in this section). Most Hydro APIs that transform live collections will emit a new live collection with the same location type as the input, and APIs that consume multiple live collections will require them all to have the same location type.
+Each live collection type ([`Stream`](https://hydro.run/rustdoc/hydro_lang/stream/struct.Stream), [`Singleton`](https://hydro.run/rustdoc/hydro_lang/singleton/struct.Singleton) or [`Optional`](https://hydro.run/rustdoc/hydro_lang/optional/struct.Optional)) has a type parameter `L` which will always be a type that implements the `Location` trait (e.g. [`Process`](./processes.md) and [`Cluster`](./clusters.md), documented in this section). Computation has to happen at a single place, so Hydro APIs that consume multiple live collections will require all inputs to have the same location type. Moreover, most Hydro APIs that transform live collections will emit a new live collection output with the same location type as the input.
-To create distributed programs, live collections can be sent over the network using a variety of APIs. For example, `Stream`s can be sent from a process to another process using `.send_bincode(&loc2)` (which uses [bincode](https://docs.rs/bincode/latest/bincode/) as a serialization format). The sections for each location type discuss the networking APIs in further detail.
+To create distributed programs, Hydro provides a variety of API calls to allow live collections to be sent over the network. For example, `Stream`s can be sent from one process to another process using `.send_bincode(&loc2)` (which uses [bincode](https://docs.rs/bincode/latest/bincode/) as a serialization format). The sections for each location type ([`Process`](./processes.md), [`Cluster`](./clusters.md)) discuss the networking APIs in further detail.
## Creating Locations
Locations can be created by calling the appropriate method on the global `FlowBuilder` (e.g. `flow.process()` or `flow.cluster()`). These methods will return a handle to the location that can be used to create live collections and run computations.
+
+
:::caution
It is possible to create **different** locations that still have the same type, for example:
diff --git a/docs/docs/hydro/ticks-atomicity/index.md b/docs/docs/hydro/ticks-atomicity/index.md
index 8f0ac9a4bfe..1abbeac3f8d 100644
--- a/docs/docs/hydro/ticks-atomicity/index.md
+++ b/docs/docs/hydro/ticks-atomicity/index.md
@@ -1,12 +1,14 @@
# Ticks and Atomicity
By default, all live collections in Hydro are transformed **asynchronously**, which means that there may be arbitrary delays between when a live collection is updated and when downstream transformations see the updates. This is because Hydro is designed to work in a distributed setting where messages may be delayed. But for some programs, it is necessary to define local iterative loops where transformations are applied atomically; this is achieved with **ticks**.
-## Loops
+## Ticks
In some programs, you may want to process batches or snapshots of a live collection in an iterative manner. For example, in a map-reduce program, it may be helpful to compute aggregations on small local batches of data before sending those intermediate results to a reducer.
-To create such iterative loops, Hydro provides the concept of **ticks**. A **tick** captures the body of an infinite loop running locally to the machine (importantly, this means that ticks define a **logical time** which is not comparable across machines). Ticks are non-deterministically generated, so batching data into ticks is an **unsafe** operation that requires special attention.
+To create and track such iterative loops, Hydro provides the concept of **ticks**. A **tick** captures the execution of the body of an infinite loop running locally to the machine (importantly, this means that ticks define a [**logical time**](https://en.wikipedia.org/wiki/Logical_clock) which is not comparable across machines). Ticks are non-deterministically generated, so batching data into ticks is an **unsafe** operation that requires special attention.
## Atomicity
-In other programs, it is necessary to define an atomic section where a set of transformations are guaranteed to be executed **all at once**. For example, in a transaction processing program, it is important that the transaction is applied **before** an acknowledgment is sent to the client.
+In some cases it is necessary to define an atomic section where a set of transformations are guaranteed to be **executed sequentially without interrupts**. For example, in a transaction processing program, it is important that the transaction is applied **before** an acknowledgment is sent to the client.
-In Hydro, this can be achieved by placing the transaction and acknowledgment in the same atomic **tick**. Hydro guarantees that all the outputs of a tick will be computed before any are released. Importantly, atomic ticks cannot span several locations, since that would require a locking mechanism that has significant performance implications.
+In Hydro, this can be achieved by placing the transaction and acknowledgment in the same atomic **tick**. Hydro guarantees that all the outputs of a tick will be computed before any are released. Importantly, Hydro's built-in atomic ticks cannot span multiple locations.Distributed atomicity requires distributed coordination protocols (e.g. two-phase commit) that can be built in Hydro, but which have significant performance implications and are not provided by default.
+
+