brimdata · philrz · Sep 16, 2024 · Sep 11, 2024 · Sep 13, 2024 · Sep 14, 2024
diff --git a/docs/README.md b/docs/README.md
@@ -45,7 +45,7 @@ same abstract Zed data model.
 * A [Zed lake](commands/zed.md) is a collection of Zed data stored
 across one or more [data pools](commands/zed.md#data-pools) with ACID commit semantics and
 accessed via a [Git](https://git-scm.com/)-like API.
-* The [Zed language](language/README.md) is the system's dataflow language for performing
+* The [Zed language](language/README.md) is the system's pipeline language for performing
 queries, searches, analytics, transformations, or any of the above combined together.
 * A  [Zed query](language/overview.md) is a Zed script that performs
 search and/or analytics.

diff --git a/docs/commands/zed.md b/docs/commands/zed.md
@@ -263,7 +263,7 @@ As pool data is often comprised of Zed records (analogous to JSON objects),
 the pool key is typically a field of the stored records.
 When pool data is not structured as records/objects (e.g., scalar or arrays or other
 non-record types), then the pool key would typically be configured
-as the [special value `this`](../language/dataflow-model.md#the-special-value-this).
+as the [special value `this`](../language/pipeline-model.md#the-special-value-this).
 
 Data can be efficiently scanned if a query has a filter operating on the pool
 key.  For example, on a pool with pool key `ts`, the query `ts == 100`
@@ -396,7 +396,7 @@ The `-orderby` option indicates the pool key that is used to sort
 the data in the lake, which may be in ascending or descending order.
 
 If a pool key is not specified, then it defaults to
-the [special value `this`](../language/dataflow-model.md#the-special-value-this).
+the [special value `this`](../language/pipeline-model.md#the-special-value-this).
 
 A newly created pool is initialized with a branch called `main`.
 

diff --git a/docs/commands/zq.md b/docs/commands/zq.md
@@ -85,7 +85,7 @@ emits
 ```mdtest-output
 2
 ```
-Note here that the query `1+1` [implies](../language/dataflow-model.md#implied-operators)
+Note here that the query `1+1` [implies](../language/pipeline-model.md#implied-operators)
 `yield 1+1`.
 
 ## Input Formats
@@ -478,7 +478,7 @@ If you are ever stumped about how the `zq` compiler is parsing your query,
 you can always run `zq -C` to compile and display your query in canonical form
 without running it.
 This can be especially handy when you are learning the language and
-[its shortcuts](../language/dataflow-model.md#implied-operators).
+[its shortcuts](../language/pipeline-model.md#implied-operators).
 
 For example, this query
 ```mdtest-command

diff --git a/docs/language/README.md b/docs/language/README.md
@@ -2,7 +2,7 @@
 
 The language documents:
 * provide an [overview](overview.md) of the Zed language,
-* describe Zed's [dataflow model](dataflow-model.md),
+* describe Zed's [pipeline model](pipeline-model.md),
 * explain Zed's [data types](data-types.md),
 * show the syntax of [statements](statements.md) that define constants, functions, operators, and named types,
 * describe the syntax of [expressions](expressions.md) and [search expressions](search-expressions.md),

diff --git a/docs/language/expressions.md b/docs/language/expressions.md
@@ -6,9 +6,9 @@ sidebar_label: Expressions
 # Expressions
 
 Zed expressions follow the typical patterns in programming languages.
-Expressions are typically used within data flow operators
+Expressions are typically used within pipeline operators
 to perform computations on input values and are typically evaluated once per each
-input value [`this`](dataflow-model.md#the-special-value-this).
+input value [`this`](pipeline-model.md#the-special-value-this).
 
 For example, `yield`, `where`, `cut`, `put`, `sort` and so forth all take
 various expressions as part of their operation.
@@ -109,7 +109,7 @@ where `<id>` is an identifier representing the field name referenced.
 If a field name is not representable as an identifier, then [indexing](#indexing)
 may be used with a quoted string to represent any valid field name.
 Such field names can be accessed using
-[`this`](dataflow-model.md#the-special-value-this) and an array-style reference, e.g.,
+[`this`](pipeline-model.md#the-special-value-this) and an array-style reference, e.g.,
 `this["field with spaces"]`.
 
 If the dot operator is applied to a value that is not a record
@@ -353,7 +353,7 @@ where a `<spec>` has one of three forms:
 ```
 The first form is a customary colon-separated field and value similar to JavaScript,
 where `<field>` may be an identifier or quoted string.
-The second form is an [implied field reference](dataflow-model.md#implied-field-references)
+The second form is an [implied field reference](pipeline-model.md#implied-field-references)
 `<ref>`, which is shorthand for `<ref>:<ref>`.  The third form is the `...`
 spread operator which expects a record value as the result of `<expr>` and
 inserts all of the fields from the resulting record.

diff --git a/docs/language/lateral-subqueries.md b/docs/language/lateral-subqueries.md
@@ -7,7 +7,7 @@ sidebar_label: Lateral Subqueries
 
 Lateral subqueries provide a powerful means to apply a Zed query
 to each subsequence of values generated from an outer sequence of values.
-The inner query may be _any_ dataflow operator sequence (excluding
+The inner query may be _any_ pipeline operator sequence (excluding
 [`from` operators](operators/from.md)) and may refer to values from
 the outer sequence.
 
@@ -91,7 +91,7 @@ In the field reference form, a single identifier `<field>` refers to a field
 in the parent scope and makes that field's value available in the lateral scope
 via the same name.
 
-Note that any such variable definitions override [implied field references](dataflow-model.md#implied-field-references) of
+Note that any such variable definitions override [implied field references](pipeline-model.md#implied-field-references) of
 `this`. If both a field named `x` and a variable named `x` need to be
 referenced in the lateral scope, the field reference should be qualified as
 `this.x` while the variable is referenced simply as `x`.
@@ -102,7 +102,7 @@ the value `this` refers to the inner sequence generated from the `over` expressi
 This query runs to completion for each inner sequence and emits
 each subquery result as each inner sequence traversal completes.
 
-This structure is powerful because _any_ dataflow operator sequence (excluding
+This structure is powerful because _any_ pipeline operator sequence (excluding
 [`from` operators](operators/from.md)) can appear in the body of
 the lateral scope.  In contrast to the [`yield`](operators/yield.md) example above, a [`sort`](operators/sort.md) could be
 applied to each subsequence in the subquery, where `sort`
@@ -128,13 +128,13 @@ parenthesized form:
 ```
 
 :::tip
-The parentheses disambiguate a lateral expression from a [lateral dataflow operator](operators/over.md).
+The parentheses disambiguate a lateral expression from a [lateral pipeline operator](operators/over.md).
 :::
 
 This form must always include a [lateral scope](#lateral-scope) as indicated by `<lateral>`.
 
 The lateral expression is evaluated by evaluating each `<expr>` and feeding
-the results as inputs to the `<lateral>` dataflow operators.  Each time the
+the results as inputs to the `<lateral>` pipeline.  Each time the
 lateral expression is evaluated, the lateral operators are run to completion,
 e.g.,
 ```mdtest-command
@@ -162,7 +162,7 @@ produces
 {sorted:[1,2,3],sum:6}
 ```
 Because Zed expressions evaluate to a single result, if multiple values remain
-at the conclusion of the lateral dataflow, they are automatically wrapped in
+at the conclusion of the lateral pipeline, they are automatically wrapped in
 an array, e.g.,
 ```mdtest-command
 echo '{x:1} {x:[2]} {x:[3,4]}' |
@@ -174,7 +174,7 @@ produces
 {s:3}
 {s:[4,5]}
 ```
-To handle such dynamic input data, you can ensure your downstream dataflow
+To handle such dynamic input data, you can ensure your downstream pipeline
 always receives consistently packaged values by explicitly wrapping the result
 of the lateral scope, e.g.,
 ```mdtest-command

diff --git a/docs/language/operators/README.md b/docs/language/operators/README.md
@@ -2,25 +2,25 @@
 
 ---
 
-Dataflow operators process a sequence of input values to create an output sequence
-and appear as the components of a dataflow pipeline. In addition to the built-in
+Operators process a sequence of input values to create an output sequence
+and appear as the components of a [pipeline](../pipeline-model.md). In addition to the built-in
 operators listed below, Zed also allows for the creation of
 [user-defined operators](../statements.md#operator-statements).
 
 * [assert](assert.md) - evaluate an assertion
-* [combine](combine.md) - combine parallel paths into a single output
+* [combine](combine.md) - combine parallel pipeline branches into a single output
 * [cut](cut.md) - extract subsets of record fields into new records
 * [debug](debug.md) - write intermediate values to stderr
 * [drop](drop.md) - drop fields from record values
 * [file](from.md) - source data from a file
-* [fork](fork.md) - copy values to parallel paths
+* [fork](fork.md) - copy values to parallel pipeline branches
 * [from](from.md) - source data from pools, files, or URIs
 * [fuse](fuse.md) - coerce all input values into a merged type
 * [get](from.md) - source data from a URI
 * [head](head.md) - copy leading values of input sequence
 * [join](join.md) - combine data from two inputs using a join predicate
 * [load](load.md) - add and commit data to a pool
-* [merge](merge.md) - combine parallel paths into a single, ordered output
+* [merge](merge.md) - combine parallel pipeline branches into a single, ordered output
 * [over](over.md) - traverse nested values as a lateral query
 * [pass](pass.md) - copy input values to output
 * [put](put.md) - add or modify fields of records

diff --git a/docs/language/operators/combine.md b/docs/language/operators/combine.md
@@ -1,6 +1,6 @@
 ### Operator
 
-&emsp; **combine** &mdash; combine parallel paths into a single output
+&emsp; **combine** &mdash; combine parallel pipeline branches into a single output
 
 ### Synopsis
 
@@ -9,8 +9,8 @@
 ```
 ### Description
 
-The implied `combine` operator merges inputs from multiple upstream legs of
-the dataflow path into a single output.  The order of values in the combined
+The implied `combine` operator merges inputs from multiple upstream branches of
+the pipeline into a single output.  The order of values in the combined
 output is undefined.
 
 You need not explicitly reference the operator with any text.  Instead, the
@@ -19,7 +19,7 @@ and its semantics of undefined merge order.
 
 ### Examples
 
-_Copy input to two paths and combine with the implied operator_
+_Copy input to two pipeline branches and combine with the implied operator_
 ```mdtest-command
 echo '1 2' | zq -z 'fork (=>pass =>pass) | sort this' -
 ```

diff --git a/docs/language/operators/cut.md b/docs/language/operators/cut.md
@@ -10,7 +10,7 @@ cut <field>[:=<expr>] [, <field>[:=<expr>] ...]
 ### Description
 
 The `cut` operator extracts values from each input record in the
-form of one or more [field assignments](../dataflow-model.md#field-assignments),
+form of one or more [field assignments](../pipeline-model.md#field-assignments),
 creating one field for each expression.  Unlike the `put` operator,
 which adds or modifies the fields of a record, `cut` retains only the
 fields enumerated, much like a SQL projection.

diff --git a/docs/language/operators/fork.md b/docs/language/operators/fork.md
@@ -1,28 +1,28 @@
 ### Operator
 
-&emsp; **fork** &mdash; copy values to parallel paths
+&emsp; **fork** &mdash; copy values to parallel pipeline branches
 
 ### Synopsis
 
 ```
 fork (
-  => <leg>
-  => <leg>
+  => <branch>
+  => <branch>
   ...
 )
 ```
 ### Description
 
-The `fork` operator copies each input value to multiple, parallel legs of
-the dataflow path.
+The `fork` operator copies each input value to multiple, parallel branches of
+the pipeline.
 
-The output of a fork consists of multiple legs that must be merged.
-If the downstream operator expects a single input, then the output legs are
+The output of a fork consists of multiple branches that must be merged.
+If the downstream operator expects a single input, then the output branches are
 merged with an automatically inserted [combine operator](combine.md).
 
 ### Examples
 
-_Copy input to two paths and merge_
+_Copy input to two pipeline branches and merge_
 ```mdtest-command
 echo '1 2' | zq -z 'fork (=>pass =>pass) | sort this' -
 ```

diff --git a/docs/language/operators/from.md b/docs/language/operators/from.md
@@ -10,10 +10,10 @@ from <pattern>
 file <path> [format <format>]
 get <uri> [format <format>]
 from (
-   pool <pool>[@<commitish>] [ => <leg> ]
+   pool <pool>[@<commitish>] [ => <branch> ]
    pool <pattern>
-   file <path> [format <format>] [ => <leg> ]
-   get <uri> [format <format>] [ => <leg> ]
+   file <path> [format <format>] [ => <branch> ]
+   get <uri> [format <format>] [ => <branch> ]
    pass
    ...
 )
@@ -26,7 +26,7 @@ their data to its output.  A data source can be
 * the names of multiple data pools, expressed as a [regular expression](../search-expressions.md#regular-expressions) or [glob](../search-expressions.md#globs) pattern;
 * a path to a file;
 * an HTTP, HTTPS, or S3 URI; or
-* the [`pass` operator](pass.md), to treat the upstream data path as a source.
+* the [`pass` operator](pass.md), to treat the upstream pipeline branch as a source.
 
 :::tip Note
 File paths and URIs may be followed by an optional [format](../../commands/zq.md#input-formats) specifier.
@@ -45,7 +45,7 @@ In the first four forms, a single source is connected to a single output.
 In the fifth form, multiple sources are accessed in parallel and may be
 [joined](join.md), [combined](combine.md), or [merged](merge.md).
 
-A data path can be split with the [`fork` operator](fork.md) as in
+A pipeline can be split with the [`fork` operator](fork.md) as in
 ```
 from PoolOne | fork (
   => op1 | op2 | ...
@@ -61,7 +61,7 @@ from (
 ) | join on key=key | ...
 ```
 
-Similarly, data can be routed to different paths with replication
+Similarly, data can be routed to different pipeline branches with replication
 using the [`switch` operator](switch.md):
 ```
 from ... | switch color (

diff --git a/docs/language/operators/merge.md b/docs/language/operators/merge.md
@@ -1,6 +1,6 @@
 ### Operator
 
-&emsp; **merge** &mdash; combine parallel paths into a single, ordered output
+&emsp; **merge** &mdash; combine parallel pipeline branches into a single, ordered output
 
 ### Synopsis
 
@@ -9,14 +9,14 @@
 ```
 ### Description
 
-The `merge` operator merges inputs from multiple upstream legs of
-the dataflow path into a single output.  The order of values in the combined
+The `merge` operator merges inputs from multiple upstream branches of
+the pipeline into a single output.  The order of values in the combined
 output is determined by the `<expr>` arguments, which act as sort expressions
-where the values from the upstream paths are forwarded based on these expressions.
+where the values from the upstream pipeline branches are forwarded based on these expressions.
 
 ### Examples
 
-_Copy input to two paths and combine_
+_Copy input to two pipeline branches and merge_
 ```mdtest-command
 echo '1 2' | zq -z 'fork (=>pass =>pass) | merge this' -
 ```

diff --git a/docs/language/operators/over.md b/docs/language/operators/over.md
@@ -13,7 +13,7 @@ of derived values (e.g., the elements of an array) and either
 (in the first form) sends the new values directly to its output or
 (in the second form) sends the values to a scoped computation as indicated
 by `<lateral>`, which may represent any Zed [subquery](../lateral-subqueries.md) operating on the
-derived sequence of values as [`this`](../dataflow-model.md#the-special-value-this).
+derived sequence of values as [`this`](../pipeline-model.md#the-special-value-this).
 
 Each expression `<expr>` is evaluated in left-to-right order and derived sequences are
 generated from each such result depending on its types:

diff --git a/docs/language/operators/pass.md b/docs/language/operators/pass.md
@@ -10,7 +10,7 @@ pass
 ### Description
 
 The `pass` operator outputs a copy of each input value. It is typically used
-with operators that handle multiple legs of the dataflow path such as
+with operators that handle multiple branches of the pipeline such as
 [`fork`](fork.md) and [`join`](join.md).
 
 ### Examples
@@ -26,7 +26,7 @@ echo '1 2 3' | zq -z pass -
 3
 ```
 
-_Copy each input value to three parallel legs and leave the values unmodified on one of them_
+_Copy each input value to three parallel pipeline branches and leave the values unmodified on one of them_
 ```mdtest-command
 echo '"HeLlo, WoRlD!"' | zq -z '
   fork (

diff --git a/docs/language/operators/put.md b/docs/language/operators/put.md
@@ -9,7 +9,7 @@
 ### Description
 
 The `put` operator modifies its input with
-one or more [field assignments](../dataflow-model.md#field-assignments).
+one or more [field assignments](../pipeline-model.md#field-assignments).
 Each expression is evaluated based on the input record
 and the result is either assigned to a new field of the input record if it does not
 exist, or the existing field is modified in its original location with the result.
@@ -23,7 +23,7 @@ a computed value cannot be referenced in another expression.  If you need
 to re-use a computed result, this can be done by chaining multiple `put` operators.
 
 The `put` keyword is optional since it is an
-[implied operator](../dataflow-model.md#implied-operators).
+[implied operator](../pipeline-model.md#implied-operators).
 
 Each `<field>` expression must be a field reference expressed as a dotted path or one or more
 constant index operations on `this`, e.g., `a.b`, `this["a"]["b"]`,

diff --git a/docs/language/operators/search.md b/docs/language/operators/search.md
@@ -13,7 +13,7 @@ to each input value and dropping each value for which the expression evaluates
 to `false` or to an error.
 
 The `search` keyword is optional since it is an
-[implied operator](../dataflow-model.md#implied-operators).
+[implied operator](../pipeline-model.md#implied-operators).
 
 When Zed queries are run interactively, it is convenient to be able to omit
 the "search" keyword, but when search filters appear in Zed source files,

diff --git a/docs/language/operators/sort.md b/docs/language/operators/sort.md
@@ -66,7 +66,7 @@ echo '2 null 1 3' | zq -z 'sort this' -
 3
 null
 ```
-_With no sort expression, sort will sort by [`this`](../dataflow-model.md#the-special-value-this) for non-records_
+_With no sort expression, sort will sort by [`this`](../pipeline-model.md#the-special-value-this) for non-records_
 ```mdtest-command
 echo '2 null 1 3' | zq -z sort -
 ```

diff --git a/docs/language/operators/summarize.md b/docs/language/operators/summarize.md
@@ -24,7 +24,7 @@ unique combination of values of the group-by keys specified after the `by`
 keyword.
 
 The `summarize` keyword is optional since it is an
-[implied operator](../dataflow-model.md#implied-operators).
+[implied operator](../pipeline-model.md#implied-operators).
 
 Each aggregate function may be optionally followed by a `where` clause, which
 applies a Boolean expression that indicates, for each input value,