[doc] First draft of PARTITION BY documentation #30485
Hi Ben -- just double-checking -- for things in
Yes that's right! Normally folks only review drafts on special request. Will definitely tag you in when it's ready for a look, though... hopefully soon!
## Syntax

The option `PARTITION BY <column list>` declares that a [materialized view](/sql/create-materialized-view/#with_options), [table](/sql/create-table/#with_options), or source table should be ordered by the listed columns. For example, a table that stores an append-only collection of events may look like:
I've updated the MV docs as part of this PR as well.
I have not updated the table docs yet: those do not have a separate section for "with options" yet. I'm happy to add such a section and move the retain-history option to it if that sounds good to you.
AFAICT `create table ... from source` is not documented anywhere yet, so I have nothing to link to here. I'm open to suggestions!
Internally, Materialize stores these durable collections in an [LSM-tree](https://en.wikipedia.org/wiki/Log-structured_merge-tree)-like structure. Each collection is made up of a set of **runs** of data, each run is sorted and then split up into individual **parts**, and those parts are written to object storage and retrieved only when necessary to satisfy a query. Materialize will also periodically **compact** the data it stores to consolidate small parts into larger ones or discard deleted rows.
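
The runs, parts, and compaction described above can be pictured with a small toy model. This is only a sketch under stated assumptions, not Materialize's actual storage code: the function names `write_run` and `compact`, the `(row, diff)` update shape (where `+1` is an insert and `-1` is a delete), and the fixed `part_size` are all hypothetical choices made for illustration.

```python
from collections import Counter

def write_run(updates, part_size):
    """Sort one batch of (row, diff) updates and split it into parts."""
    ordered = sorted(updates)
    return [ordered[i:i + part_size] for i in range(0, len(ordered), part_size)]

def compact(runs, part_size):
    """Merge runs, cancel inserts against deletes, and re-split into parts."""
    totals = Counter()
    for run in runs:
        for part in run:
            for row, diff in part:
                totals[row] += diff
    # Rows whose diffs sum to zero were deleted, so they are discarded here.
    live = [(row, diff) for row, diff in totals.items() if diff != 0]
    return write_run(live, part_size)

# Two small runs, each split into one-row parts (hypothetical data).
run_a = write_run([((3, "c"), 1), ((1, "a"), 1)], part_size=1)
run_b = write_run([((2, "b"), 1), ((1, "a"), -1)], part_size=1)

# Compaction yields a single run with one larger part and no deleted row.
compacted = compact([run_a, run_b], part_size=2)
print(compacted)
```

In this model the four one-row parts collapse into a single two-row part, and the row that was inserted and later deleted disappears entirely.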
Materialize lets you specify the ordering it will use to sort these runs of data internally. A well-chosen sort order can unlock optimizations like [filter pushdown](#filter-pushdown), which in turn can make queries and other operations more efficient.
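
To see why the sort order matters for filter pushdown, the following toy model may help. It is a sketch under assumptions, not how Materialize implements the optimization: it assumes each part records the minimum and maximum timestamp it contains, and that a scan can skip any part whose range cannot satisfy the filter. The helper names, the `(event_ts, user)` row shape, and the data are hypothetical.

```python
def split_into_parts(rows, sort_key, part_size):
    """Sort rows by sort_key, split into parts, and record per-part timestamp bounds."""
    ordered = sorted(rows, key=sort_key)
    parts = []
    for i in range(0, len(ordered), part_size):
        chunk = ordered[i:i + part_size]
        timestamps = [ts for ts, _ in chunk]
        parts.append({"min_ts": min(timestamps), "max_ts": max(timestamps), "rows": chunk})
    return parts

def scan_newer_than(parts, cutoff):
    """Return rows with event_ts > cutoff and the number of parts that had to be fetched."""
    fetched = 0
    result = []
    for part in parts:
        if part["max_ts"] <= cutoff:
            continue  # no row in this part can match, so it is never fetched
        fetched += 1
        result.extend(row for row in part["rows"] if row[0] > cutoff)
    return result, fetched

# Twelve hypothetical events: (event_ts, user).
events = [(ts, f"user{ts % 3}") for ts in range(12)]

by_ts = split_into_parts(events, sort_key=lambda r: r[0], part_size=3)
by_user = split_into_parts(events, sort_key=lambda r: r[1], part_size=3)

rows_a, fetched_a = scan_newer_than(by_ts, cutoff=8)
rows_b, fetched_b = scan_newer_than(by_user, cutoff=8)
print(fetched_a, fetched_b)
```

Both layouts return the same three rows, but the layout ordered by timestamp touches fewer parts because recent rows are clustered together, while ordering by user scatters them across most parts.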
It seems like the "patterns" section is where we put this sort of doc that covers features that aren't specific to a single type and involve some discussion of implementation details, but let me know if you'd like me to move it!
```mzsql
EXPLAIN FILTER PUSHDOWN FOR
SELECT * FROM events WHERE event_ts + '2 minutes' > mz_now();
```
Ideally we'd link to `EXPLAIN FILTER PUSHDOWN`, but that's not documented anywhere to my knowledge yet. That may make sense as a separate PR?
@@ -0,0 +1,143 @@
---
title: "PARTITION BY"
This doc covers both the `PARTITION BY` syntax and the filter pushdown optimization, which are strictly speaking two separate things but likely to be used together I think?
Okay, I think this is ready for a first look? I've left comments on some places where I'm particularly uncertain about my choices, but I'd also be happy for general feedback on "does this make sense" etc.
Motivation
A first draft of public documentation for https://github.com/MaterializeInc/database-issues/issues/7188.
Tips for reviewer
To be merged only when the feature hits private preview.
Checklist
If this PR evolves an existing `$T ⇔ Proto$T` mapping (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label.