What should "includeHistory" actually do? #44
Comments
Working out scenarios in detail
Let's say we have 2 paths, and each has a history of 3 documents (
We do a basic query that matches some of them...
Here are ways we might want to filter further with a fancier query:
Use cases
Why do all these things? Some apps will be doing custom conflict resolution, so they'll generally want full histories. Some apps will just use the simple last-write-wins that's built into Earthstar. Besides in-app searching and filtering, these might also be useful for sync queries. Here are some examples for a wiki app, searching for pages authored by me:
Which queries are fast vs slow?
Assuming we've already tagged the latest document with "isHead = true" in the database...
Note "head" means the latest overall, and "latest" means the latest of the matches. The operations are:
Which is:
Turning that into query parameters
We could have a query parameter like this:

    // in same order as above
    historyMode:
        'matching-heads'
        | 'matching-heads-plus-all-history'
        | 'latest-matching-versions'
        | 'matching-versions'
        | 'matching-versions-plus-all-history'
        | 'any-heads-that-have-matches-in-history'

Or would it be better to break this into 2 or 3 separate query parameters?
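To make the six proposed modes concrete, here is a minimal in-memory sketch. It is not Earthstar's implementation: the `Doc` shape, the `queryWithHistoryMode` function, and the `matches` predicate are hypothetical, and the meaning of each mode is inferred from its name plus the vocabulary above ("head" is the latest version overall, "latest" is the latest of the matches).

```typescript
// Hypothetical minimal document shape; real Earthstar documents have more fields.
interface Doc {
    path: string;
    author: string;
    timestamp: number;
}

type HistoryMode =
    | 'matching-heads'
    | 'matching-heads-plus-all-history'
    | 'latest-matching-versions'
    | 'matching-versions'
    | 'matching-versions-plus-all-history'
    | 'any-heads-that-have-matches-in-history';

// Assumed semantics: `matches` is whatever per-version filter the rest of
// the query expresses (author, timestamp, ...).
function queryWithHistoryMode(
    docs: Doc[],
    matches: (doc: Doc) => boolean,
    mode: HistoryMode,
): Doc[] {
    // Group versions by path.
    const byPath: Record<string, Doc[]> = {};
    for (const doc of docs) {
        (byPath[doc.path] = byPath[doc.path] || []).push(doc);
    }

    const results: Doc[] = [];
    for (const path of Object.keys(byPath)) {
        // Newest first, so versions[0] is the head.
        const versions = byPath[path].slice().sort((a, b) => b.timestamp - a.timestamp);
        const head = versions[0];
        const matching = versions.filter(matches);

        switch (mode) {
            case 'matching-heads':
                if (matches(head)) results.push(head);
                break;
            case 'matching-heads-plus-all-history':
                if (matches(head)) results.push(...versions);
                break;
            case 'latest-matching-versions':
                if (matching.length > 0) results.push(matching[0]);
                break;
            case 'matching-versions':
                results.push(...matching);
                break;
            case 'matching-versions-plus-all-history':
                if (matching.length > 0) results.push(...versions);
                break;
            case 'any-heads-that-have-matches-in-history':
                if (matching.length > 0) results.push(head);
                break;
        }
    }
    return results;
}
```

In this sketch every mode is a combination of two choices: which versions decide whether a path matches (head only, or any version) and which versions are returned (head, latest match, all matches, or the full history). That is one argument for splitting the single parameter into two.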
Besides authors, we can do other operations on the set of document versions for a given path. For example, timestamps:
Timestamps
In the Using the language from previous comments above:
Comments from the beta source code for further details: https://github.com/earthstar-project/earthstar/blob/beta/src/storage/query.ts#L48-L57
/**
* Query objects describe how to query a Storage instance for documents.
*
* An empty query object returns all latest documents.
* Each of the following properties adds an additional filter,
* narrowing down the results further.
* The exception is that history = 'latest' by default;
* set it to 'all' to include old history documents also.
*/
export interface Query {
/**
* Document author.
*
* With history:'latest' this only returns documents for which
* this author is the latest author.
*
* With history:'all' this returns all documents by this author,
* even if those documents are not the latest ones anymore.
*/
author?: AuthorAddress,

* If query.history === 'all', we can do an easy query:
*
* ```
* SELECT * from DOCS
* WHERE path = "/abc"
* AND timestamp > 123
* ORDER BY path ASC, author ASC
* LIMIT 123
* ```
*
* If query.history === 'latest', we have to do something more complicated.
* We don't want to filter out some docs, and THEN get the latest REMAINING
* docs in each path.
* We want to first get the latest doc per path, THEN filter those.
*
* ```
* SELECT *, MAX(timestamp) from DOCS
* -- first level of filtering happens before we choose the latest doc.
* -- here we can only do things that are the same for all docs in a path.
* WHERE path = "/abc"
* -- now group by path and keep the newest one
* GROUP BY path
* -- finally, second level of filtering happens AFTER we choose the latest doc.
* -- these are things that can differ for docs within a path
* HAVING timestamp > 123
* ORDER BY path ASC, author ASC
* LIMIT 123
* ```
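The ordering issue in the comment above (pick the latest doc per path first, then filter, rather than the other way around) can be shown with a small in-memory sketch. The `Doc` shape and the function names `latestPerPath`, `queryAll`, `queryLatest`, and `filterThenLatest` are hypothetical and are not taken from the Earthstar source. Sorting and `LIMIT` are left out.

```typescript
// Hypothetical minimal document shape.
interface Doc {
    path: string;
    author: string;
    timestamp: number;
}

// Keep only the newest version at each path.
function latestPerPath(docs: Doc[]): Doc[] {
    const newest: Record<string, Doc> = {};
    for (const doc of docs) {
        const current = newest[doc.path];
        if (current === undefined || doc.timestamp > current.timestamp) {
            newest[doc.path] = doc;
        }
    }
    return Object.keys(newest).map(path => newest[path]);
}

// history: 'all' -> every version is filtered on its own.
function queryAll(docs: Doc[], filter: (d: Doc) => boolean): Doc[] {
    return docs.filter(filter);
}

// history: 'latest' -> choose the latest doc per path FIRST, then filter.
function queryLatest(docs: Doc[], filter: (d: Doc) => boolean): Doc[] {
    return latestPerPath(docs).filter(filter);
}

// The order the comment warns against: filter first, then take the
// latest of whatever remains. This can return a doc that is not a head.
function filterThenLatest(docs: Doc[], filter: (d: Doc) => boolean): Doc[] {
    return latestPerPath(docs.filter(filter));
}
```

For a path whose versions are `@alice` at timestamp 100 and `@bob` at timestamp 200, filtering by `@alice` gives no results from `queryLatest` (the head is by `@bob`), while `filterThenLatest` returns the older `@alice` version.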
Document versions with the same path are related to each other. When querying, sometimes we want to handle them as a group and sometimes individually.
We only have one query parameter for this, includeHistory, and it doesn't let us do everything we want.
(vocabulary: a "head" is the latest document at a path)
We might want to...
This complexity will hit any kind of query that can match only certain document versions in a path. We previously ran into this with querying by author. At the time I solved that by adding 3 ways to query by author:
- participatingAuthor: match author anywhere in history; includeHistory happens after that
- versionsByAuthor: includeHistory happens first, then match author in each version one by one
- lastAuthor: only match author on latest doc version; includeHistory expands after that

This is confusing. Is there a more general way to specify how to handle history when querying?
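For reference, the three author parameters can be sketched against an in-memory list of documents. This is an illustration of the behaviour described above, not the actual Earthstar code: the `Doc` shape, `groupByPath`, and `queryByAuthor` are hypothetical names.

```typescript
// Hypothetical minimal document shape; real Earthstar documents have more fields.
interface Doc {
    path: string;
    author: string;
    timestamp: number;
}

type AuthorMode = 'participatingAuthor' | 'versionsByAuthor' | 'lastAuthor';

// Group versions by path; each group is sorted newest first,
// so group[0] is the head.
function groupByPath(docs: Doc[]): Doc[][] {
    const byPath: Record<string, Doc[]> = {};
    for (const doc of docs) {
        (byPath[doc.path] = byPath[doc.path] || []).push(doc);
    }
    return Object.keys(byPath).map(path =>
        byPath[path].slice().sort((a, b) => b.timestamp - a.timestamp));
}

function queryByAuthor(
    docs: Doc[],
    mode: AuthorMode,
    author: string,
    includeHistory: boolean,
): Doc[] {
    const results: Doc[] = [];
    for (const versions of groupByPath(docs)) {
        const head = versions[0];
        if (mode === 'versionsByAuthor') {
            // includeHistory is applied first, then each version is matched one by one.
            const candidates = includeHistory ? versions : [head];
            results.push(...candidates.filter(d => d.author === author));
        } else {
            // The path is matched as a whole, then includeHistory expands the result.
            const pathMatches = mode === 'participatingAuthor'
                ? versions.some(d => d.author === author) // anywhere in history
                : head.author === author;                 // lastAuthor: head only
            if (pathMatches) {
                results.push(...(includeHistory ? versions : [head]));
            }
        }
    }
    return results;
}
```

Written this way, the three parameters differ only in when the author match runs relative to the history expansion, which is the part a more general mechanism would need to expose.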