crud: add readview support #372

Merged
merged 2 commits into from
Sep 27, 2023
17 changes: 11 additions & 6 deletions CHANGELOG.md
@@ -5,12 +5,17 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).

## Unreleased

### Added
* Read view support for select and pairs (#343).

## [1.2.0] - 07-06-23

### Added
* Add `noreturn` option for operations:
`insert`, `insert_object`, `insert_many`, `insert_object_many`,
`replace`, `replace_object`, `replace_many`, `insert_object_many`,
`upsert`, `upsert_object`, `upsert_many`, `upsert_object_many`,
`update`, `delete` (#267).

@@ -39,16 +44,16 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
## [1.0.0] - 02-02-23

### Added
* Add timeout condition for the validation of master presence in
replicaset and for the master connection (#95).
* Support Cartridge clusterwide configuration for `crud.cfg` (#332).

### Changed
* **Breaking**: forbid using space id in `crud.len` (#255).

### Fixed
* Add validation of the master presence in replicaset and the
master connection to the `utils.get_space` method before
receiving the space from the connection (#331).
* Fix fiber cancel on schema reload timeout in `call_reload_schema` (PR #336).

210 changes: 190 additions & 20 deletions README.md
@@ -37,6 +37,12 @@ It also provides the `crud-storage` and `crud-router` roles for
- [Count](#count)
- [Call options for crud methods](#call-options-for-crud-methods)
- [Statistics](#statistics)
- [Read view](#read-view)
- [Creating a read view](#creating-a-read-view)
- [Closing a read view](#closing-a-read-view)
- [Read view select](#read-view-select)
- [Read view select conditions](#read-view-select-conditions)
- [Read view pairs](#read-view-pairs)
- [Cartridge roles](#cartridge-roles)
- [Usage](#usage)
- [License](#license)
@@ -237,8 +243,8 @@ where:
* `noreturn` (`?boolean`) - suppress successfully processed tuple
(first return value is `nil`). `false` by default
* `fetch_latest_metadata` (`?boolean`) - guarantees the
up-to-date metadata (space format) in first return value, otherwise
it may not take into account the latest migration of the data format.
Performance overhead is up to 15%. `false` by default

Returns metadata and array contains one inserted row, error.
@@ -308,8 +314,8 @@ where:
* `noreturn` (`?boolean`) - suppress successfully processed tuples
(first return value is `nil`). `false` by default
* `fetch_latest_metadata` (`?boolean`) - guarantees the
up-to-date metadata (space format) in first return value, otherwise
it may not take into account the latest migration of the data format.
Performance overhead is up to 15%. `false` by default

Returns metadata and array with inserted rows, array of errors.
@@ -450,8 +456,8 @@ where:
vshard router instance. Set this parameter if your space is not
a part of the default vshard cluster
* `fetch_latest_metadata` (`?boolean`) - guarantees the
up-to-date metadata (space format) in first return value, otherwise
it may not take into account the latest migration of the data format.
Performance overhead is up to 15%. `false` by default

Returns metadata and array contains one row, error.
@@ -493,8 +499,8 @@ where:
* `noreturn` (`?boolean`) - suppress successfully processed tuple
(first return value is `nil`). `false` by default
* `fetch_latest_metadata` (`?boolean`) - guarantees the
up-to-date metadata (space format) in first return value, otherwise
it may not take into account the latest migration of the data format.
Performance overhead is up to 15%. `false` by default

Returns metadata and array contains one updated row, error.
@@ -535,8 +541,8 @@ where:
* `noreturn` (`?boolean`) - suppress successfully processed tuple
(first return value is `nil`). `false` by default
* `fetch_latest_metadata` (`?boolean`) - guarantees the
up-to-date metadata (space format) in first return value, otherwise
it may not take into account the latest migration of the data format.
Performance overhead is up to 15%. `false` by default

Returns metadata and array contains one deleted row (empty for vinyl), error.
@@ -588,8 +594,8 @@ where:
* `noreturn` (`?boolean`) - suppress successfully processed tuple
(first return value is `nil`). `false` by default
* `fetch_latest_metadata` (`?boolean`) - guarantees the
up-to-date metadata (space format) in first return value, otherwise
it may not take into account the latest migration of the data format.
Performance overhead is up to 15%. `false` by default

Returns inserted or replaced rows and metadata or nil with error.
@@ -659,8 +665,8 @@ where:
* `noreturn` (`?boolean`) - suppress successfully processed tuples
(first return value is `nil`). `false` by default
* `fetch_latest_metadata` (`?boolean`) - guarantees the
up-to-date metadata (space format) in first return value, otherwise
it may not take into account the latest migration of the data format.
Performance overhead is up to 15%. `false` by default

Returns metadata and array with inserted/replaced rows, array of errors.
@@ -801,8 +807,8 @@ where:
* `noreturn` (`?boolean`) - suppress successfully processed tuple
(first return value is `nil`). `false` by default
* `fetch_latest_metadata` (`?boolean`) - guarantees the
up-to-date metadata (space format) in first return value, otherwise
it may not take into account the latest migration of the data format.
Performance overhead is up to 15%. `false` by default

Returns metadata and empty array of rows or nil, error.
@@ -868,8 +874,8 @@ where:
* `noreturn` (`?boolean`) - suppress successfully processed tuples
(first return value is `nil`). `false` by default
* `fetch_latest_metadata` (`?boolean`) - guarantees the
up-to-date metadata (space format) in first return value, otherwise
it may not take into account the latest migration of the data format.
Performance overhead is up to 15%. `false` by default

Returns metadata and array of errors.
@@ -1014,8 +1020,8 @@ where:
* `yield_every` (`?number`) - number of tuples processed on storage to yield after,
`yield_every` should be > 0, default value is 1000
* `fetch_latest_metadata` (`?boolean`) - guarantees the
up-to-date metadata (space format) in first return value, otherwise
it may not take into account the latest migration of the data format.
Performance overhead is up to 15%. `false` by default


@@ -1541,6 +1547,170 @@ support preserving stats between role reload
(see [tarantool/metrics#334](https://github.com/tarantool/metrics/issues/334)),
thus this feature will be unsupported for `metrics` driver.

### Read view

A read view is an in-memory snapshot of data on an instance that isn’t affected by future data modifications. Read views allow you to retrieve data using the `read_view_object:select()` and `read_view_object:pairs()` operations.

Read views can be used to make complex analytical queries. This reduces the load on the main database and improves RPS for a single Tarantool instance.

Read views have the following limitations:

* Only the memtx engine is supported.
* Read views are available starting from Tarantool Enterprise v2.11.0.
* There is no cluster-wide read view support. In a sharded cluster, a read view is opened on each storage separately, and because of the cluster's distributed nature they are not guaranteed to be opened at the same moment.

#### Creating a read view

To create a read view, call the `crud.readview()` function.

```lua
local rv = crud.readview(opts)
```

where:

* `opts`:
  * `name` (`?string`) - name of the read view
  * `timeout` (`?number`) - `vshard.call` timeout (in seconds)

**Example:**

```lua
local rv = crud.readview({name = 'foo', timeout = 3})
```

#### Closing a read view

When a read view is no longer needed, close it using the `read_view_object:close()` method because a read view may consume a substantial amount of memory.

```lua
local rv = crud.readview()
rv:close(opts)
```

where:

* `opts`:
  * `timeout` (`?number`) - `vshard.call` timeout (in seconds)

A read view is also closed implicitly when the read view object is collected by the Lua garbage collector.

**Example:**

```lua
local rv = crud.readview()
rv:close({timeout = 3})
```
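
Because an unclosed read view keeps consuming memory until it is garbage-collected, it can help to wrap the open, use and close steps in one function. The sketch below is a hypothetical helper, not part of `crud`: `with_readview` and its `crud` parameter are illustrative names, and it assumes that `crud.readview()` returns `rv, err` in the library's usual style, which this section does not state explicitly.

```lua
-- Hypothetical helper (not part of crud): open a read view, run `fn` with it,
-- and always close it afterwards, even if `fn` raises an error.
-- Assumption: crud.readview() returns `rv, err`, like other crud calls.
local function with_readview(crud, opts, fn)
    local rv, err = crud.readview(opts)
    if rv == nil then
        return nil, err
    end

    -- pcall keeps an error raised inside `fn` from skipping the close below.
    local ok, result = pcall(fn, rv)

    -- Release the storage-side snapshot as soon as the work is done.
    rv:close({timeout = 3})

    if not ok then
        -- `result` holds the error raised by `fn`.
        return nil, result
    end
    return result
end
```

A call could look like `with_readview(require('crud'), {name = 'report'}, function(rv) return rv:select('customers', nil, {fullscan = true}) end)`. Only the first value returned by `fn` is passed through, which keeps the sketch short.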

#### Read view select

`read_view_object:select()` supports multi-conditional selects, treating a cluster as a single space, same as `crud.select`.

```lua
local rv = crud.readview()
local objects, err = rv:select(space_name, conditions, opts)
rv:close()
```

Opts are the same as [select opts](#select), except `balance`, `prefer_replica` and `mode` are not supported.

Returns metadata and array of rows, error.

**Example:**

```lua
local rv = crud.readview()
rv:select('customers', nil, {batch_size=1, fullscan=true})
---
- metadata: [{'name': 'id', 'type': 'unsigned'}, {'name': 'bucket_id', 'type': 'unsigned'},
{'name': 'name', 'type': 'string'}, {'name': 'age', 'type': 'number'}]
rows:
- [1, 477, 'Elizabeth', 12]
- [2, 401, 'Mary', 46]
- [3, 2804, 'David', 33]
- [4, 1161, 'William', 81]
- [5, 1172, 'Jack', 35]
- [6, 1064, 'William', 25]
- [7, 693, 'Elizabeth', 18]
- null
...
crud.insert('customers', {8, box.NULL, 'Elizabeth', 23})
---
- rows:
- [8, 185, 'Elizabeth', 23]
metadata: [{'name': 'id', 'type': 'unsigned'}, {'name': 'bucket_id', 'type': 'unsigned'},
{'name': 'name', 'type': 'string'}, {'name': 'age', 'type': 'number'}]
- null
...
rv:select('customers', nil, {batch_size=1, fullscan=true})
---
- metadata: [{'name': 'id', 'type': 'unsigned'}, {'name': 'bucket_id', 'type': 'unsigned'},
{'name': 'name', 'type': 'string'}, {'name': 'age', 'type': 'number'}]
rows:
- [1, 477, 'Elizabeth', 12]
- [2, 401, 'Mary', 46]
- [3, 2804, 'David', 33]
- [4, 1161, 'William', 81]
- [5, 1172, 'Jack', 35]
- [6, 1064, 'William', 25]
- [7, 693, 'Elizabeth', 18]
- null
...
rv:close()
```
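
Since the data seen through a read view does not change while it is open, a read view is a natural fit for reading a large space page by page. The following is a minimal sketch under stated assumptions: `fetch_all_pages` is a hypothetical helper, and it relies on the `first` and `after` options described in [select opts](#select) for `crud.select`, which are assumed here to behave the same way for `read_view_object:select()`.

```lua
-- Hypothetical helper: collect all rows visible through a read view
-- by requesting them in pages of `page_size` rows.
-- Assumption: `first` and `after` work as they do for crud.select.
local function fetch_all_pages(rv, space_name, conditions, page_size)
    local rows = {}
    local after = nil

    while true do
        local result, err = rv:select(space_name, conditions, {
            first = page_size,
            after = after,
        })
        if err ~= nil then
            return nil, err
        end

        for _, row in ipairs(result.rows) do
            table.insert(rows, row)
        end

        -- A short (or empty) page means there is nothing left to read.
        if #result.rows < page_size then
            break
        end

        -- Continue from the last tuple of the current page.
        after = result.rows[#result.rows]
    end

    return rows
end
```

With `local rv = crud.readview()`, calling `fetch_all_pages(rv, 'customers', nil, 100)` would walk the snapshot in chunks of 100 rows; the read view should still be closed with `rv:close()` afterwards.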

##### Read view select conditions

Select conditions for `read_view_object:select()` are the same as [select conditions](#select-conditions) for `crud.select`.

**Example:**

```lua
rv = crud.readview()
rv:select('customers', {{'<=', 'age', 35}}, {first = 10})
---
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'age', 'type': 'number'}
rows:
- [5, 1172, 'Jack', 35]
- [3, 2804, 'David', 33]
- [6, 1064, 'William', 25]
- [7, 693, 'Elizabeth', 18]
- [1, 477, 'Elizabeth', 12]
...
rv:close()
```

#### Read view pairs

You can iterate across a distributed space using the `read_view_object:pairs()` method.
Its arguments are the same as the [`read_view_object:select()`](#read-view-select) arguments, except that
`fullscan` does not exist (`crud.pairs` does not generate a critical
log entry on potentially long requests) and negative `first` values aren't
allowed.
Pass the `use_tomap` flag (`false` by default) to iterate over objects instead of flat tuples.

**Example:**

```lua
local rv = crud.readview()
local tuples = {}
for _, tuple in rv:pairs('customers', {{'<=', 'age', 35}}, {use_tomap = false}) do
    -- {5, 1172, 'Jack', 35}
    table.insert(tuples, tuple)
end

local objects = {}
for _, object in rv:pairs('customers', {{'<=', 'age', 35}}, {use_tomap = true}) do
    -- {id = 5, name = 'Jack', bucket_id = 1172, age = 35}
    table.insert(objects, object)
end
rv:close()
```
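
As an illustration of the analytical queries that read views are meant for, the sketch below computes an average over a space through `read_view_object:pairs()`. `average_age` is a hypothetical helper, the `age` field comes from the `customers` examples in this section, and the code assumes every object has a numeric `age`.

```lua
-- Hypothetical helper: average the `age` field over all objects
-- visible through the read view `rv`.
local function average_age(rv, space_name)
    local total, count = 0, 0

    -- use_tomap = true yields objects, so fields are accessed by name.
    for _, customer in rv:pairs(space_name, nil, {use_tomap = true}) do
        total = total + customer.age
        count = count + 1
    end

    if count == 0 then
        return nil -- no data, so no average
    end
    return total / count
end
```

Because the iteration runs against a snapshot, inserts made while the loop is in progress should not affect the result.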

## Cartridge roles

`cartridge.roles.crud-storage` is a Tarantool Cartridge role that depends on the
6 changes: 6 additions & 0 deletions crud.lua
@@ -20,6 +20,7 @@ local borders = require('crud.borders')
local sharding_metadata = require('crud.common.sharding.sharding_metadata')
local utils = require('crud.common.utils')
local stats = require('crud.stats')
local readview = require('crud.readview')

local crud = {}

@@ -147,6 +148,10 @@ crud.reset_stats = stats.reset
-- @function storage_info
crud.storage_info = utils.storage_info

-- @refer readview.new
-- @function readview
crud.readview = readview.new

--- Initializes crud on node
--
-- Exports all functions that are used for calls
@@ -174,6 +179,7 @@ function crud.init_storage()
count.init()
borders.init()
sharding_metadata.init()
readview.init()

_G._crud.storage_info_on_storage = utils.storage_info_on_storage
end
1 change: 1 addition & 0 deletions crud/common/stash.lua
@@ -30,6 +30,7 @@ stash.name = {
stats_metrics_registry = '__crud_stats_metrics_registry',
ddl_triggers = '__crud_ddl_spaces_triggers',
select_module_compat_info = '__select_module_compat_info',
storage_readview = '__crud_storage_readview',
}

--- Setup Tarantool Cartridge reload.