RFC: Adding Iterations over Secondary Indexes #50
Comments
One more missing feature is necessary: when an iteration is performed using an index, it must stop as soon as the predicate function returns false. This probably has to be a special mode enabled by a flag. A separate flag will also be needed to select the direction of iteration.
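The stop-on-false behaviour requested here can be sketched in plain Lua, outside Tarantool. This is an illustration only: `iterate_while`, its `reverse` flag, and the sample data are hypothetical names and not part of expirationd. A sorted array stands in for an ordered index.

```lua
-- Hypothetical helper: walk a sorted array (standing in for a TREE index)
-- in the chosen direction and stop at the first element that fails the
-- predicate.
local function iterate_while(sorted, predicate, reverse)
    local visited = {}
    local first, last, step = 1, #sorted, 1
    if reverse then
        first, last, step = #sorted, 1, -1
    end
    for i = first, last, step do
        if not predicate(sorted[i]) then
            break -- stop the whole iteration instead of skipping one element
        end
        visited[#visited + 1] = sorted[i]
    end
    return visited
end

-- Timestamps sorted ascending; in this example anything below 30 is expired.
local timestamps = {10, 20, 30, 40}
local expired = iterate_while(timestamps, function(ts) return ts < 30 end, false)
local newest_first = iterate_while(timestamps, function(ts) return ts > 20 end, true)
```

Because the data is ordered, stopping at the first failed predicate avoids visiting the rest of the index, which is the point of iterating over a secondary index rather than scanning the whole space.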
Please provide transaction support in the callback; it is needed to delete tuples from dependent spaces.
The start element in the example above is calculated only once. The usual case is to expire tuples relative to the current time (say, ones whose creation or modification time is more than X days ago). So I guess it should be a callback.
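One way to support both a fixed start key and a key that follows the current time is to normalize the option into a function. The sketch below is plain Lua under stated assumptions: `make_start_key` is a hypothetical helper, and `os.time()` is used only so the example runs outside Tarantool.

```lua
-- Hypothetical helper: accept either a static start key or a function and
-- always return a function, so the key can be re-evaluated on every scan.
local function make_start_key(start_key)
    if type(start_key) == "function" then
        return start_key
    end
    return function() return start_key end
end

local lifetime = 365 * 24 * 60 * 60 -- one year, in seconds

-- Dynamic key: recomputed relative to the current time on each call.
local dynamic = make_start_key(function() return os.time() - lifetime end)
-- Static key: the same value on each call.
local static = make_start_key(100)
```

The caller then invokes the returned function at the start of each full scan, so a time-based key never goes stale.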
We should also describe when traversing a space starting from a key is not available (and give a meaningful error in that case).
Currently only hash and tree indexes work. Other index types are silently ignored: the task starts but does nothing.
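A meaningful error could be produced by validating the index before the task starts. The sketch below is plain Lua and only an assumption about how such a check might look: `check_index` is a hypothetical function, `index` is any table with a `type` field, and the HASH restriction follows the remark in the issue body that a starting element cannot be used with a HASH index.

```lua
-- Hypothetical check: returns true, or nil plus an error message.
local function check_index(index, opts)
    local index_type = string.upper(index.type)
    if index_type ~= "TREE" and index_type ~= "HASH" then
        return nil, "unsupported index type: " .. index_type
    end
    -- Assumption taken from the issue text: no start key on HASH indexes.
    if index_type == "HASH" and opts.start_key ~= nil then
        return nil, "start_key is not supported for HASH index"
    end
    return true
end

local ok_tree = check_index({type = "TREE"}, {start_key = 1})
local ok_rtree, err_rtree = check_index({type = "RTREE"}, {})
local ok_hash, err_hash = check_index({type = "hash"}, {start_key = 1})
```

Failing at start time, rather than starting a task that does nothing, makes the misconfiguration visible to the user.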
Meeting notes 23.04.2020:
Meeting notes 27.04.2020:
Proposed API after 27.04.2020 meeting:

```lua
local clock = require('clock')
local expirationd = require('expirationd')

expirationd.start("clean_all", box.space.to_expire.id,
    function() return true end,
    {
        -- default is primary key
        index = 'exp',
        -- one transaction per batch
        -- default is false
        atomic_iteration = true,
        -- delete data that was added a year ago
        -- default return nil
        start_key = function( task )
            return clock.time() - (365*24*60*60)
        end,
        -- walk from start_key toward older tuples
        -- ('LE' returns tuples in descending key order)
        -- default is ALL
        iterator_type = 'LE',
        -- stop the full scan once enough tuples have been deleted
        -- default return true
        process_while = function( task )
            if task.expired_tuples_count >= task.args.max_expired_tuples then
                task.expired_tuples_count = 0
                return false
            end
            return true
        end,
        -- this function should be default if no other is specified
        iterate_with = function( task )
            return task.expire_index:pairs({ task.start_key() }, { iterator = task.iterator })
                :take_while( function( tuple )
                    return task:process_while()
                end )
        end,
        args = {
            max_expired_tuples = 1000
        }
    }
)
```
Iterations over Secondary Indexes; Transaction support; Start key; Iterator type; Process while function; And the ability to write custom iterator behavior for the user. Rewritten space iteration for tree index from select to pairs. Also describe the new parameters for iterating over secondary indexes; And added some comments and removed duplicate code. Co-authored-by: Nick Volynkin <nick.volynkin@gmail.com>
Were added:
* Iterations over Secondary Indexes - name of the index to iterate on
* Transaction support - ability to process tuples from each batch in a single transaction
* Start key - start iterating from the tuple with this index value
* Iterator type - type of the iterator to use, as string or box.index constant
* Process while function - function to call before checking each tuple, if it returns false, the task will stop until next full scan
* And the ability to write custom iterator behavior for the user - `iterate_with` function which returns an iterator object which provides tuples to check

Additional changes:
* Rewritten space iteration for tree index from select to pairs since the iterator is now stable
* Added some comments and removed duplicate code (expirationd_kill_task)
* Described new parameters in the code

Co-authored-by: Nick Volynkin <nick.volynkin@gmail.com>
Description of the parameters of the new functionality. Also added an example of how to use the new features.
The vinyl and memtx tree index have the same iteration logic using pairs. This is confirmed by the stability of iterators, see tarantool/doc#2102 Needed for: #50
Added the ability to iterate over any index by specifying the index name in options. The default is the primary index. ci: installation, caching and running luatest. PHONY added to Makefile because the makefile target and the luatests folder have the same name. Needed for: #50
Added the ability to choose where the iterator starts and which iterator type it uses. The start key can be set as a function (dynamic parameter) or as a static value. The iterator type can be specified either as a box.index.* constant or by name, for example box.index.EQ or 'EQ'. Needed for: #50
For more flexibility, added the ability to supply a custom iterator that is created on the selected index (iterate_with). You can also pass a predicate that stops the full scan when required (process_while). Needed for: #50
One transaction per batch option. With task:kill, the batch in progress is finalized together with its transaction, and only after that does the fiber complete its work. Needed for: #50
Added an example of how to use the new features for the user. Description of how to run luatests. Run luatest via Makefile with tap tests Closes: #50
Remove expirationd_kill_task, duplicate of code. Comments, readme and responses to errors are presented in a more uniform form. Added additional comments for easier understanding of what is happening. Delete `...`, can't be jitted. Using outer double quotes only. Needed for: #50
Problem
We want to use secondary indexes to iterate over the space, as is done in indexpiration, but also use all expirationd features such as callback and hotreload.
Needs
Research
Iterations over Secondary Indexes
Currently, iteration is hardcoded to index zero (the primary index), of type tree or hash:
expirationd/expirationd.lua
Lines 91 to 100 in 29d1a25
The logic in indexpiration is currently as follows: a single field is used to expire tuples by time (only the time or time64 types are available):
If the time value in the field is greater than zero, we walk the index on that field:
In expirationd we can use the index specified in opts, and perhaps we also need to specify the element from which to start the iteration, like:
Taking the field?
Do we need this feature at all, or is it enough to just specify the index?
We need to think about how we will accept the field for such cases:
Accordingly, the starting element needs to be considered from an architectural point of view, once we understand how we will take the field or fields. And of course, the starting element and ascending order cannot be used with a HASH index.
Transactions
One transaction per batch.
There are no problems as long as we use one transaction per batch. We also need to make sure that if stop_iteration fires in the middle of a batch, the open transaction is still completed.
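The batching rule above, including the early-stop case, can be sketched in plain Lua. Everything here is an assumption for illustration: `process_in_batches` is a hypothetical function, and `txn.begin` / `txn.commit` are injected stand-ins for what would be `box.begin` and `box.commit` inside Tarantool.

```lua
-- Hypothetical sketch of "one transaction per batch".
local function process_in_batches(tuples, batch_size, txn, process, stop_iteration)
    local in_txn = false
    local processed = 0
    for _, tuple in ipairs(tuples) do
        if stop_iteration(tuple) then
            break
        end
        if not in_txn then
            txn.begin()
            in_txn = true
        end
        process(tuple)
        processed = processed + 1
        if processed % batch_size == 0 then
            txn.commit()
            in_txn = false
        end
    end
    -- Complete the partially filled batch, including after an early stop.
    if in_txn then
        txn.commit()
    end
    return processed
end

-- Usage with counters in place of real transactions.
local begins, commits = 0, 0
local txn = {
    begin = function() begins = begins + 1 end,
    commit = function() commits = commits + 1 end,
}
local count = process_in_batches({1, 2, 3, 4, 5, 6, 7}, 3, txn,
    function() end,                       -- process: no-op in this sketch
    function(tuple) return tuple > 5 end) -- stop at the first tuple above 5
```

The final commit after the loop is what guarantees that a stop in the middle of a batch does not leave a transaction open.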
Pairs instead of select for tree index iteration
As @olegrok noted in #52 (comment), it's better to use pairs. For example, iteration over the hash index is already done using pairs:
expirationd/expirationd.lua
Lines 104 to 116 in 29d1a25
Proposed API
simple version
can use start_key instead of start_element?
flexible versions
Mons generator
Maybe we should take the union of implementations from above, the interface will be simpler without take_while: