Skip to content
This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

Fix upper_bound bug in chain_plugin #6285

Merged
merged 7 commits into from
Nov 9, 2018
Merged

Conversation

arhag
Copy link
Contributor

@arhag arhag commented Nov 9, 2018

Change Description

Resolves #6273.

PR #6070 revealed an existing bug in the implementation of chain_plugin with regards to how it handles the upper_bound field in the get_table_rows and get_table_by_scope RPC calls.

The intended definition for this upper_bound field is similar to how it is defined in the std::upper_bound function in the C++ standard library: the first element within the specified ordered range with a value greater than the given comparison value.

Instead, the implementation in chain_plugin handled upper_bound as if the definition was the same as that of lower_bound: the first element within the specified ordered range with a value greater than or equal to the given comparison value.

The intended definition has the advantage of being consistent with the standard definition of upper_bound well known to C++ programmers and it also enables callers to request an equal_range by simply passing in the same comparison value for both the lower_bound and upper_bound fields. Attempting to do this with the old definition required knowing and calculating the value immediately succeeding the lower_bound value and passing that in as the upper_bound field (as PR #6070 did within cleos). Passing in the same comparison value for both the lower_bound and upper_bound using the old definition would result in no ranges ever returned.

The existing implementation of chain_plugin has a further bug (which this PR also fixes) in the case where upper_bound is less than lower_bound. The correct behavior in such a situation is to never return any results.

This PR fixes the bug in the case where upper_bound is less than lower_bound and it also changes the implementation of chain_plugin to handle the upper_bound field according to the intended definition rather than the existing definition. The implications of these changes to users of the chain API are described in detail under the "API Changes" section.

Furthermore, this PR adds back the additional checks in cleos for the system voteproducer approve, system voteproducer unapprove, multisig review sub-commands that were removed in #6070 and ensure the results returned from the chain API RPC are exactly what was requested. It also adds a similar check for the system bidnameinfo sub-command (similar to what was in #6068) to fix the bug described in #6067. These additional checks are necessary (even though they were not necessary in PR #6070) because this PR keeps the changes to the upper_bound value specified by cleos made in PR #6070 in which the upper_bound value is the value immediately succeeding the lower_bound value; that upper_bound workaround was kept just so that the version of cleos incorporating the changes in this PR can still behave correctly when talking to an old version of nodeos that has not incorporated the changes in this PR. (Also note that an old cleos at version v1.4.x or older would still behave correctly with the system voteproducer approve, system voteproducer unapprove, and multisig review sub-commands even when talking to a new version of nodeos that incorporates the changes in this PR. However, in such a situation, the bug described in #6067 would still of course apply when using system bidnameinfo.)

Consensus Changes

None

API Changes

Users of the get_table_rows and get_table_by_scope RPCs of the chain API are unaffected by these changes if they are not already using the upper_bound field. Existing users of the upper_bound field of those two RPCs should read on to understand the changes introduced by this PR.

get_table_rows

First some background on get_table_rows:

A particular index type (either the primary key type or one of the five secondary index types) is selected (whether implicitly or explicitly using the key_type field) in the get_table_rows call. The index type defines the type of the index key within table records to use to sort the table rows in the table for the purposes of this RPC.

A particular index ordinal is selected (whether implicitly or explicitly using the index_position field). This index ordinal must correspond to the selected index type (based on how the tables are structured in the persistent state on the blockchain), and the selected index ordinal along with the selected index type (together can be referred to as the selected index) determine how the tables rows in the table are ordered. For selected indices that are secondary indices, the sort order is based first on the index key (in ascending order) and then ties are broken by the primary key (in ascending order).

A particular table is specified by: the index type, index ordinal, and the tuple (code, scope, table). The rows in the table have an index key (of index type) which is the value used to sort the order of the table rows within the selected index.

If lower_bound is specified, let candidate_rows be the set of table rows in the specified table which have an index key greater than or equal to the value specified in lower_bound. Otherwise, let candidate_rows be the set of all table rows in the specified table.

Behavior prior to this PR:

If upper_bound is specified, let excluded_rows be the set of table rows in the specified table which have an index key greater than or equal to the value specified in upper_bound. Otherwise, let excluded_rows be the empty set.

If the value specified in upper_bound is less than the value specified in lower_bound, the behavior of get_tables_rows is undefined. Otherwise, get_table_rows returns the set of tables rows (in sorted order) that are in candidate_rows but not in excluded_rows.

Behavior as of this PR:

If upper_bound is specified, let excluded_rows be the set of table rows in the specified table which have an index key greater than the value specified in upper_bound. Otherwise, let excluded_rows be the empty set.

If the value specified in upper_bound is less than the value specified in lower_bound, get_table_rows returns no rows. Otherwise, get_table_rows returns the set of tables rows (in sorted order) that are in candidate_rows but not in excluded_rows.

Implications to existing users of upper_bound:

  • Users should not be calling get_table_rows with an upper_bound less than lower_bound. If they currently are doing so and using the returned results, they are relying on undefined behavior. This PR fixes that undefined behavior so that a logically consistent result is returned should users continue to make this inappropriate call.
  • Users of get_table_rows that call it with an upper_bound value greater than or equal to the lower_bound value may get back less rows with the new behavior than they received with the old behavior. If users adjust their call by setting upper_bound to the value immediately succeeding the value they provided originally, they will receive no less rows with the new behavior than they did before with the old behavior (they can then further filter on the client side).

get_table_by_scope

First some background on get_table_by_scope:

The tables within a given code account are sorted (for the purposes of get_table_by_scope) first by scope (in ascending order) and then ties are broken by table name (in ascending order).

If lower_bound is specified, let candidate_tables be the set of tables in the specified code account which have scope greater than or equal to the value specified in lower_bound. Otherwise, let candidate_tables be the set of all tables in the specified code account.

If the table is not specified in the RPC (or its value is 0), then let excluded_tables1 be the empty set. Otherwise, let excluded_tables1 be the set of tables in the specified code account which have a table name equal to the specified table value in the RPC.

Behavior prior to this PR:

If upper_bound is specified, let excluded_tables2 be the set of tables in the specified code account which have a scope greater than or equal to the value specified in upper_bound. Otherwise, let excluded_tables2 be the empty set.

If the value specified in upper_bound is less than the value specified in lower_bound, the behavior of get_table_by_scope is undefined. Otherwise, get_table_by_scope returns the set of tables (in sorted order) that are in candidate_tables but not in the union of excluded_tables1 and excluded_tables2.

Behavior as of this PR:

If upper_bound is specified, let excluded_tables2 be the set of tables in the specified code account which have a scope greater than the value specified in upper_bound. Otherwise, let excluded_tables2 be the empty set.

If the value specified in upper_bound is less than the value specified in lower_bound, get_table_by_scope returns no tables. Otherwise, get_table_by_scope returns the set of tables (in sorted order) that are in candidate_tables but not in the union of excluded_tables1 and excluded_tables2.

Implications to existing users of upper_bound:

  • Users should not be calling get_table_by_scope with an upper_bound less than lower_bound. If they currently are doing so and using the returned results, they are relying on undefined behavior. This PR fixes that undefined behavior so that a logically consistent result is returned should users continue to make this inappropriate call.
  • Users of get_table_by_scope that call it with an upper_bound value greater than or equal to the lower_bound value may get back less tables with the new behavior than they received with the old behavior. If users adjust their call by setting upper_bound to the value immediately succeeding the value they provided originally, they will receive no less tables with the new behavior than they did before with the old behavior (they can then further filter on the client side).

Documentation Additions

Documentation about the get_table_rows and get_table_by_scope RPCs of the chain API may need to be updated to reflect the new definition of upper_bound and the implications of that as discussed in the "API Changes" section above.

#6273

Still need to request larger range than necessary and do additional 
checks just so v1.5.0 of cleos can still work with nodeos versions older 
than v1.5.0.
@arhag arhag mentioned this pull request Nov 9, 2018
auto upper = secidx.lower_bound(boost::make_tuple(next_tid));
if( t_id != nullptr && index_t_id != nullptr ) {
using secondary_key_type = std::result_of_t<decltype(conv)(SecKeyType)>;
static_assert( std::is_same<decltype(IndexType::value_type::secondary_key), secondary_key_type>::value, "Return type of conv does not match type of secondary key for IndexType" );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't decltype(IndexType::value_type::secondary_key) == IndexType::value_type::secondary_key_type ? or was this just less characters because of the need for typename?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I guess I didn't notice the secondary_key_type typedef. Either way works.

@znebby
Copy link

znebby commented Nov 19, 2018

@arhag I am confused about this quote

Users of get_table_by_scope that call it with an upper_bound value greater than or equal to the lower_bound value may get back less tables with the new behavior than they received with the old behavior. If users adjust their call by setting upper_bound to the value immediately succeeding the value they provided originally, they will receive no less tables with the new behavior than they did before with the old behavior (they can then further filter on the client side).

Is it not the opposite? Since upper_bound is now exclusive, rather than inclusive. For example, if I was searching on a uint64 index and let's say I wanted all rows where the index = 0. Previously, I would set lower_bound to 0 and upper_bound to 1. With this new behaviour, if I understood correctly, I can keep both lower_bound and upper_bound to 0.

So, if I didn't change my code and I left lower to 0 and upper to 1, I would now receive all rows with the index value being 0 or 1. That is more rows, not less. Is there a misunderstanding somewhere?

@arhag
Copy link
Contributor Author

arhag commented Nov 19, 2018

@znebby:

Is it not the opposite? Since upper_bound is now exclusive, rather than inclusive. For example, if I was searching on a uint64 index and let's say I wanted all rows where the index = 0. Previously, I would set lower_bound to 0 and upper_bound to 1. With this new behaviour, if I understood correctly, I can keep both lower_bound and upper_bound to 0.

So, if I didn't change my code and I left lower to 0 and upper to 1, I would now receive all rows with the index value being 0 or 1. That is more rows, not less. Is there a misunderstanding somewhere?

You are understanding correctly. Perhaps my wording was just unclear.

What I was trying to say, continuing with your example, was that if your client was updated to now set both lower_bound and upper_bound to 0 in order to get all table rows with a key of 0, it would work fine if talking to the new nodeos, but would return back less results (in this case no results) if the same request was made to the old nodeos.

@znebby
Copy link

znebby commented Nov 19, 2018

Ah when talking to an old nodeos. Got it.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants