[smart_table] refine bucket_table to smart_table #6339

lightmark · 2023-01-26T07:34:12Z

Description

I took another look at linear hashing, spiral storage, and extendible hashing schemes, and I still think linear hashing is the best option for us. So I made a similar change to bucket_table so that it intelligently sets configuration values such as bucket_size and split_threshold.

I also added two public functions that change these two values at any time, since they do not have to stay fixed after creation. This gives users more flexibility to adjust the table to their needs on the fly.

Test Plan

cargo test

davidiw

not to be a pain, can we do this in two commits? a rename and a rewrite?

movekevin · 2023-02-03T04:56:13Z

aptos-move/move-examples/data_structures/sources/bucket_table.move

        };
-        split(&mut map, initial_buckets - 1);
+        // The default number of initial buckets is 2.


Any reason for 2 specifically?

movekevin · 2023-02-03T05:02:19Z

aptos-move/move-examples/data_structures/sources/smart_table.move

@@ -0,0 +1,358 @@
+/// A smart table implementation based on linear hashing. (https://en.wikipedia.org/wiki/Linear_hashing)


Can you add documentation on how to create and use smart_table here? Specifically, I think a developer flow would be great

What do you mean by a developer flow? I expect users to use it like a normal table, except for some configuration values that they can customize with new_with_config(), which is documented in the comment above new().
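
As an illustration of the "normal table" flow described in this reply, here is a minimal sketch. It assumes that smart_table exposes Table-style functions named new, add, contains, borrow, remove and destroy; only new(), new_with_config(), contains() and destroy() are named in this PR, so the other names and all argument orders are assumptions borrowed from aptos_std::table and may not match this revision. The module address 0x42 and the function demo are hypothetical.

```move
module 0x42::smart_table_usage_sketch {
    use aptos_std::smart_table;

    /// Hypothetical walk-through: create, fill, query, and tear down a table.
    public fun demo(): u64 {
        // new() uses the default configuration; new_with_config() would be
        // the place to customize bucket-related settings instead.
        let t = smart_table::new<u64, u64>();

        // Assumed Table-like API: add(table, key, value).
        smart_table::add(&mut t, 1, 100);
        smart_table::add(&mut t, 2, 200);

        // contains takes the key by value, as in the signature quoted in this PR.
        let found = smart_table::contains(&t, 1);
        let value = *smart_table::borrow(&t, 2);
        let removed = smart_table::remove(&mut t, 1);

        // destroy<K: drop, V: drop>() drops a table that may still hold entries.
        smart_table::destroy(t);

        assert!(found, 0);
        value + removed
    }
}
```

Under those assumptions, the only difference from a plain Table in this flow is the optional configuration step at creation time.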

aptos-move/move-examples/data_structures/sources/smart_table.move

movekevin

Can you add more test coverage? There are only 3 tests here.

movekevin · 2023-02-03T05:15:23Z

aptos-move/move-examples/data_structures/sources/smart_table.move

+        let bucket = table_with_length::borrow(&table.buckets, index);
+        let i = 0;
+        let len = vector::length(bucket);
+        while (i < len) {


Here and in other places, you should use vector::any: https://github.com/aptos-labs/aptos-core/blob/main/aptos-move/framework/move-stdlib/sources/vector.move#L221. It might also be better to do this in a separate function, since this code seems to be used in many functions here.
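
To make the suggestion concrete, here is a minimal sketch of the same bucket scan written both as a while loop and with vector::any from the Aptos move-stdlib. The Entry struct below is a simplified stand-in for the one in the PR, and the module address and function names (contains_loop, contains_any) are hypothetical.

```move
module 0x42::bucket_scan_sketch {
    use std::vector;

    /// Simplified stand-in for the PR's entry type.
    struct Entry<K, V> has copy, drop, store {
        key: K,
        value: V,
    }

    /// Manual scan, in the style of the loop under review.
    public fun contains_loop<K, V>(bucket: &vector<Entry<K, V>>, key: &K): bool {
        let i = 0;
        let len = vector::length(bucket);
        while (i < len) {
            let entry = vector::borrow(bucket, i);
            if (&entry.key == key) {
                return true
            };
            i = i + 1;
        };
        false
    }

    /// Same scan via the inline helper; the lambda is expanded at compile time.
    public fun contains_any<K, V>(bucket: &vector<Entry<K, V>>, key: &K): bool {
        vector::any(bucket, |entry| {
            let e: &Entry<K, V> = entry;
            &e.key == key
        })
    }

    #[test]
    fun both_scans_agree() {
        let bucket = vector[Entry { key: 1u64, value: 10u64 }, Entry { key: 2u64, value: 20u64 }];
        let present = 2u64;
        let absent = 3u64;
        assert!(contains_loop(&bucket, &present) && contains_any(&bucket, &present), 0);
        assert!(!contains_loop(&bucket, &absent) && !contains_any(&bucket, &absent), 1);
    }
}
```

Putting the scan in one helper, as the comment proposes, also means the loop-versus-inline decision only has to be made in one place.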

movekevin · 2023-02-03T05:16:05Z

aptos-move/move-examples/data_structures/sources/smart_table.move

+        assert!(table.level == 3, 0);
+        let i = 0;
+        while (i < 4) {
+            split_one_bucket(&mut table);


This is not a good test. It's calling a private function for splitting the buckets instead of using public functions to insert elements to trigger splitting.

I deleted this one because it is tested implicitly by the others.

movekevin · 2023-02-03T05:19:33Z

aptos-move/move-examples/data_structures/sources/smart_table.move

+module aptos_std::smart_table {
+    use std::error;
+    use std::vector;
+    use aptos_std::aptos_hash::sip_hash_from_value;


Why sip hash specifically? What are the considerations here (collision, gas cost, etc.)?

Good performance and protection against hash-flooding attacks.

lightmark · 2023-02-03T08:31:41Z

Can you add more test coverage? There are only 3 tests here.

I already deleted one and replaced it with another. There are only three, but they cover almost all the cases, since there are not as many edge cases as for smart vectors. What else do you have in mind?

davidiw

I want to approve this, but we don't document the algorithm for this bucket at all. It seems like we rotate around which bucket we split. I know you didn't really change anything with the underlying algorithm, but maybe we can do better in this PR to make it clearer?
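
For readers who want the rotation spelled out, here is a minimal sketch of textbook linear hashing. It is not taken from the PR's code: the module address and the function names next_split_index and after_split are hypothetical, and the sketch assumes the invariant 2^level <= num_buckets < 2^(level + 1), with level and num_buckets playing the roles of the struct fields of the same names.

```move
module 0x42::linear_hashing_sketch {
    /// Bucket that `hash` maps to when `num_buckets` buckets exist.
    public fun bucket_index(level: u8, num_buckets: u64, hash: u64): u64 {
        // Try the finer-grained address (level + 1 low bits) first.
        let index = hash % (1 << (level + 1));
        if (index < num_buckets) {
            // That bucket already exists, i.e. its origin was split in this round.
            index
        } else {
            // Not created yet: fall back to the coarser address (level low bits).
            index % (1 << level)
        }
    }

    /// Bucket that the next split will divide. It advances by one per split
    /// and wraps to 0 when a round completes, which is the rotation.
    public fun next_split_index(level: u8, num_buckets: u64): u64 {
        num_buckets - (1 << level)
    }

    /// (level, num_buckets) after one split appends bucket `num_buckets`.
    public fun after_split(level: u8, num_buckets: u64): (u8, u64) {
        let to_split = next_split_index(level, num_buckets);
        // Once every bucket of the current round has been split, the table
        // has doubled and the level grows by one.
        let new_level = if (to_split + 1 == (1 << level)) { level + 1 } else { level };
        (new_level, num_buckets + 1)
    }

    #[test]
    fun split_pointer_rotates() {
        // Two buckets at level 1: bucket 0 is split first.
        assert!(next_split_index(1, 2) == 0, 0);
        let (level, n) = after_split(1, 2);
        assert!(level == 1 && n == 3, 1);
        // Bucket 1 is next; hash 3 still lands in it because bucket 3 does not exist yet.
        assert!(next_split_index(level, n) == 1, 2);
        assert!(bucket_index(level, n, 3) == 1, 3);
        let (level, n) = after_split(level, n);
        // The round is complete: level increases and the pointer wraps to 0.
        assert!(level == 2 && n == 4, 4);
        assert!(next_split_index(level, n) == 0, 5);
    }
}
```

The point of the scheme is that a split touches exactly one bucket, chosen by position rather than by which bucket is fullest, so the cost of growing the table is spread evenly across insertions.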

movekevin · 2023-02-10T15:28:18Z

aptos-move/move-examples/data_structures/sources/smart_table.move

+        let bucket = table_with_length::borrow_mut(&mut table.buckets, index);
+        let i = 0;
+        let len = vector::length(bucket);
+        while (i < len) {


You can use vector::for_each_ref here. Can you check all the while loops and see whether you can replace them with inline functions?

movekevin

A few high-level comments:

Why not move smart_table in the framework? We want it there so that many new modules can start relying on it (e.g. multisig account)
Can we consider adding iterable functionalities (as inline functions) such as map, for_each, etc.? These are expensive operations but there are definitely use cases for them and they'd be very useful especially when the table is small. We can easily support this with the buckets.
Can you consider whether we can add better support for reading data from smart table via API/CLI/SDK?

gerben-stavenga · 2023-02-10T18:07:04Z

aptos-move/move-examples/data_structures/sources/smart_table.move

+        let len = vector::length(bucket);
+        while (i < len) {
+            let entry = vector::borrow(bucket, i);
+            if (&entry.key == &key) {


Each entry stores its hash, and you already know the hash of the key at this point, since it's computed on line 225. So why not check hash == entry.hash first? That seems cheaper than comparing keys, because you expect about n/2 key comparisons per lookup, where n is the number of items in a bucket.

Line 225 is not the hash, but I will separate it out.
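
A minimal sketch of the hash-first comparison proposed above, assuming each entry stores a u64 hash next to its key. The Entry layout, the module address, and the function name find are hypothetical; how the PR finally factors this out is not shown in this thread.

```move
module 0x42::hash_first_sketch {
    use std::vector;

    /// Simplified stand-in for an entry that caches the hash of its key.
    struct Entry<K, V> has copy, drop, store {
        hash: u64,
        key: K,
        value: V,
    }

    /// Returns (true, position) if `key` is in `bucket`, otherwise (false, 0).
    public fun find<K, V>(bucket: &vector<Entry<K, V>>, hash: u64, key: &K): (bool, u64) {
        let i = 0;
        let len = vector::length(bucket);
        while (i < len) {
            let entry = vector::borrow(bucket, i);
            // The u64 comparison runs first; && short-circuits, so the
            // potentially expensive key comparison only runs on a hash match.
            if (entry.hash == hash && &entry.key == key) {
                return (true, i)
            };
            i = i + 1;
        };
        (false, 0)
    }

    #[test]
    fun finds_only_matching_key() {
        // Two entries that share a hash value but have different keys.
        let bucket = vector[Entry { hash: 7, key: 1u64, value: 10u64 }, Entry { hash: 7, key: 2u64, value: 20u64 }];
        let key = 2u64;
        let (found, pos) = find(&bucket, 7, &key);
        assert!(found && pos == 1, 0);
        let missing = 3u64;
        let (found, _pos) = find(&bucket, 7, &missing);
        assert!(!found, 1);
    }
}
```

The key comparison is still required after a hash match, because two different keys can share a hash.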

gerben-stavenga · 2023-02-10T18:10:47Z

aptos-move/move-examples/data_structures/sources/smart_table.move

+    }
+
+    /// Returns true iff `table` contains an entry for `key`.
+    public fun contains<K: drop, V>(table: &SmartTable<K, V>, key: K): bool {


What I'm missing here is a good way to do a lookup that returns the entry if it's found, or false if not. A lot of the time you'd like to do things like "look up the key if it exists, otherwise insert a new value". With this API you seem to end up looking up keys redundantly. I think something like a signature of

Pseudo-code:
Lookup(table: &SmartTable, key: K): Optional<&V>

would be most useful

I don't disagree with you. But here we're trying to keep the API consistent with Table so people can use them interchangeably. Otherwise I would do what you suggested.
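
To illustrate the repeated lookups being discussed, here is a minimal sketch of the "insert if absent, then update" pattern written against aptos_std::table, whose API this reply says smart_table mirrors. The module address and the function name increment are hypothetical.

```move
module 0x42::upsert_sketch {
    use aptos_std::table::{Self, Table};

    /// Adds 1 to the counter stored under `key`, creating it at 0 if absent.
    public fun increment(counters: &mut Table<u64, u64>, key: u64) {
        // First access: check whether the key exists.
        if (!table::contains(counters, key)) {
            // Second access: insert the default value.
            table::add(counters, key, 0);
        };
        // Third access: fetch the value that is now known to exist.
        let count = table::borrow_mut(counters, key);
        *count = *count + 1;
    }
}
```

With a Table-compatible API the key is resolved up to three times per call, which is the cost the reviewer is pointing at; a lookup that returns the value or an absence marker in one step would avoid it at the price of diverging from Table.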

lightmark · 2023-02-13T23:53:53Z

A few high-level comments:

Why not move smart_table in the framework? We want it there so that many new modules can start relying on it (e.g. multisig account)

Can we consider adding iterable functionalities (as inline functions) such as map, for_each, etc.? These are expensive operations but there are definitely use cases for them and they'd be very useful especially when the table is small. We can easily support this with the buckets.

Can you consider whether we can add better support for reading data from smart table via API/CLI/SDK?

Would this be another PR after the AIP is approved?
Sure.
That may need a lot of support from the SDK and CLI.

lightmark · 2023-02-14T18:47:21Z

aptos-move/move-examples/data_structures/sources/smart_table.move

+        #[test]
+        fun test_any() {
+            let t = make();
+            let r = any(&t, |_k, v| *v >= 99);


This line triggers:

thread 'test_data_structures' has overflowed its stack
fatal runtime error: stack overflow

Do you have any idea what's wrong with the implementation of any?
@movekevin

test_map_ref, test_any and test_all do not work for the same reason.

You must use a compiler/CLI built with --release. It's not related to inline functions; they are just hitting existing problems in the functional programming design of the Move compiler.

I deleted test_all for now because it always triggers a bug.

movekevin

Can you run gas benchmarking for smart table + vector?

movekevin · 2023-02-15T15:15:30Z

aptos-move/move-examples/data_structures/sources/smart_table.move

+        buckets: TableWithLength<u64, vector<Entry<K, V>>>,
+        num_buckets: u64,
+        // number of bits to represent num_buckets
+        level: u8,


Just double checking - this doesn't allow more than 256 (2^8) buckets. What's the rationale behind this?

Actually, level is an exponent: the limit is 2^level buckets with level <= 255, since level is a u8.

github-actions · 2023-02-21T22:12:22Z

✅ Forge suite `land_blocking` success on `4bae64cd32f71fb3018f275809f33ca57689ad9e`

performance benchmark with full nodes : 6191 TPS, 6415 ms latency, 11600 ms p99 latency,(!) expired 100 out of 2643700 txns
Test Ok

github-actions · 2023-02-21T22:14:41Z

✅ Forge suite `compat` success on `testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b` ==> `4bae64cd32f71fb3018f275809f33ca57689ad9e`

Compatibility test results for testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b ==> 4bae64cd32f71fb3018f275809f33ca57689ad9e (PR)
1. Check liveness of validators at old version: testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b
compatibility::simple-validator-upgrade::liveness-check : 7771 TPS, 4904 ms latency, 6400 ms p99 latency,no expired txns
2. Upgrading first Validator to new version: 4bae64cd32f71fb3018f275809f33ca57689ad9e
compatibility::simple-validator-upgrade::single-validator-upgrade : 4945 TPS, 8028 ms latency, 10400 ms p99 latency,no expired txns
3. Upgrading rest of first batch to new version: 4bae64cd32f71fb3018f275809f33ca57689ad9e
compatibility::simple-validator-upgrade::half-validator-upgrade : 4659 TPS, 8599 ms latency, 10900 ms p99 latency,no expired txns
4. upgrading second batch to new version: 4bae64cd32f71fb3018f275809f33ca57689ad9e
compatibility::simple-validator-upgrade::rest-validator-upgrade : 7128 TPS, 5400 ms latency, 8500 ms p99 latency,no expired txns
5. check swarm health
Compatibility test for testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b ==> 4bae64cd32f71fb3018f275809f33ca57689ad9e passed
Test Ok

lightmark added the enhancement label (New feature or request) Jan 26, 2023

lightmark requested review from davidiw, areshand, CapCap, zekun000 and movekevin January 26, 2023 07:34

lightmark force-pushed the smart_table branch 2 times, most recently from bf416f1 to 73b600c Compare January 26, 2023 08:20

lightmark mentioned this pull request Jan 28, 2023

[framework] remove unnecessary abilities #6138

Merged

lightmark force-pushed the smart_table branch from 73b600c to e8885cf Compare January 28, 2023 17:24

davidiw reviewed Jan 29, 2023

lightmark force-pushed the smart_table branch 2 times, most recently from 1784912 to 76a5044 Compare February 1, 2023 09:28

lightmark added 2 commits February 1, 2023 18:39

[smart_table] refine bucket_table to smart_table

c9e1a15

Rename bucket_table to smart_table

4c90298

lightmark force-pushed the smart_table branch from 76a5044 to 4c90298 Compare February 1, 2023 09:47

movekevin reviewed Feb 3, 2023

movekevin reviewed Feb 3, 2023

lightmark force-pushed the smart_table branch from 93a7dbe to 441fac7 Compare February 3, 2023 08:04

davidiw reviewed Feb 6, 2023

gerben-stavenga reviewed Feb 10, 2023

movekevin reviewed Feb 10, 2023

gerben-stavenga reviewed Feb 10, 2023

lightmark force-pushed the smart_table branch 2 times, most recently from f475ce8 to 7abaf00 Compare February 14, 2023 18:46

lightmark commented Feb 14, 2023

lightmark force-pushed the smart_table branch 2 times, most recently from d4a4dc9 to fe85039 Compare February 14, 2023 22:34

movekevin approved these changes Feb 15, 2023

[smart_table] add destroy<K: drop, V: drop>()

4bae64c

lightmark force-pushed the smart_table branch from fe85039 to 4bae64c Compare February 15, 2023 18:54

lightmark enabled auto-merge (squash) February 21, 2023 21:23

davidiw approved these changes Feb 27, 2023

lightmark merged commit 0bae0ad into main Feb 27, 2023

lightmark deleted the smart_table branch February 27, 2023 16:14

lightmark mentioned this pull request Mar 8, 2023

[AIP-18][Discussion] Introducing SmartVector and SmartTable to aptos_std aptos-foundation/AIPs#82

Closed
