-
Notifications
You must be signed in to change notification settings - Fork 249
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: stabilize Vector and improve docs #815
Conversation
let mut v1 = Vec1::new(b"m"); | ||
v1.extend([1u8, 2, 3, 4]); | ||
// Old collections serializes length as `u64` when new serializes as `u32`. | ||
assert!(Vector::<u8>::try_from_slice(&v1.try_to_vec().unwrap()).is_err()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel like it would be nice to have a convenient function to migrate from the old Vector collections to the new store one, but might be out of place, so leaving this one up to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, yeah. It's definitely opinionated, and was thinking of doing it in a separate PR because of that, but I can do it now.
I'm thinking of just a function like:
impl Vector<T> {
fn to_v2(&self) -> crate::store::Vector { ... }
}
Reason being is that the data that is cloned is small, and most migration functions don't have owned self
for some reason (assuming because the SDK marks it as a view call, but unclear why that would matter).
@ChaoticTempest @itegulov do you have an intuition of what the API would look like?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks reasonable to me. The whole structure is literally just length + prefix, so cloning does not seem to be a big issue to me.
most migration functions don't have owned self for some reason (assuming because the SDK marks it as a view call, but unclear why that would matter).
We had a conversation about this recently. Functions that take owned self are considered to be view and they can't write state back to storage, so they wouldn't be useful for migration anyway, no?
Unless I am misunderstanding something usually a user would version their contract like this:
enum Contract {
V0(StateV0),
V1(StateV1),
}
Where StateV0
uses old collections and StateV1
uses new collections. And then they would add a migration function that migrates all fields from StateV0
using one of the functions provided by us (.to_v2()
). Right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's only if they structured their contract to be versioned from the start. If you don't, you would likely need to run a manual migration, otherwise, you wouldn't know which version of the object exists in storage if you tried to do ambiguously
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, but are we thinking of getting rid of the old collections eventually? If so, we would have to move this function elsewhere than collections::Vector and someone might forget to do so. So probably best to have it be apart of the new collections as well, like from_v1
So one other thing I'd like to be sure about is the API of the
Either we:
@matklad, if you have a second, seems like this might be a high-value but quick gut check for what might be better long-term. |
Merging now. Can change the above, if decided, before the next stable release (likely in a few weeks). Likely fine to stay as is though to keep parity with |
@@ -7,6 +7,7 @@ use std::marker::PhantomData; | |||
use borsh::{BorshDeserialize, BorshSerialize}; | |||
|
|||
use crate::collections::append_slice; | |||
use crate::store::IndexMap; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This import is unused except with the "unstable"
feature
use crate::store::IndexMap; | |
#[cfg(feature = "unstable")] | |
use crate::store::IndexMap; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CI should probably deny warnings, wdyt? #895
//! Collections and types used when interacting with storage. | ||
//! | ||
//! This module is the updated version of [`near_sdk::collections`](crate::collections) where the | ||
//! data structures are more optimized and have a closer API to [`std::collections`]. | ||
//! | ||
//! These collections are more scalable versions of [`std::collections`] when used as contract | ||
//! state because it allows values to be lazily loaded and stored based on what is actually | ||
//! interacted with. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth it to spend a bit more words on clarifying the difference between in-memory collections and such persistent collections. It's not immediately obvious why this needs needs to exist.
My take would be something like this:
Fundamentally, a contract's storage is a key/value store where both keys and values are just Vec<u8>
. If you want to store some structured data, for example, Vec<Account>
, one way to achieve that would be to serialize the Vec
to bytes and store that. This has a drawback in that accessing or modifying event a single element would require reading the whole vec from the storage.
That's where store
modules helps. It collections are backed by a key value store. For example, a store::Vector
is stored as several key-value pairs, where indices are the keys. So, accesing a single element would only load this specific element.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is that paragraph which indicates the high-level difference, but I suppose I can add more detail here.
//! This module is the updated version of [`near_sdk::collections`](crate::collections) where the | ||
//! data structures are more optimized and have a closer API to [`std::collections`]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd avoid mentioning old code in the docs for the new code, except maybe as a footnote in the end. Instead, it's better to add migration docs to the old code.
len: u32, | ||
values: IndexMap<T>, | ||
pub(crate) len: u32, | ||
pub(crate) values: IndexMap<T>, | ||
} | ||
|
||
//? Manual implementations needed only because borsh derive is leaking field types |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, wait, what's the semantics of BorshSerialize
? This only serializes the prefix, so that we can have nested collections, righth?
So, eg, if I want so return the sate of my contract as borsh for some contract call, if I just use serialize what I'd get would be the prefix (pointer), not the actual data? This is a bit surprising, worth documenting perhaps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, the main reason is to avoid loading all elements into memory when deserializing state. But yeah, it seems worth it to document that more specifically to avoid returning one of these structures from a function.
Ideally, we wouldn't even have this implementation, but the SDK codegen assumes that the state implements Borsh(De)Serialize
to store and load state from storage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit surprising, worth documenting perhaps.
On the NEAR Con, one person was struggling with exactly they had a code like
#[derive(borsh)]
struct Thing {
components: Vec<Component>
}
#[near_bindgen]
impl Contract {
pub fn call_me(&self, thing: Thing)
}
They also have trouble figuring out how to make that use borsh or json, and ingeneral got entangled in two different serialization formats.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, that is very unrelated to this PR. Perhaps worth improving docs/visibility though. @itegulov wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, during NEARCon I have also seen a person who was confused about collections' (de)serialization semantics. Trying to generate an ABI actually helps to catch these kinds of errors as there is no JsonSchema
/BorshSchema
derivation for collections. We could perhaps somehow tailor the errors to communicate the reasoning (i.e. you are not supposed to use Vec
for your public API) to the users.
@@ -61,8 +115,8 @@ pub struct Vector<T> | |||
where |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Document what's the semantics of prefix
argument of the new
. Specifically, that the user should ensure that nothing else collides with this prefix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Document why set
is needed, and when to use it over IndexMut
.
store
module and documents itLink to discussion about backwards compatibility of
Vector
https://near.zulipchat.com/#narrow/stream/308449-pagoda.2Fcore-tools/topic/collections.20backwards.20compatibility/near/282529069This will likely not be pulled in before
4.0.0
stable -- since more opinionated changes will likely be done in4.1
hopefully with all other store collection types