Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce an ERC-1155 _exists() function #2185

Closed
wants to merge 2 commits into from

Conversation

KaiRo-at
Copy link
Contributor

This PR is based on a discussion started at #2003 (comment) and should give the ERC-1155 implementation more compatibility to OpenZeppelin's ERC-721 implementation and also better compatibility to services like OpenSea as pointed to by https://twitter.com/xanderatallah/status/1232124941425881089

The main thing introduced here is an _exists() function that is checked in a few places where we'd otherwise return a null-ish default value (0 or ''), and to know if a token ID is valid, it has to be registered internally first - either via minting a first token on that ID, or by calling an explicit function.

Tests will fail on this PR for the moment as it bases on both #2130 and #2029 - also, I have not written any tests for this PR itself yet as it's mostly a suggestion on how we could do this in a good way.

@KaiRo-at KaiRo-at changed the title Erc1155 exists Introduce an ERC-1155 _exists() function Apr 14, 2020
@KaiRo-at
Copy link
Contributor Author

One thing I somewhat wonder about is if requiring this explicit registration would cause any issues for people who want to use the ERC-1155 implementation from OpenZeppelin.
FWIW, for the project I work on right now, I will use this patch as it should work really well with it.

@christopheradams
Copy link

I have some reservations about how this matches up with the ERC-1155 spec.

The section on Enumerating from events says that "In order to keep storage requirements light for contracts implementing ERC-1155, enumeration (discovering the IDs and values of tokens) must be done using event logs." Baking the token registration storage mapping into the default contact would seem to violate this principle.

Also, the Metadata Extensions reads "The uri function MUST NOT be used to check for the existence of a token as it is possible for an implementation to return a valid string even if the token does not exist." I'm afraid clients might intentionally or unintentionally treat the presence or absence of an error generated by uri() as a de facto check. Reading the spec, I'm not sure I would expect any of the view functions to ever return errors.

@KaiRo-at
Copy link
Contributor Author

Hrm, that sounds like it's an unusable standard for most usages then as you only get reliable event logs with an archive node which is really expensive to set up and operate, and services like OpenSea need a good way to check which tokens are valid. Our implementation of the project in use right now will use the patch in this PR and then I probably will advocate for people not using ERC-1155 because of its multiple shortcomings, including those two quotes from the spec.

@christopheradams
Copy link

Just for the record, I believe there are valid use cases for an _exists() function and a mapping to keep track of it. For example, you need this if you want to guarantee that a token can only be minted once. I'm in favor of seeing a good implementation of this, but I don't think it should be the default (requiring contract storage it may not need), unless the spec changes.

@nventuro
Copy link
Contributor

nventuro commented May 8, 2020

Thanks for thos comments @christopheradams! Here's my interpretation of the spec:

The uri function MUST NOT be used to check for the existence of a token as it is possible for an implementation to return a valid string even if the token does not exist.

I think this means callers of uri() should not use it to check existence: that is, the fact that uri(id) returns with no errors doesn't necessarily mean that id exists. Nothing prevents us from adding a check ourselves if we add a notion of a token 'existing'.

In order to keep storage requirements light for contracts implementing ERC-1155, enumeration (discovering the IDs and values of tokens) must be done using event logs.

Again, the specification doesn't forbid implementations from having storage-based enumeration (which would be silly) - rather, it says that enumerating using events only should be possible. I share @KaiRo-at's concerns when it comes to building state from events: both access to archival nodes and building state-aggregating software are not trivial tasks.

I believe on-chain enumeration (sensibly applied) is a good practice, which is why we started introducing concepts such as EnumerableMap to the library, and use it for AccessControl. In that sense, we may even want to go as far as implementing this using EnumerableSet, which would let users not only check for existence but also enumerate all tokens in the system.

Copy link
Contributor

@nventuro nventuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once #2029 is merged, we should rebase this PR to keep the diff to a minimum.

* @param id uint256 ID of the token to register
*/
function _registerToken(uint256 id) internal virtual {
_tokenExists[id] = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With EIP2200, an SSTORE here will cost the same as an SLOAD if _tokenExists[id] is already set to true, so this is already optimal in that sense.

We may want to consider using a 32-byte word instead of a boolean however: see the comments here (tl;dr: storing booleans is more costly than full-words because of extra checks placed by the compiler).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, from the 1155 EIP:

To broadcast the existence of a token ID with no initial balance, the contract SHOULD emit the TransferSingle event from 0x0 to 0x0, with the token creator as _operator, and a _value of 0.

What do you think about emitting such an event here? We should only do it the first time though, so we'd want to introduce an if statement anyway, even if not required for efficiency reasons.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that is an interesting piece from the spec, I missed that one! I think it definitely would be a good idea to emit that one.
Also, I didn't even think of the double-registering case, thanks for your thoughts on that.
So, should I change _tokenExists to a uint256 (using 0 and 1 for false and true) instead? I find it somewhat sad that a bool can end up being more expensive than a 32-byte type...

Copy link
Contributor

@nventuro nventuro May 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm also not thrilled by that - we run into the same situation on this other PR.

What do you think about using EnumerableSet to store this data, instead of a plain mapping and having to deal with uint vs bool? As mentioned in the other comments, being able to enumerate this data is both important and hard to do via events. It'd mimic 721's tokenByIndex.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, an EnumerableSet mostly makes sense for the reverse indexing, which from what I see is completely unneded here and only costs additional gas. Is there any reason for having this really enumerable? Also, it would not change anything wrt bool vs. uint as EnumerableSet only works on bytes32 values, potentially in the variant of UintSet where they are cast to uint256 anyhow.

@AC0DEM0NK3Y
Copy link

#2185 (comment)

Hi. On the above comment, forgive me if I'm wrong it's been a while since I looked at a node but from what I recall you do not need a archive node to get full history of events you need an archive node if you need full history of the state.
You would need to wait for the events history to be synced also down to the right block on a normal/fast node to get full history for a contract (there are framework functions to query this) but this is way way less time and space than a full/archive node.

So, pulling all balance or uri events is fine from a normal node. Indeed this should be the case if you want to be able to track balances this way too which is way faster than having to hit the node to query.

IIRC the desire to not have something enforced on that front was not to put a gas burden on implementations as it can be inferred from a balance transfer event (either with a "create" style one or an actual transfer) and implementations such as TheSandbox and SkyWeaver did not want to pay this cost.

@nventuro
Copy link
Contributor

nventuro commented May 8, 2020

Thanks for that comment @AC0DEM0NK3Y, it seems I was mistaken when it comes to node types - I believed logs were pruned after some time on regular nodes. Indeed, Infura seems to store all events on a traditional database for fast access via the eth_getLogs RPC call.

However, there seems to be many iniciatives to drop this expectation of nodes to keep logs forever. Logs are meant to be used for applications to react to activity on the blockchain, not as a form of inexpensive read-only off-chain storage. See this great gist from one of Geth lead devs on that topic.

All in all, I think this topic will become an issue in the near future and we should prepare for that. Additionally, my point about state-aggregation still stands.

@AC0DEM0NK3Y
Copy link

AC0DEM0NK3Y commented May 8, 2020

Yeah I could see the desire to prune, although I would hope they would keep a mode on that at least allows you to say "sync the history for X,Y and Z contract from block number" it's something I put as a suggestion on parity features a while ago... something like that would also be desirable for state.

Personally I think what most serious implentations would do is sync your node event history down once, fill a db with events up to head and then just feed the db as events come in to give balance and uri updates (and flag existance).
Hitting the node for every request seems like it will get quite unwieldy.

@KaiRo-at
Copy link
Contributor Author

KaiRo-at commented May 8, 2020

@nventuro Ah, thanks for the explanations on the spec, I let myself be misled about the interpretation as well after the comment here but re-reading the statements, i agree with you.
On the event log issue, I has some expectations that event logs may be available on fast-pruning nodes but learned the hard way that they are not - while they are not pruned at least by current Parity/OpenEthereum implementations, you only get the logs for any blocks the node has processed itself, so none from before it loaded a snapshot, and none if it may lag and jump to another snapshot (which is rare but can happen during sync or if it was offline for some reason. Even worse, you just get empty responses and no errors if you query for blocks it missed logs for. So we learned you need an archive node to be sure you have a complete event log. Of course, services like Infura or Etherscan probably operate archive nodes and also may copy the data they need into other databases they can query more efficiently.
I'll look into the comments on the code in the next days, I had an intense day and want to be able to concentrate fully when looking at those.

@AC0DEM0NK3Y
Copy link

AC0DEM0NK3Y commented May 8, 2020

You have to wait until it has synced the events history to be sure. Being "fully synced" to head block after catchup from a snapshot is not actually fully synced, you are only guaranteed to have events data from snapshot block at that point after which it will switch to history pull along with keeping up with head.
Both parity and geth aren't very good at signalling this, iirc parity had a single number that you'll see increasing in the log that shows where its currently up to.

To be sure you either have to eye/parse out from the log or you can call a function on the node to query where it is up to. Web3 last time I looked at it didn't expose this well either (but there was an easy workaround to expose the function via an extension if memory serves) but it was a simple call with ethers iirc.

@stale
Copy link

stale bot commented May 29, 2020

Hi all!
This Pull Request has not had any recent activity, is it still relevant? If so, what is blocking it? Is there anything we can do to help move it forward?
Thanks!

@stale stale bot added stale and removed stale labels May 29, 2020
@nventuro
Copy link
Contributor

nventuro commented Jun 3, 2020

We intend to release v3.1 of Contracts soon, with initial support for ERC1155. Given that _exists() is not part of the standard, and we don't yet have consensus on some of the low level details, we'll leave this improvement for v3.2.

@stale
Copy link

stale bot commented Jun 18, 2020

Hi all!
This Pull Request has not had any recent activity, is it still relevant? If so, what is blocking it? Is there anything we can do to help move it forward?
Thanks!

@stale stale bot added the stale label Jun 18, 2020
@KaiRo-at
Copy link
Contributor Author

Mr. Bot, this is still relevant, I'll look into it after the releases of OpenZeppelin 3.1 and Crypto stamp 2.

@nventuro nventuro removed the stale label Jun 19, 2020
@stale
Copy link

stale bot commented Jul 4, 2020

Hi all!
This Pull Request has not had any recent activity, is it still relevant? If so, what is blocking it? Is there anything we can do to help move it forward?
Thanks!

@stale stale bot added the stale label Jul 4, 2020
@KaiRo-at
Copy link
Contributor Author

KaiRo-at commented Jul 5, 2020

I'll come back to this soon, my dear bot.

@stale stale bot removed the stale label Jul 5, 2020
@frangio
Copy link
Contributor

frangio commented Jul 8, 2020

There seem to be multiple things being discussed here and I think we need to define more clearly what we want and why.

  1. Adding an exists or _exists function.
  2. Whether the view functions should revert unless a token exists.
  3. Whether there should be enumerability.

And I'd add whether any of this should be enabled by default in our ERC1155.

Another option would be to make sure exists can be implemented by deriving ERC1155 (e.g. using hooks) and that it does not require forking.


Based on the tweet by OpenSea, I think adding an exists function makes sense. I wonder how much gas overhead it would introduce, but it sounds like most of it would only be incurred during minting, which I think would be acceptable.

Based on the concerns expressed by the EIP document and in the comments in this issue, I'd say the view functions should not revert.

I would oppose enumerability by default, but I would consider it as an optional extension. This whole discussion about enumerability on-chain versus doing it based on events is something that we are often unsure about, though, and I think we need to have a deeper discussion about it.

@nventuro
Copy link
Contributor

nventuro commented Jul 9, 2020

Based on the concerns expressed by the EIP document and in the comments in this issue, I'd say the view functions should not revert.

Which functions are these? balanceOf and isApprovedForAll?

@frangio
Copy link
Contributor

frangio commented Jul 9, 2020

And I think also url().

@stale
Copy link

stale bot commented Jul 25, 2020

Hi all!
This Pull Request has not had any recent activity, is it still relevant? If so, what is blocking it? Is there anything we can do to help move it forward?
Thanks!

@stale stale bot added the stale label Jul 25, 2020
@KaiRo-at
Copy link
Contributor Author

I'm actually starting to look into this again now and hope to have a new PR this week.

@stale stale bot removed the stale label Jul 29, 2020
KaiRo-at added 2 commits July 29, 2020 16:55
…ntation and satisfy https://twitter.com/xanderatallah/status/1232124941425881089

Use an explicit registration internally to mark token IDs as existing

per comments from nventuro, add an event to flag the existence of this token ID
@KaiRo-at
Copy link
Contributor Author

OK, I rebased to current master and updated the work somewhat, no tests yet, but would be great to know if that is the right direction to go now.

@KaiRo-at KaiRo-at changed the base branch from feature-erc1155 to master July 29, 2020 15:03
@stale
Copy link

stale bot commented Aug 14, 2020

Hi all!
This Pull Request has not had any recent activity, is it still relevant? If so, what is blocking it? Is there anything we can do to help move it forward?
Thanks!

@stale stale bot added the stale label Aug 14, 2020
@KaiRo-at
Copy link
Contributor Author

I'm waiting for feedback on the current approach and for time to work on tests.

@stale stale bot removed the stale label Aug 14, 2020
Copy link
Contributor

@frangio frangio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KaiRo-at It's looking pretty good I think. I like the direction, it's a very simple addition.

@@ -31,6 +31,9 @@ contract ERC1155 is Context, ERC165, IERC1155, IERC1155MetadataURI {
// Used as the URI for all token types by relying on ID substition, e.g. https://token-cdn-domain/{id}.json
string private _uri;

// Mapping token ID to that token being registered as existing (1 for existing, 0 for not existing)
mapping (uint256 => uint256) private _tokenExists;
Copy link
Contributor

@frangio frangio Aug 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this uint256 => uint256 instead of uint256 => bool?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's because of #2185 (comment) - do you think that's a bad idea?

@frangio
Copy link
Contributor

frangio commented Aug 19, 2020

What do you think about making exists public?


There is a potential issue related to upgradeable ERC1155 contracts. If contract is upgraded from the current ERC1155 version to the one in this PR, exists will return incorrect values because the mapping will not be populated with the already existing contracts. What do you think about that? Is there anything that could be done, other than exposing an external function to allow already existing tokens to be registered?

@stale

This comment has been minimized.

@stale stale bot added the stale label Sep 4, 2020
@stale stale bot removed the stale label Sep 16, 2020
@KaiRo-at
Copy link
Contributor Author

What do you think about making exists public?

I mainly followed the pattern we also have in the ERC721 OpenZeppelin contract, leaving it to implementers to expose it publicly. But I'm open to making it public right away.

There is a potential issue related to upgradeable ERC1155 contracts. If contract is upgraded from the current ERC1155 version to the one in this PR, exists will return incorrect values because the mapping will not be populated with the already existing contracts. What do you think about that? Is there anything that could be done, other than exposing an external function to allow already existing tokens to be registered?

I haven't thought about upgradeable contracts before as I usually don't do them, but yes, in this case, I think, yes, they would need to call _registerToken(uint256 id) with the IDs that have been created before.

@stale

This comment has been minimized.

@stale stale bot added the stale label Oct 2, 2020
@stale
Copy link

stale bot commented Oct 17, 2020

Hi folks!
This Pull Request is being closed as there was no response to the previous prompt. However, please leave a comment whenever you're ready to resume, so it can be reopened.
Thanks again!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants