CIP-0068 | Allow [* bounded_bytes] for images #809

Merged
merged 6 commits into from
Jun 2, 2024

cardano-dev4

In the Cardano node, Plutus Data bytes are limited to 64 bytes. There is an issue where IPFS V2 uris are 66 bytes so they require the use of an NFT.

This is already the case for CIP-25 NFTs so it should also be the case for CIP-68.

This PR adds / [ * bounded_bytes ] to the uri pattern.
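
To make the size problem concrete, here is a minimal Python sketch that splits a UTF-8 URI into pieces of at most 64 bytes, which is the shape that [ * bounded_bytes ] permits. The helper name `chunk_uri` is hypothetical and not part of any Cardano library; the sample URI is the 66-byte IPFS v2 URI from the SpaceBudz datum quoted later in this thread.

```python
from typing import List

MAX_CHUNK = 64  # per-bytestring limit discussed in this PR


def chunk_uri(uri: str, size: int = MAX_CHUNK) -> List[bytes]:
    """Split a URI into UTF-8 byte chunks of at most `size` bytes (hypothetical helper)."""
    raw = uri.encode("utf-8")
    return [raw[i:i + size] for i in range(0, len(raw), size)]


# 66-byte IPFS v2 URI taken from the SpaceBudz datum in this thread.
uri = "ipfs://bafkreibbfk5rr4oymovb3a3noajvzvs5ykbvtja776njf74eqarlw3edy4"
chunks = chunk_uri(uri)
print([len(c) for c in chunks])
```

Joining the chunks back together with `b"".join(chunks)` recovers the original URI bytes, which is what a consumer of the metadata would be expected to do.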

Collaborator

@Crypto2099 Crypto2099 left a comment

This seems like a pretty straightforward and simple change to the standard. Given that it was always possible to pass an array value for a URI in CIP-25, I assume that implementers of CIP-68 have been implicitly supporting this functionality even though it was not explicitly defined in the CDDL.

CIP-0068/README.md
@mmahut
Contributor

mmahut commented May 6, 2024

As this is a breaking change to the standard, please make sure to mark it as a new version (or a different CIP).

@@ -191,7 +191,7 @@ metadata =
; A valid Uniform Resource Identifier (URI) as a UTF-8 encoded bytestring.
; The URI scheme must be one of `https` (HTTP), `ipfs` (IPFS), `ar` (Arweave) or `data` (on-chain).
; Data URLs (on-chain data) must comply to RFC2397.
-uri = bounded_bytes ; UTF-8
+uri = bounded_bytes / [ * bounded_bytes ] ; UTF-8
Contributor

Why not use bytes(*) instead of array(*) in the first place?

@slowbackspace slowbackspace May 6, 2024

To add to @mmahut's question:

Lucid does encode long strings as bytes(*) (using Data.to(new Constr(0, [Data.fromJson(metadata)...).

// excerpt from cbor.me of an image uri encoded as bytes(*) 5F tag
  5F                             # bytes(*)
            58 40                       # bytes(64)
               697066733A2F2F516D5543584D546376754A7077484633674142527236396365515232754547324673696B3943795768384D556F51766572796C6F6E67757269 # "ipfs://QmUCXMTcvuJpwHF3gABRr69ceQR2uEG2Fsik9CyWh8MUoQverylonguri"
            58 19                       # bytes(25)
               646566696E6974656C796D6F72657468616E36346279746573 # "definitelymorethan64bytes"
            FF                          # primitive(*)

Does the change proposed in this PR mean that array(*) should be used instead?

// excerpt from cbor.me of an image uri encoded as array(*) 9F tag
 9F                             # array(*)
            58 40                       # bytes(64)
               68747470733A2F2F6173736574317870743436647A336E39377979336C3937633035666A6C73727976357267677274306E6361372E6170652E6E667463646E2E # "https://asset1xpt46dz3n97yy3l97c05fjlsryv5rggrt0nca7.ape.nftcdn."
            58 40                       # bytes(64)
               696F2F696D6167653F73697A653D32353626746B3D506E307935746651766165654C6A4277764C646A506158684A322D445F735352664E4A6856386854336C6F # "io/image?size=256&tk=Pn0y5tfQvaeeLjBwvLdjPaXhJ2-D_sSRfNJhV8hT3lo"
            FF                          # primitive(*)
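
The two dumps differ only in the opening byte. As a rough illustration (hand-rolled, not Lucid's or any other library's API; all function names are hypothetical), the following Python sketch builds both encodings from the same list of chunks: 5F ... FF for an indefinite-length byte string and 9F ... FF for an indefinite-length array of byte strings.

```python
from typing import List


def cbor_head(major: int, n: int) -> bytes:
    """Encode a CBOR initial byte plus argument for lengths below 65536."""
    if n < 24:
        return bytes([(major << 5) | n])
    if n < 0x100:
        return bytes([(major << 5) | 24, n])
    if n < 0x10000:
        return bytes([(major << 5) | 25]) + n.to_bytes(2, "big")
    raise ValueError("length too large for this sketch")


def definite_bytes(chunk: bytes) -> bytes:
    """Definite-length byte string (major type 2)."""
    return cbor_head(2, len(chunk)) + chunk


def indefinite_bytes(chunks: List[bytes]) -> bytes:
    """bytes(*): 0x5F, the definite-length chunks, then the 0xFF break."""
    return b"\x5f" + b"".join(definite_bytes(c) for c in chunks) + b"\xff"


def indefinite_array(chunks: List[bytes]) -> bytes:
    """array(*): 0x9F, one byte string per element, then the 0xFF break."""
    return b"\x9f" + b"".join(definite_bytes(c) for c in chunks) + b"\xff"


chunks = [b"a" * 64, b"bc"]
print(indefinite_bytes(chunks).hex())
print(indefinite_array(chunks).hex())
```

Apart from the first byte the two outputs are identical, so a decoder has to look at the major type to know whether it received one logical byte string or a list of them.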

Author

Both bytes(*) and array(*) should be fine. In both cases the value is split into 64-byte chunks.

Author

A few people, including myself, assumed this was already part of the standard, since it works for CIP-25 as well and many dapps already support the array format.

I am not good at reading CDDL. Does [ * bounded_bytes ] allow for both indefinite byte string (5F) and indefinite array (9F) tags?

Since the reference implementation uses Lucid, which implements this with an indefinite byte string (5F), shouldn't that be the preferred way?

Collaborator

CDDL does not really have a concept of a "chunked, indefinite length string".

Essentially the expected behavior for a decoder would be to convert from an array into a concatenated string whether an indefinite byte string or an array of max-length 64 byte strings...

As Nick mentioned, this is how it has "always worked" since the early days of CIP-25 and my presumption is that most decoders are already accounting for this possible scenario. Adding it as an option to the spec I don't think warrants a version bump as I'm fairly confident we could find multiple instances of this already being used and supported even though not officially documented.
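
A decoder following this description might normalise both shapes into one string. The sketch below assumes a generic CBOR library has already decoded the field, so that an indefinite-length byte string arrives as a single `bytes` value and an array arrives as a `list` of `bytes`; `uri_from_decoded` is a hypothetical helper name, not an existing API.

```python
from typing import List, Union


def uri_from_decoded(value: Union[bytes, List[bytes]]) -> str:
    """Return the URI text whether the field was one byte string or an array of chunks."""
    if isinstance(value, (bytes, bytearray)):
        raw = bytes(value)
    elif isinstance(value, list) and all(isinstance(c, (bytes, bytearray)) for c in value):
        raw = b"".join(value)
    else:
        raise TypeError("expected bytes or a list of bytes")
    return raw.decode("utf-8")


print(uri_from_decoded(b"https://example.com/a.png"))
print(uri_from_decoded([b"https://example.com/", b"a.png"]))
```

Both calls yield the same string, which is the behaviour described above; tooling that only accepts the first shape would reject the second, which is the compatibility concern raised in this review.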

@slowbackspace slowbackspace May 7, 2024

> CDDL does not really have a concept of a "chunked, indefinite length string".

Ah, interesting and rather disappointing. Thanks for the insight.

> Essentially the expected behavior for a decoder would be to convert from an array into a concatenated string whether an indefinite byte string or an array of max-length 64 byte strings...
>
> As Nick mentioned, this is how it has "always worked" since the early days of CIP-25 and my presumption is that most decoders are already accounting for this possible scenario. Adding it as an option to the spec I don't think warrants a version bump as I'm fairly confident we could find multiple instances of this already being used and supported even though not officially documented.

What about other fields such as src and description? I assume the same presumption applies?

Contributor

> I'm fairly confident we could find multiple instances of this already being used and supported even though not officially documented (...)

The fact that various instances of this have been implemented doesn't mean we can break the standard.

The standard says the URI is of type bounded_bytes, and changing that breaks the standard.

For example, the reason Nick opened this PR in the first place is that Blockfrost honors this CIP (while other explorers do not). We are okay to implement this as CIP68v3.

Also modify Line # 436 to match the proposed new format for URI
Collaborator

@Ryun1 Ryun1 left a comment

I echo the concerns of @mmahut.
This could break existing tooling, so this change should be included in a new version or CIP.

I feel strongly we (@Crypto2099 @rphair) have to be strict with this.
Even if some tools may have been supporting this, that does not guarantee that all tools do.
So introducing a change like this (without version bump), would likely break some implementations.

@rphair
Collaborator

rphair commented May 7, 2024

@mmahut #809 (comment): We are okay to implement this as CIP68v3.
@Ryun1 #809 (review): So introducing a change like this (without version bump), would likely break some implementations.

I agree the version bump would be essential. I've put this issue up for discussion at end of our next meeting (a week from today: https://hackmd.io/@cip-editors/88) so hopefully can poll some more implementors about what they've been doing & how they would respond to a version bump.

@rphair rphair added the Category: Tokens Proposals belonging to the 'Tokens' category. label May 7, 2024
@cardano-dev4
Author

> I echo the concerns of @mmahut This could break existing tooling, this change should be included in a version/CIP
>
> I feel strongly we (@Crypto2099 @rphair) have to be strict with this. Even if some tools may have been supporting this, that does not guarantee that all tools do. So introducing a change like this (without version bump), would likely break some implementations.

Bumping the version for this makes sense to me; I'm fine with that.

@cardano-dev4 cardano-dev4 requested a review from Ryun1 May 8, 2024 00:43
Collaborator

@Crypto2099 Crypto2099 left a comment

Changes to version number have been made. Seems an otherwise straightforward modification to the standard.

Collaborator

@rphair rphair left a comment

Looks good to me after latest commits assigning the proposed changes to a Version 3. cc @mmahut

@slowbackspace
slowbackspace commented May 8, 2024

Shouldn't the same change also be made to description field since CIP-25 also allows arrays there, or in this case, long strings are not supported by CIP68? 🤔

@SmaugPool
Contributor

SmaugPool commented May 14, 2024

Note that SpaceBudz v2, one of the very first CIP-68 projects (with Matrix Berry) from CIP-68 co-author was already using bytes(*) and IPFS v2 URLs.

See for example asset1psxunemft82gkj0xa0ahu4z49w9j829pfvc9gz datum:

d8799fa5446e616d654e5370616365427564202333383330467472616974739f4a4368657374706c617465485265766f6c766572ff447479706546506172726f7445696d6167655f5840697066733a2f2f6261666b7265696262666b357272346f796d6f76623361336e6f616a767a767335796b6276746a613737366e6a663734657161726c77336564427934ff467368613235365820212abb18f1d863aa1d836d70135cd65dc28359a41fff9a92ff848022bb6c83c701d87980ff
$ echo -n d8799fa5446e616d654e5370616365427564202333383330467472616974739f4a4368657374706c617465485265766f6c766572ff447479706546506172726f7445696d6167655f5840697066733a2f2f6261666b7265696262666b357272346f796d6f76623361336e6f616a767a767335796b6276746a613737366e6a663734657161726c77336564427934ff467368613235365820212abb18f1d863aa1d836d70135cd65dc28359a41fff9a92ff848022bb6c83c701d87980ff | cbor-diag --to annotated
d8 79                                           # tag(121)
   9f                                           #   array(*)
      a5                                        #     map(5)
         44                                     #       bytes(4)
            6e616d65                            #         "name"
         4e                                     #       bytes(14)
            5370616365427564202333383330        #         "SpaceBud #3830"
         46                                     #       bytes(6)
            747261697473                        #         "traits"
         9f                                     #       array(*)
            4a                                  #         bytes(10)
               4368657374706c617465             #           "Chestplate"
            48                                  #         bytes(8)
               5265766f6c766572                 #           "Revolver"
            ff                                  #         break
         44                                     #       bytes(4)
            74797065                            #         "type"
         46                                     #       bytes(6)
            506172726f74                        #         "Parrot"
         45                                     #       bytes(5)
            696d616765                          #         "image"
         5f                                     #       bytes(*)
            58 40                               #         bytes(64)
               697066733a2f2f6261666b7265696262 #           "ipfs://bafkreibb"
               666b357272346f796d6f76623361336e #           "fk5rr4oymovb3a3n"
               6f616a767a767335796b6276746a6137 #           "oajvzvs5ykbvtja7"
               37366e6a663734657161726c77336564 #           "76njf74eqarlw3ed"
            42                                  #         bytes(2)
               7934                             #           "y4"
            ff                                  #         break
         46                                     #       bytes(6)
            736861323536                        #         "sha256"
         58 20                                  #       bytes(32)
            212abb18f1d863aa1d836d70135cd65d    #         "!*\xbb\x18\xf1\xd8c\xaa\x1d\x83mp\x13\\\xd6]"
            c28359a41fff9a92ff848022bb6c83c7    #         "\xc2\x83Y\xa4\x1f\xff\x9a\x92\xff\x84\x80\"\xbbl\x83\xc7"
      01                                        #     unsigned(1)
      d8 79                                     #     tag(121)
         80                                     #       array(0)
      ff                                        #     break

It's therefore very likely that most tools parsing CIP68 tokens already support at least bytes(*) and that it was @alessandrokonrad's intention to support it from the beginning.

Also, changing the version to support bytes(*) would mean that SpaceBudz v2 is not compliant with the specification.
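
For readers without cbor-diag at hand, the following Python sketch walks the same datum with a deliberately small hand-written decoder. It handles only the CBOR features this datum uses (unsigned integers, byte strings, arrays, maps, tags, and the indefinite-length forms of byte strings and arrays), so it is an illustration under those assumptions rather than a general CBOR library; the `decode` function and its return shape are hypothetical.

```python
from typing import Any, Tuple


def decode(data: bytes, pos: int = 0) -> Tuple[Any, int]:
    """Decode one CBOR item at `pos`; return (value, position after the item)."""
    major, info = data[pos] >> 5, data[pos] & 0x1F
    pos += 1
    if info == 31:  # indefinite length: read items until the 0xFF break
        if major not in (2, 4):
            raise ValueError("indefinite length only handled for bytes and arrays")
        items = []
        while data[pos] != 0xFF:
            item, pos = decode(data, pos)
            items.append(item)
        pos += 1  # skip the break byte
        # bytes(*) chunks are concatenated; array(*) elements stay a list
        return (b"".join(items) if major == 2 else items), pos
    if info < 24:
        arg = info
    elif info <= 27:
        width = 1 << (info - 24)
        arg = int.from_bytes(data[pos:pos + width], "big")
        pos += width
    else:
        raise ValueError("reserved additional info")
    if major == 0:  # unsigned integer
        return arg, pos
    if major == 2:  # definite-length byte string
        return data[pos:pos + arg], pos + arg
    if major == 4:  # definite-length array
        items = []
        for _ in range(arg):
            item, pos = decode(data, pos)
            items.append(item)
        return items, pos
    if major == 5:  # definite-length map
        result = {}
        for _ in range(arg):
            key, pos = decode(data, pos)
            value, pos = decode(data, pos)
            result[key] = value
        return result, pos
    if major == 6:  # tag: returned as (tag number, tagged value)
        inner, pos = decode(data, pos)
        return (arg, inner), pos
    raise ValueError(f"major type {major} not handled in this sketch")


# Datum hex copied from the comment above.
DATUM_HEX = "d8799fa5446e616d654e5370616365427564202333383330467472616974739f4a4368657374706c617465485265766f6c766572ff447479706546506172726f7445696d6167655f5840697066733a2f2f6261666b7265696262666b357272346f796d6f76623361336e6f616a767a767335796b6276746a613737366e6a663734657161726c77336564427934ff467368613235365820212abb18f1d863aa1d836d70135cd65dc28359a41fff9a92ff848022bb6c83c701d87980ff"

(tag, fields), end = decode(bytes.fromhex(DATUM_HEX))
metadata = fields[0]
image = metadata[b"image"].decode("utf-8")
print(tag, len(image), image)
```

Because the decoder concatenates bytes(*) chunks, the image field comes back as one 66-byte URI without any array handling on the consumer side.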

@mmahut
Contributor

mmahut commented May 15, 2024

Correct me if I'm wrong, but I think bytes() and array() are two different concepts.

@alessandrokonrad
Contributor

> Note that SpaceBudz v2, one of the very first CIP-68 projects (with Matrix Berry) from CIP-68 co-author was already using bytes(*) and IPFS v2 URLs.

@SmaugPool this is correct. Definition of bounded_bytes is here btw: https://github.com/IntersectMBO/cardano-ledger/blob/a71a029e7a04bf6badbab69558f4376f05a9261b/eras/babbage/impl/cddl-files/extras.cddl#L38-L47

@SmaugPool
Contributor

SmaugPool commented May 15, 2024

> Correct me if I'm wrong, but I think bytes() and array() are two different concepts.

Indeed, and as bytes(*) already allows strings above 64 chars, I'm not sure what the benefit would be of also allowing arrays in a new version. I think users would no longer know which one to use.

My understanding of bounded_bytes definition linked by Alessandro is that it already includes bytes(*), so maybe we could just improve the documentation instead to explain explicitly that bytes(*) can be used for bounded_bytes strings > 64 chars.

@Crypto2099
Collaborator

> My understanding of bounded_bytes definition linked by Alessandro is that it already includes bytes(*), so maybe we could just improve the documentation instead to explain explicitly that bytes(*) can be used for bounded_bytes strings > 64 chars.

I think this is not necessarily explicit because a lot of this is wrapped up in this comment block from the Ledger spec mentioned by Ales:

bounded_bytes = bytes .size (0..64)
; the real bounded_bytes does not have this limit. it instead has a different
; limit which cannot be expressed in CDDL.
; The limit is as follows:
; - bytes with a definite-length encoding are limited to size 0..64
; - for bytes with an indefinite-length CBOR encoding, each chunk is
; limited to size 0..64
; ( reminder: in CBOR, the indefinite-length encoding of bytestrings
; consists of a token #2.31 followed by a sequence of definite-length
; encoded bytestrings and a stop code )
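
Since this rule cannot be written in CDDL, it may help to see it as code. The following Python sketch checks raw CBOR bytes against the two bullet points in the comment above. It illustrates the rule as worded there and is not the ledger's actual implementation; `is_valid_bounded_bytes` and its helper are hypothetical names.

```python
from typing import Tuple

LIMIT = 64  # size limit per definite-length byte string or per chunk


def _read_definite_bytes(data: bytes, pos: int) -> Tuple[int, int]:
    """Return (length, position after the item) for a definite-length byte string at pos."""
    major, info = data[pos] >> 5, data[pos] & 0x1F
    if major != 2 or info > 27:
        raise ValueError("not a definite-length byte string")
    pos += 1
    if info < 24:
        length = info
    else:
        width = 1 << (info - 24)
        length = int.from_bytes(data[pos:pos + width], "big")
        pos += width
    if pos + length > len(data):
        raise ValueError("truncated byte string")
    return length, pos + length


def is_valid_bounded_bytes(data: bytes) -> bool:
    """Check one encoded item: definite bytes <= 64, or bytes(*) with every chunk <= 64."""
    try:
        if data[0] == 0x5F:  # token #2.31: indefinite-length byte string
            pos = 1
            while data[pos] != 0xFF:  # stop code ends the chunk sequence
                length, pos = _read_definite_bytes(data, pos)
                if length > LIMIT:
                    return False
            return pos + 1 == len(data)
        length, end = _read_definite_bytes(data, 0)
        return length <= LIMIT and end == len(data)
    except (ValueError, IndexError):
        return False


print(is_valid_bounded_bytes(b"\x58\x40" + b"a" * 64))                    # definite, 64 bytes
print(is_valid_bounded_bytes(b"\x58\x42" + b"a" * 66))                    # definite, 66 bytes
print(is_valid_bounded_bytes(b"\x5f\x58\x40" + b"a" * 64 + b"\x42ab\xff"))  # bytes(*), 64 + 2
print(is_valid_bounded_bytes(b"\x9f\x41a\xff"))                           # array(*), not bytes
```

Under this reading, a 66-byte URI is rejected as a single definite-length string but accepted as bytes(*) with a 64-byte and a 2-byte chunk, while an array of byte strings is a different type altogether.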

It may be beneficial to these standards to specify any/all points that are currently defined as bounded_bytes as a new unbounded_bytes structure:

unbounded_bytes = #2.31(* bounded_bytes)

Or, if the intent is to clarify the existing standard to explain/show that these user-generated arrays are not necessary or beneficial, perhaps we should simply change the definitions to be: bytes(*) rather than bounded_bytes which is apparently where the confusion is being caused.

Collaborator

@Ryun1 Ryun1 left a comment

Version has been incremented, good to see this merged.

@rphair rphair changed the title CIP-68 Images should allow [* bounded_bytes] CIP-0068 | Allow [* bounded_bytes] for images May 28, 2024
@rphair
Collaborator

rphair commented Jun 2, 2024

Keeping in mind @Crypto2099 #809 (comment) about clarifications in the future, my understanding of the above is that the version bump resolves any potential difficulties & this mainly hasn't been merged due to time running out at CIP meetings.

@rphair rphair merged commit aef3538 into cardano-foundation:master Jun 2, 2024
@SmaugPool
Contributor

SmaugPool commented Jun 5, 2024

I don't think the merged change makes any clearer that bytes(*) and therefore long strings are already allowed without a version bump, and I don't understand why adding [ * bounded_bytes ] as another way to have long strings is needed or helpful.

I find this change and version bump confusing overall.
