9 changes: 8 additions & 1 deletion pycardano/serialization.py
@@ -176,7 +176,14 @@ def default_encoder(
         # the output bytestring.
         encoder.write(b"\x9f")
         for item in value:
-            encoder.encode(item)
+            if isinstance(item, bytes) and len(item) > 64:
+                encoder.write(b"\x5f")
Collaborator

This is only activated when an item is inside an indefinite list. Do we need to break byte strings that are not part of an indefinite list?

Contributor

AFAIK we need to break all byte strings that are longer than 64 bytes, wherever they appear.
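
A minimal sketch of that rule, written against the CBOR wire format directly rather than against pycardano's encoder. The helper names `cbor_bytes_header` and `encode_bytes_chunked` are hypothetical and exist in neither pycardano nor cbor2; the 64-byte limit is the one stated in this thread.

```python
def cbor_bytes_header(length: int) -> bytes:
    """Return the CBOR header of a definite-length byte string (major type 2)."""
    if length < 24:
        return bytes([0x40 + length])
    if length < 256:
        return bytes([0x58, length])
    raise ValueError("this sketch only handles chunks shorter than 256 bytes")


def encode_bytes_chunked(data: bytes, chunk_size: int = 64) -> bytes:
    """Encode a byte string, splitting it into chunks once it exceeds chunk_size."""
    if len(data) <= chunk_size:
        return cbor_bytes_header(len(data)) + data
    out = bytearray(b"\x5f")  # opens an indefinite-length byte string
    for i in range(0, len(data), chunk_size):
        chunk = data[i : i + chunk_size]
        out += cbor_bytes_header(len(chunk)) + chunk
    out += b"\xff"  # "break" marker that closes the indefinite-length string
    return bytes(out)
```

Because the function does not care where the byte string sits, the same call could serve list items, top-level bytes, and dictionary keys alike.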

Contributor Author

I may have misunderstood, but it seemed to me that this was the best place to put it since all PlutusData are cast to IndefiniteList.

If I pull it out of the IndefiniteList block, will it be handled properly? I guess it should.

Contributor

I guess you (correctly) noticed that all PlutusData fields are part of an indefinite list. However, Plutus data can also contain bytes that are not fields of a PlutusData object (e.g. pure bytes, or bytes that are keys in dictionaries).

Contributor Author

So is the final answer to pull it outside of the IndefiniteList block?

Contributor

In this documentation it seems that yes, we need dummy classes. But not for lists, for bytes! :)

I am also wondering whether there are cases where integers are encoded incorrectly (when they exceed 64 bytes in size), since I implemented a special case for this here: https://github.com/OpShin/uplc/blob/448f634cc1225de6dd7390b670b01396d2e71156/uplc/ast.py#L430
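
For illustration, here is what a chunked encoding of a large integer could look like. This is a sketch under assumptions, not the linked uplc code: it uses the standard CBOR bignum tag 2 for non-negative integers and assumes the same 64-byte chunking applies to the tag's byte-string payload, which is exactly the open question above. `encode_positive_bignum` is a hypothetical name.

```python
def encode_positive_bignum(value: int, chunk_size: int = 64) -> bytes:
    """Encode a non-negative int as a CBOR tag-2 bignum, chunking long payloads.

    Assumption: payloads longer than chunk_size are written as an
    indefinite-length byte string, mirroring the rule for plain bytes.
    """
    if value < 0:
        raise ValueError("this sketch only covers non-negative integers")
    payload = value.to_bytes((value.bit_length() + 7) // 8, "big")

    def item(chunk: bytes) -> bytes:
        # Definite-length byte string; chunks here are at most chunk_size bytes.
        if len(chunk) < 24:
            return bytes([0x40 + len(chunk)]) + chunk
        return bytes([0x58, len(chunk)]) + chunk

    if len(payload) <= chunk_size:
        body = item(payload)
    else:
        chunks = (
            payload[i : i + chunk_size] for i in range(0, len(payload), chunk_size)
        )
        body = b"\x5f" + b"".join(item(c) for c in chunks) + b"\xff"
    return b"\xc2" + body  # 0xc2 is CBOR tag 2 (unsigned bignum)
```

Integers that fit in 64 bits are normally encoded as plain CBOR integers, so a helper like this would only be relevant above that range.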

Contributor Author

I guess I am seeing more and more the intuition behind all the custom classes in OpShin.

I realize it's a bigger lift, but is there any reason why we wouldn't just take OpShin's implementation and pull it over to here? Then, just rely on pycardano rather than duplicating efforts across repos?

I apologize if I'm speaking out of ignorance and there are things I'm not considering, but this seems like it might be the more lasting implementation.

Contributor

No worries at all. The code I wrote for OpShin/UPLC was created after pycardano was written, hence there might be a point in copying it over. Then again, the UPLC implementation is really only catered towards PlutusData, while PyCardano also handles serialization of all other kinds of things - not sure if anything will break.

Long story short: the only reason there are two different implementations is that no one has tried to unify them yet.

Contributor Author

Okay, I would like to have this done sooner rather than later. Can I just create a dummy class for bytes to patch this and open a more general issue about syncing datum handling between OpShin and pycardano?
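
A rough sketch of what such a dummy class might look like, assuming nothing about pycardano internals. `ByteString` and `encode` are hypothetical names, and the toy encoder only supports the wrapper, lists, and dicts with fewer than 24 entries. The point is that a wrapper type lets the encoder apply the chunking in one place, whether the bytes are a list item, a bare value, or a dictionary key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen makes instances hashable, so they work as dict keys
class ByteString:
    """Hypothetical wrapper marking bytes that need 64-byte chunking."""

    value: bytes

    def to_cbor(self) -> bytes:
        def item(chunk: bytes) -> bytes:
            # Definite-length byte string; chunks here are at most 64 bytes.
            if len(chunk) < 24:
                return bytes([0x40 + len(chunk)]) + chunk
            return bytes([0x58, len(chunk)]) + chunk

        if len(self.value) <= 64:
            return item(self.value)
        chunks = (self.value[i : i + 64] for i in range(0, len(self.value), 64))
        return b"\x5f" + b"".join(item(c) for c in chunks) + b"\xff"


def encode(obj) -> bytes:
    """Toy encoder: ByteString, plus list and dict with fewer than 24 entries."""
    if isinstance(obj, ByteString):
        return obj.to_cbor()
    if isinstance(obj, list) and len(obj) < 24:
        return bytes([0x80 + len(obj)]) + b"".join(encode(x) for x in obj)
    if isinstance(obj, dict) and len(obj) < 24:
        body = b"".join(encode(k) + encode(v) for k, v in obj.items())
        return bytes([0xA0 + len(obj)]) + body
    raise TypeError(f"unsupported type in this sketch: {type(obj).__name__}")
```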

Contributor

Yes sounds good to me! Would also prefer to get this resolved over any big open stale PR :)

+                for i in range(0, len(item), 64):
+                    imax = min(i + 64, len(item))
+                    encoder.encode(item[i:imax])
+                encoder.write(b"\xff")
+            else:
+                encoder.encode(item)
         encoder.write(b"\xff")
     elif isinstance(value, RawCBOR):
         encoder.write(value.cbor)
22 changes: 22 additions & 0 deletions test/pycardano/test_plutus.py
@@ -396,3 +396,25 @@ class A(PlutusData):
     assert (
         res == res2
     ), "Same class has different default constructor id in two consecutive runs"
+
+
+def test_plutus_data_long_bytes():
+    @dataclass
+    class A(PlutusData):
+        a: bytes
+
+    quote = (
+        "The line separating good and evil passes ... right through every human heart."
+    )
+
+    quote_hex = (
+        "d866821a51e835649f5f5840546865206c696e652073657061726174696e6720676f6f6420616e"
+        + "64206576696c20706173736573202e2e2e207269676874207468726f7567682065766572794d"
+        + "2068756d616e2068656172742effff"
+    )
+
+    A_tmp = A(quote.encode())
+
+    assert (
+        A_tmp.to_cbor_hex() == quote_hex
+    ), "Long metadata bytestring is encoded incorrectly."
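
To make the expected hex in this test easier to review, the following standalone snippet rebuilds it piece by piece from the quote. It uses only the standard library; the labels are a reading of the bytes according to the CBOR format, and the 4-byte integer is taken to be the constructor id that pycardano derives for class `A` (an assumption, since the test does not state it).

```python
quote = b"The line separating good and evil passes ... right through every human heart."

expected = bytes.fromhex(
    "d866821a51e835649f5f5840546865206c696e652073657061726174696e6720676f6f6420616e"
    "64206576696c20706173736573202e2e2e207269676874207468726f7567682065766572794d"
    "2068756d616e2068656172742effff"
)

# The 77-byte quote is split into one 64-byte chunk and one 13-byte remainder.
first, second = quote[:64], quote[64:]

layout = [
    (bytes.fromhex("d866"), "tag 102"),
    (bytes.fromhex("82"), "array of 2 items"),
    (bytes.fromhex("1a51e83564"), "4-byte unsigned int (assumed constructor id)"),
    (bytes.fromhex("9f"), "start of indefinite-length list"),
    (bytes.fromhex("5f"), "start of indefinite-length byte string"),
    (bytes.fromhex("5840") + first, "64-byte chunk"),
    (bytes.fromhex("4d") + second, "13-byte chunk"),
    (bytes.fromhex("ff"), "end of byte string"),
    (bytes.fromhex("ff"), "end of list"),
]

rebuilt = b"".join(part for part, _ in layout)
assert rebuilt == expected
```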