Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PLT-8182] CIP-0087 support #5654

Merged
merged 25 commits into from
Jan 2, 2024
Merged

Conversation

kozross
Copy link
Contributor

@kozross kozross commented Nov 27, 2023

Pre-submit checklist:

  • Branch
    • Tests are provided (if possible)
    • Commit sequence broadly makes sense
    • Key commits have useful messages
    • Changelog fragments have been written (if appropriate)
    • Relevant tickets are mentioned in commit messages
    • Formatting, PNG optimization, etc. are updated
  • PR
    • (For external contributions) Corresponding issue exists and is linked in the description
    • Targeting master unless this is a cherry-pick backport
    • Self-reviewed the diff
    • Useful pull request description
    • Reviewer requested

This adds support for the primitive operations described in CIP-0087. We also provide tests demonstrating both the properties, and examples, given for both primitive operations in the CIP-0087 document.

@kozross kozross force-pushed the koz/cip-0087 branch 4 times, most recently from 19630d5 to 15a5ceb Compare November 28, 2023 20:34
@kwxm
Copy link
Contributor

kwxm commented Dec 4, 2023

@kozross Have you used some unusual setting in the tests? If I run cabal test untyped-plutus-core-test inourmain branch it behaves normally and there are some places where I see things like Running 38% with the number being updated in place. When I do the same in this branch I get hundreds of pages of output that look like

unning: 21%
unning: 22%
unning: 22%
unning: 22%
unning: 23%
unning: 24%
unning: 24%
unning: 24%
unning: 26%
unning: 26%
unning: 26%
unning: 27%
unning: 28%
unning: 28%
unning: 28%
unning: 29%
unning: 29%
unning: 30%
unning: 30%
unning: 31%
unning: 32%
unning: 32%
unning: 32%
unning: 34%
unning: 34%
unning: 34%
unning: 35%
unning: 36%

Presumably my terminal's failing to interpret some escape sequences properly, but I'm at a loss as to why that might be happening.

@kozross
Copy link
Contributor Author

kozross commented Dec 5, 2023

None of my settings are unusual, and I certainly didn't see anything strange in my terminal. The only thing I did was add more tests: I didn't change anything about how they would get run.

@kwxm
Copy link
Contributor

kwxm commented Dec 5, 2023

Hmm. If I make the terminal more than 155 columns wide then it behaves normally, but any less than that and I get the weird behaviour I mentioned earlier. The final lines of output (on a wide terminal) look like this:

        ByteString -> Integer
          builtinByteStringToInteger b (builtinIntegerToByteString b d i) = i:                                                                 OK (24.10s)
              ✓ b2i_prop1 passed 8000 tests.
          builtinByteStringToInteger b (consByteString w8 emptyByteString) = w8:                                                               OK (22.97s)
              ✓ b2i_prop2 passed 8000 tests.
          CIP-0087 examples
            builtinByteStringToInteger False emptyByteString ==> failure:                                                                      OK
            builtinByteStringToInteger True emptyByteString ==> failure:                                                                       OK
            builtinByteStringToInteger False (consByteString 0x01 (consByteString 0x1 emptyByteString)) ==> 257:                               OK
            builtinByteStringToInteger True (consByteString 0x01 (consByteString 0x01 emptyByteString)) ==> 257:                               OK (0.01s)
            builtinByteStringToInteger True (consByteString 0x00 (consByteString 0x01 (consByteString 0x01 emptyByteString))) ==> 257:         OK
            builtinByteStringToInteger False (consByteString 0x00 (consByteString 0x01 (consByteString 0x01 emptyByteString))) ==> 65792:      OK
            builtinByteStringToInteger False (consByteString 0x01 (consByteString 0x01 (consByteString 0x00 emptyByteString) ==> 257:          OK
            builtinByteStringToInteger True (consByteString 0x01 (consByteString 0x01 (consByteString 0x00 emptyByteString) ==> 65792:         OK

I'm guessing that Tasty is aligning all of the output to accommodate these long lines at the end and for some reason if my terminal isn't wide enough then the output's incorrect. I think this must be some pecuilarity of the testing library and nothing to do with this PR.

Later: if I run our usual tests and narrow the window so that the output doesn't fit then I get similar results.

@kwxm
Copy link
Contributor

kwxm commented Dec 5, 2023

^ See FAQ 2 here.

Copy link
Contributor

@kwxm kwxm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks pretty mergeable: there might be minor things that could do with a bit of tweaking, but the basic functionality all looks fine. I think we should wait for a review from at least one other person though: it'd be good to have someone looking at this who hasn't seen it before.


-- Conversions

-- | Convert a 'BuiltinInteger' into a 'BuiltinByteString', as described in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For ease of reference it might be worth saying here that False = little endian and True = big endian, since that's quite hard to remember. I suppose there's nothing to stop us having separate builtinIntegerToByteStringLE and builtinIntegerToByteStringBE functions that fill in the endianness parameter for you, but that's probably out of scope for this PR and a user could easily define functions that use their preferred endianness anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to add this in, but the CIP actually spells out precisely what the arguments mean, so I'd just be repeating myself.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but people don't follow links 🤷

plutus-tx/src/PlutusTx/Builtins.hs Outdated Show resolved Hide resolved
@kwxm kwxm requested review from michaelpj and zliu41 December 5, 2023 20:27
Copy link
Contributor

@michaelpj michaelpj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't reviewed the tests yet, but wanted to get something in! Generally things look pretty good.

pure EvaluationFailure
NotEnoughDigits -> do
emit "builtinIntegerToByteString: cannot represent Integer in given number of digits"
emit $ "Input: " <> (pack . show $ input)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

and performance is thus critical, we choose to use this manually-specialized form
for each combination of relevant arguments. While this is repetitive, and thus
also somewhat error-prone, the performance penalty for not doing this is
unacceptable.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I am surprised you can't make this work. I'll try and take a think and see if I have any ideas.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I more-or-less listed all the approaches I took: out of all of them, the manual specialization method turned out to be noticeably better, especially for smaller inputs. This was quite surprising to me as well, but I can't really argue with the benchmark numbers.

free, as there is no data processing to be done: all we need to do is copy
from one place to another, essentially.

This technique only benefits us because counted arrays are cache-friendly: see
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this was borne out in benchmarks and isn't just a priori? (a priori argument is convincing though)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this definitely matters a lot. My original version of both functions didn't loop section at all, and I actually got more than a factor-of-8 speedup when I did the sectioning, as I reduced the number of integer operations by a factor of 8 (and they're linear), as well as the number of copy operations that'd be required (and they're also linear).

This is especially important, as I can't (easily) access the representation of Integers directly while being compatible with GHC 8.10. If and when 8.10 gets jettisoned, we can avoid all of this and just directly copy memory.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it sounds like you've already written the fast implementation, should we include it behind some CPP so it's ready to go later?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed this with @kwxm, and decided against it, as it would mean that different nodes would run the same program with quite different costings. Due to the approach I'm taking here (to be 8.10 compatible), the costing is quadratic (Integer operation linearity forces our hand), but a direct copying method is linear.

Unless you mean just including it in the codebase behind an effective 'do not compile this'-CPP?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry yes, I meant the latter. It can be some dead code that we can use later. It needs the CPP only because it presumably won't compile on 8.10.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It won't compile as-is on 9.2 either. Furthermore, the CIP has changed in a significant-enough way since that the implementation would need rethinking anyway. I don't think there's much reason to include my prior work as-currently-is: might be worth coming back to this question once 8.10 is gone.

| ix <= (limit - 7) =
let digitGroup = read64LE ix
newShift = shift + 64
newIx = ix + 8
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shift is always ix * 8, right? I guess the correspondence isn't so simple in the BE case

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but we adjust it in two different ways, due to loop sectioning.

let digitGroup = read64LE ix
newShift = shift + 64
newIx = ix + 8
in if digitGroup == 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this optimization matter in practice? I wouldn't be shocked if it was even faster to avoid the test and branch and just unconditionally do the shift and addition

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same below

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some benchmarks to verify this. In short, it doesn't really matter much: 10% on microbenchmarks like this is almost indistinguishable from noise, and furthermore, there's no evidence of linear growth of speedup or slowdown, which is what we would expect to see if the branching was costing us.

Just to be absolutely sure, I wrote another benchmark against the best possible case for this optimization: where you have a lot of significant zeroes sandwiched between two significant non-zeroes. However, even there, there's no major conclusive advantage. Given that less code is better than more code, I'll eliminate this branch everywhere it appears.


-- Conversions

-- | Convert a 'BuiltinInteger' into a 'BuiltinByteString', as described in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but people don't follow links 🤷

@michaelpj
Copy link
Contributor

I have an ugly suggestion which I think you will both hate.

At the moment the padding argument is a little awkward:

  1. It can be very big
  2. It's awkward to cost because the cost depends on the value not the size

What if... the padding argument was instead another integer, and the meaning was to pad the result to be at least as wide as that integer. That would I think simplify the problems (it could still be too big, but in a more normal way). But it would be more complicated to implement, and probably bad to use (although the constant-folder should prevent you from paying to compute 2^64).

@kozross
Copy link
Contributor Author

kozross commented Dec 11, 2023

@michaelpj - this is both awkward to implement and awkward to use. On the implementation side, I would now have to tear down two Integers, in parallel, which effectively doubles the work. On the user side, you now have to figure out a constant of exactly the right byte width, plus the semantics are quite confusing.

If our goal is to avoid wraparound, we could just fail out on negative arguments altogether. Large positive Integers truncate, and those would definitely be ruled out by costing. I initially didn't do this (mostly to avoid having yet more error conditions), but the more I look at it, the more it seems to make sense to do this.

@michaelpj
Copy link
Contributor

If our goal is to avoid wraparound, we could just fail out on negative arguments altogether. Large positive Integers truncate, and those would definitely be ruled out by costing.

I'm happy with rejecting negative arguments, and maybe also large positive arguments that don't fit into an Int.

@kozross
Copy link
Contributor Author

kozross commented Dec 11, 2023

If we go with rejecting arguments outside of the non-negative limits of Int, that guarantees us 2^29-1. That's surely big enough for anyone: a ByteString of that size is half a gigabyte, which is well in excess of anything that could possibly fit.

Given this, I feel it's the right thing to do. I will amend the CIP with this in mind, and update the PR accordingly.

Copy link
Contributor

@michaelpj michaelpj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff, I think we should merge this soon. Just one question about a magic number.

| paddingArg > 10240 = do
integerToByteStringWrapper endiannessArg lengthArg input
-- Check that we are within the Int range on the non-negative side.
| lengthArg < 0 || lengthArg >= 536870912 = do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we compute this? I'm not actually sure where this number comes from.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2^29. It's the positive bound on Int guaranteed by GHC.

@drospa drospa changed the title CIP-0087 support [PLT-8182] CIP-0087 support Dec 27, 2023
@drospa
Copy link

drospa commented Dec 29, 2023

@kozross There is a wrong CIP reference. It shall be CIP-0058 . That the one that introduces the BuiltinByteString primitives

@kozross
Copy link
Contributor Author

kozross commented Dec 31, 2023

@drospa - if you read the CIP I reference, you will find that it specifically explains why 0058's solution is not a good solution. Referencing it here makes no sense.

@michaelpj
Copy link
Contributor

:shipit:

@michaelpj michaelpj merged commit f734591 into IntersectMBO:master Jan 2, 2024
5 checks passed
@kwxm
Copy link
Contributor

kwxm commented Jan 2, 2024

@drospa - if you read the CIP I reference, you will find that it specifically explains why 0058's solution is not a good solution. Referencing it here makes no sense.

I think that @drospa is suggesting that the CIP (CIP-0087) mentioned in the PR title is the wrong one. I'm a bit confused about this myself. As far as I can see there is no CIP-0087, and CIP-0058 is the relevant one. Is that correct?

The implementation here does differ from what CIP-0058 says, so we should presumably update that at some point to reflect what's actually the case (and we should seek "official" approval for the CIP as well, but I'm not how to initiate that the process).

@kozross
Copy link
Contributor Author

kozross commented Jan 3, 2024

@kwxm - this is a bit of a sync issue. Originally, 0087 was the number I gave to the CIP PR that this work is based on. I was told later that CIP numbers are assigned post-merge, rather than on-PR: thus, I had to rename it to CIP-XXX temporarily. However, the response to the CIP PR took a while, and therefore, I didn't update the title of this PR.

It shouldn't matter much anyway, as all references to the CIP in the documentation aim at the correct document, though they will need updating once the CIP PR merges.

@kwxm kwxm mentioned this pull request Jan 19, 2024
@kwxm kwxm mentioned this pull request Jun 9, 2024
58 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants