Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bitwise costing (PLT-8790) #5733

Merged
merged 86 commits into from
Jan 26, 2024
Merged

Bitwise costing (PLT-8790) #5733

merged 86 commits into from
Jan 26, 2024

Conversation

kwxm
Copy link
Contributor

@kwxm kwxm commented Jan 19, 2024

This extends the costing infrastructure to handle the ByteStringToInteger and IntegerToByteString functions added in #5654. The CPU usage of these is quadratic at the moment, and that required quite a few additions. Once we no longer he tavo support GHC 8.10 it's hoped that we can have a linear-time implementation instead: this won't require any further changes to the costing infrastructure since a linear function can be represented by a quadratic function with a zero coefficient (or more likely a very small coefficient) in degree 2. Also, IntegerToByteString will currently fail if the length of the output bytestring would be greater than 8000 (accurate costing becomes difficult beyond this point). We may remove that limit when the implementation is updated, and if we do that will require a new semantic variant.

IntegerToByteString takes a size argument that says how long the output should be, and that required another small extension so that the value of that argument is treated as a literal size for memory costing purposes (ie we don't take the memory usage to be the size of that argument, but the argument itself).

I've added costing benchmarks and run them, and added the resulting data to builtinCostModel.json. The benchmarks may still need a little tuning, but that can be done later.

I also fixed a few small things that I ran into related to plugin integration.

@kwxm
Copy link
Contributor Author

kwxm commented Jan 23, 2024

@michaelpj @effectfully I've now moved the size checks out of Default.Builtins and back into Bitwise.Conversions and added some tests, so you might want to take another look.

@kwxm
Copy link
Contributor Author

kwxm commented Jan 23, 2024

The maximum size for outputs of integerToByteString is 8000, but maybe 8192 (=8K) would be better.

integerToByteString in Plutus Core so that we can continue to support the
current behaviour for old scripts.-}
integerToByteStringMaximumOutputLength :: Integer
integerToByteStringMaximumOutputLength = 8000
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe 8192 would be better.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer that indeed, because 8000 feels entirely arbitrary and 8K somehow feels less arbitrary, but it's again my craziness is talking. 8192 also gives the reader a hint of what this is about.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer that indeed, because 8000 feels entirely arbitrary

Oh well, I suppose so. I'll have to change a lot of the conformance tests now too [goes off grumbling].

Copy link
Contributor

@effectfully effectfully left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great tests.

Comment on lines 66 to 67
-- Int (so it should be at most 2^29, which is the largest bound that GHC
-- guarantees).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Big kudos for that precision. But I think it doesn't apply to us, because we simply refuse to build on a non-64 machine. @michaelpj is that right? I'm talking about #if WORD_SIZE_IN_BITS == 64 in Universe.hs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't think we need to worry about 32bit platforms.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, this is right.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I've removed that and just said it should fit into an Int.

has integerLog2 but only in GHC >= 9.0. We should use the library function
instead when we stop supporting 8.10. -}
integerLog2 :: Integer -> Int
integerLog2 !i = I# (integerLog2# i)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder what's the story behind that bang, but I understand that you just copied that stuff from ghc-bignum and I agree it's the right thing to do to copy it verbatim.

emit . pack $ "integerToByteString: input too long (maximum is 2^"
++ (show (8 * integerToByteStringMaximumOutputLength))
++ "-1)"
emit $ "Length required: " <> (pack . show $ bytesRequiredFor input)
evaluationFailure
| otherwise = let endianness = endiannessArgToByteOrder endiannessArg in
-- We use fromIntegral here, despite advice to the contrary in general when defining builtin
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, whoever wrote this comment, thank you a ton.

@@ -1800,20 +1801,23 @@ instance uni ~ DefaultUni => ToBuiltinMeaning uni DefaultFun where
(runCostingFunOneArgument . paramBlake2b_224)

-- Conversions
{- See Note [Input length limitation for IntegerToByteString] -}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not worth referencing it from here anymore, since that logic is now packed within integerToByteStringWrapper?

toBuiltinMeaning _semvar ByteStringToInteger =
let byteStringToIntegerDenotation :: Bool -> BS.ByteString -> Integer
byteStringToIntegerDenotation = byteStringToIntegerWrapper
{-# INLINE byteStringToIntegerDenotation #-}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

-- | Calculate a 'CostingInteger' for the given 'Integer'.
memoryUsageInteger :: Integer -> CostingInteger
-- integerLog2# is unspecified for 0 (but in practice returns -1)
-- ^ This changed with GHC 9.2: it now returns 0. It's probably safest if we
-- keep this special case for the time being though.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good comment.

Comment on lines 419 to 423
let actualExp = mkIterAppNoAnn (builtin () PLC.IntegerToByteString) [
mkConstant @Bool () endianness,
mkConstant @Integer () 0,
mkConstant @Integer () maxAcceptableInput
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I'd perhaps abstract into a mkIntegerToByteString helper or something. But it's OK.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Good idea.

Comment on lines 337 to 339
-- Bitwise operations
defineBuiltinTerm annMayInline 'Builtins.integerToByteString $ mkBuiltin PLC.IntegerToByteString
defineBuiltinTerm annMayInline 'Builtins.byteStringToInteger $ mkBuiltin PLC.ByteStringToInteger
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need these three lines? The two builtins are handled within the for_ enumerate block below, it should be safe to remove them from here. Please try doing that and let's see if CI agrees with my reasoning.

I wonder if we should have a test catching this situation where the same builtin is defined twice. @michaelpj do you have an opinion?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could just make it throw if you redefine the same name. There's really no reason to do that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need these three lines?

Ooh, well-spotted. Maybe we should have some tests that make sure that you can actually use all of the builtins in Haskell code: it's quite easy to forget to add something here or in plutus-tx (although here we've added something too often).

What happens if you define something twice with different semantics here? Does the later definition override the earlier one?

Comment on lines +46 to +49
failingTests =
[
--- byteStringToInteger
"test-cases/uplc/evaluation/builtin/semantics/byteStringToInteger/big-endian/all-zeros"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It says failingTests, but all-zeros isn't a failing test? I'm confused.

Copy link
Contributor Author

@kwxm kwxm Jan 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It says failingTests, but all-zeros isn't a failing test? I'm confused.

These do fail in the metatheory because in this branch the Agda code doesn't know about the new builtins yet. They'll be added in a PR that'll appear shortly (and which I should have done before this one). In that branch failingTests is empty.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you run the agda-conformance tests you get errors like

          big-endian
            all-zeros: FAIL (expected)
              Agda: unreachable code reached.
              CallStack (from HasCallStack):
                error, called at /blah/blah/blah/plutus-metatheory-0.1.0.0/build/MAlonzo/RTE.hs:44:23 in plutus-metatheory-0.1.0.0-inplace:MAlonzo.RTE (expected failure)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thank you for the explanation.

Comment on lines 1 to 3
-- Check that we can encode zero using the maximum width (8000).
(program 1.0.1
[(builtin integerToByteString) (con bool False) (con integer 8000) (con integer 0)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're going to make it 8192, don't forget to update this test. And I guess plenty of other ones. So maybe it's easier to just keep 8000.

Copy link
Member

@zliu41 zliu41 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work.

@@ -0,0 +1,4 @@
-- A bytestring consisting entirely of zeros decodes to 0.
(program 1.0.1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be 1.1.0? There's no 1.0.1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops. I'll just make it 1.0.0 since that's what all the others have and this doesn't depend on 1.1.0 behaviour.

@@ -603,13 +603,13 @@ bls12_381_finalVerify a b = fromBuiltin (BI.bls12_381_finalVerify a b)

-- | Convert a 'BuiltinInteger' into a 'BuiltinByteString', as described in
-- [CIP-0087](https://github.com/mlabs-haskell/CIPs/tree/koz/to-from-bytestring/CIP-XXXX).
{-# INLINEABLE integerToByteString #-}
{-# INLINABLE integerToByteString #-}
integerToByteString :: Bool -> Integer -> Integer -> BuiltinByteString
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to take an Endianness type instead of Bool, and newtype the first Integer parameter?
At the very least there should be Haddock on the parameters

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point, for the "exposed" version of the builtins we can indeed give a nicer interface again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that'd be quite helpful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I've made the plutus-tx versions take a ByteOrder argument.


-- Make an integer of size n which encodes to 0xFF...FF
allFF :: Int -> Integer
allFF n = 256^(8*n) - 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is 256? Is it not 2^(8*n) - 1?

Copy link
Contributor Author

@kwxm kwxm Jan 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is 256? Is it not 2^(8*n) - 1?

No, that's right. We're looking for size n here, which means 8* n bytes (thanks again to the infuriating fact that we measure sizes in 8-byte words). Since that's a number of bytes, the total number of bits is 8 * 8 * n, and 2^(8 * 8 * n) = 256^(8 * n) is 1000...000 with (64 * n) zero bits. Subtracting 1 from that removes the 1 at the front and changes all the remaining bits to 1, so you've got (64 * n) 1 bits, or (8 * n) 0xFF bytes, which is what you want. Thanks for making me check though: it's very easy to get this kind of thing wrong.

Probably the easiest way to check this is to try n=2 or something.

Comment on lines 66 to 67
-- Int (so it should be at most 2^29, which is the largest bound that GHC
-- guarantees).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't think we need to worry about 32bit platforms.

@kwxm
Copy link
Contributor Author

kwxm commented Jan 26, 2024

Oh, I seem to have accidentally merged the conformance tests with this PR a few days ago when I was trying to do the opposite. That's a bit confusing.

@kwxm
Copy link
Contributor Author

kwxm commented Jan 26, 2024

Right, I'm going to merge this while the going's good.

@kwxm kwxm merged commit e3de827 into master Jan 26, 2024
4 checks passed
@kwxm kwxm deleted the kwxm/bitwise-costing branch January 26, 2024 16:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Builtins Costing Anything relating to costs, fees, gas, etc.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants