Benchmark the production code rather than some arbitrary thing #5200
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
Fixed | ||
----- | ||
|
||
- Made the `validation` benchmarks use the actual production evaluator (#5200) |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,19 +3,25 @@ | |
module Common ( | ||
benchWith | ||
, unsafeUnflat | ||
, unsafeEvaluateCekNoEmit' | ||
, getEvalCtx | ||
, evaluateCekLikeInProd | ||
, peelDataArguments | ||
, Term | ||
) where | ||
|
||
import PlutusPrelude | ||
|
||
import PlutusBenchmark.Common (getConfig, getDataDir) | ||
import PlutusBenchmark.NaturalSort | ||
|
||
import PlutusCore qualified as PLC | ||
import PlutusCore.Builtin qualified as PLC | ||
import PlutusCore.Data qualified as PLC | ||
import PlutusCore.Evaluation.Machine.ExBudgetingDefaults qualified as PLC | ||
import PlutusCore.Evaluation.Machine.Exception | ||
import PlutusCore.Evaluation.Result | ||
import PlutusLedgerApi.Common (LedgerPlutusVersion (PlutusV1), evaluateTerm) | ||
import PlutusLedgerApi.Common.Versions (languageIntroducedIn) | ||
import PlutusLedgerApi.V3 (EvaluationContext, ParamName, VerboseMode (..), mkEvaluationContext) | ||
import UntypedPlutusCore qualified as UPLC | ||
import UntypedPlutusCore.Evaluation.Machine.Cek qualified as UPLC | ||
|
||
|
@@ -24,6 +30,8 @@ import Criterion.Main.Options (Mode, parseWith) | |
import Criterion.Types (Config (..)) | ||
import Options.Applicative | ||
|
||
import Control.Monad.Trans.Except | ||
import Control.Monad.Trans.Writer.Strict | ||
import Data.ByteString qualified as BS | ||
import Data.List (isPrefixOf) | ||
import Flat | ||
|
@@ -128,13 +136,32 @@ benchWith act = do | |
env (BS.readFile $ dir </> file) $ \scriptBS -> | ||
bench (dropExtension file) $ act file scriptBS | ||
|
||
unsafeEvaluateCekNoEmit' :: UPLC.Term PLC.NamedDeBruijn PLC.DefaultUni PLC.DefaultFun () -> PLC.EvaluationResult (UPLC.Term PLC.NamedDeBruijn PLC.DefaultUni PLC.DefaultFun ()) | ||
unsafeEvaluateCekNoEmit' = | ||
(\(e, _, _) -> unsafeExtractEvaluationResult e) . | ||
UPLC.runCekDeBruijn | ||
PLC.defaultCekParameters | ||
UPLC.restrictingEnormous | ||
UPLC.noEmitter | ||
getEvalCtx | ||
:: Either | ||
(UPLC.CekEvaluationException UPLC.NamedDeBruijn UPLC.DefaultUni UPLC.DefaultFun) | ||
EvaluationContext | ||
getEvalCtx = do | ||
costParams <- | ||
maybe | ||
@bezirg do we not have a function somewhere for going from
We probably should even just so we can test that they round-trip.
||
(Left evaluationFailure) | ||
(Right . take (length $ enumerate @ParamName) . toList) | ||
PLC.defaultCostModelParams | ||
either (const $ Left evaluationFailure) (Right . fst) . runExcept . runWriterT $ | ||
mkEvaluationContext costParams | ||
{-# NOINLINE getEvalCtx #-} | ||
|
||
-- | Evaluate a term as it would be evaluated using the on-chain evaluator. | ||
evaluateCekLikeInProd | ||
|
||
:: UPLC.Term PLC.NamedDeBruijn PLC.DefaultUni PLC.DefaultFun () | ||
-> Either | ||
(UPLC.CekEvaluationException UPLC.NamedDeBruijn UPLC.DefaultUni UPLC.DefaultFun) | ||
(UPLC.Term UPLC.NamedDeBruijn UPLC.DefaultUni UPLC.DefaultFun ()) | ||
evaluateCekLikeInProd term = do | ||
evalCtx <- getEvalCtx | ||
let (getRes, _, _) = | ||
-- The validation benchmarks were all created from PlutusV1 scripts | ||
evaluateTerm UPLC.restrictingEnormous (languageIntroducedIn PlutusV1) Quiet evalCtx term | ||
getRes | ||
|
||
type Term = UPLC.Term UPLC.DeBruijn UPLC.DefaultUni UPLC.DefaultFun () | ||
|
||
|
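The `getEvalCtx` definition in the hunk above turns a `Maybe` of default parameters into an `Either`, then runs a writer-over-except computation and keeps only the result. The following is a minimal sketch of that shape using toy types. `ToyError`, `ToyCtx`, `mkToyCtx`, `expectedParamCount` and `getToyCtx` are hypothetical names invented for illustration and are not part of the Plutus code base; only the `transformers` functions are real.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Except (Except, runExcept, throwE)
import Control.Monad.Trans.Writer.Strict (WriterT, runWriterT, tell)

-- Hypothetical stand-ins for the evaluation failure and evaluation context.
data ToyError = ToyFailure deriving (Eq, Show)
newtype ToyCtx = ToyCtx [Integer] deriving (Eq, Show)

-- Toy context builder: it may log warnings (Writer) or fail (Except).
mkToyCtx :: [Integer] -> WriterT [String] (Except String) ToyCtx
mkToyCtx params
  | any (< 0) params = lift (throwE "negative cost parameter")
  | otherwise = do
      tell ["built context from " ++ show (length params) ++ " parameters"]
      pure (ToyCtx params)

-- Assumed number of parameters that the toy context expects.
expectedParamCount :: Int
expectedParamCount = 3

-- Same shape as the hunk: Maybe -> Either, then unwrap Writer and Except,
-- dropping the warnings and replacing any error with a fixed failure value.
getToyCtx :: Maybe [Integer] -> Either ToyError ToyCtx
getToyCtx defaults = do
  params <- maybe (Left ToyFailure) (Right . take expectedParamCount) defaults
  either (const (Left ToyFailure)) (Right . fst) . runExcept . runWriterT $
    mkToyCtx params

main :: IO ()
main = do
  print (getToyCtx (Just [1, 2, 3, 4]))
  print (getToyCtx Nothing)
  print (getToyCtx (Just [1, -2, 3]))
```

In this sketch the first call trims the list to three parameters and succeeds, while the other two collapse to the same `ToyFailure` value, which mirrors how the hunk discards the specific error in favour of `evaluationFailure`.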
I think it would be legit to just do `mkDynEvaluationContext <version> PLC.defaultCostModelParams`, since we're trying to not include this in the benchmark run. This is also fine, though.
What "this"?
In any case, I wanted to stay as close to the production evaluator as possible and not do any "reasoning". `getEvalCtx` is a CAF that is forced before benchmarking starts and is used across all benchmarks, hence I don't see how it would distort the results (assuming I've fixed the forcing to go to NF rather than WHNF, but even the latter can't distort the results in a meaningful manner).
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function goes through the dance of computing the integers-only version of the cost model, and then feeding it back into the function that converts it into the original form. That's good, insofar as it mirrors the production version, except that this happens in the shared "compute the evaluation context" block, and so shouldn't be included in the benchmarks anyway. So it's fine to use the faster version that just creates the evaluation context without jumping through the hoops.
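The NF versus WHNF point raised in this thread can be illustrated with a small standalone sketch. It assumes a toy shared value, `sharedCtx`, whose payload is deliberately poisoned with `error` so that we can observe how deep each kind of forcing goes. `sharedCtx` and `survives` are hypothetical names for illustration only; `force`, `evaluate` and `try` are the real functions from `deepseq` and `base`.

```haskell
import Control.DeepSeq (force)
import Control.Exception (ErrorCall, evaluate, try)

-- Hypothetical stand-in for a shared CAF. The last list element is poisoned
-- so that evaluating it is observable.
sharedCtx :: Either String [Integer]
sharedCtx = Right [1, 2, error "payload forced"]
{-# NOINLINE sharedCtx #-}

-- True when the action finished without evaluating the poisoned element.
survives :: IO a -> IO Bool
survives act = do
  r <- try (act >> pure ()) :: IO (Either ErrorCall ())
  pure (either (const False) (const True) r)

main :: IO ()
main = do
  -- evaluate alone stops at the outermost constructor (WHNF).
  whnfOk <- survives (evaluate sharedCtx)
  -- force walks the whole structure (NF), so it hits the poisoned element.
  nfOk <- survives (evaluate (force sharedCtx))
  putStrLn ("WHNF forcing reached the payload: " ++ show (not whnfOk))
  putStrLn ("NF forcing reached the payload: " ++ show (not nfOk))
```

In a benchmark setting the payload would be expensive rather than poisoned: anything left unevaluated after WHNF forcing would be computed during whichever benchmark touches it first, whereas NF forcing pays that cost up front.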