Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RPC] f-string-like Cryptol quasiquotation #1307

Merged
merged 10 commits into from
Nov 29, 2021
Merged

[RPC] f-string-like Cryptol quasiquotation #1307

merged 10 commits into from
Nov 29, 2021

Conversation

m-yac
Copy link
Contributor

@m-yac m-yac commented Nov 20, 2021

This PR adds the function cry_f for quasiquoting cryptol using an f-string-like syntax. The example from GaloisInc/saw-script#1188 can now be written as:

a = cry('g x')
b = cry_f('foo {a}')

This PR also adds cry_eval_f and Connection.eval_f which behave the same as composing cry_eval (or Connection.eval) with cry_f. See tests/cryptol/test_quoting.py for more examples.

Under the hood, cry_f uses Python's f-string parser and converts all of the bracketed arguments it is given into Cryptol syntax. This enables it to handle more complex expressions, like:

cry_f('{ {"x": BV(size=8, value=1)} }')
cry_f('{{x = {BV(size=8, value=1)}}}')

both of which produce cry('{x = 0x01}').

In making this PR I realized cryptoltypes.py needs a lot of cleaning up. I pulled out most of the changes I made to that file along the way and will be making a separate PR for that soon.

@RyanGlScott
Copy link
Contributor

Fantastic work, @m-yac! I feel like my ability to review this will be hampered by my limited understanding of the underlying Python tricks, but here are some questions/comments I have after skimming through the patch:

  • If I understand correctly, the main advantage of using cry_f/eval_f over Python's usual string formatting is that the former will automatically parenthesize quasiquoted arguments. If so, where is that bit implemented? Are there other advantages to this approach that I've overlooked?

  • Speaking of which, it seems like combining Python's default string formatting with Cryptol should be discouraged, given the perils. It would be worth calling out this point a bit more loudly in the documentation, I think, as that is a footgun that is easy to overlook. I'm not sure where exactly would be the best place to call this out, but somewhere.

  • Since we are in the business of creating our own quasiquoter, what do you think about picking a different syntax from {...}? The reason I bring this up is because Cryptol already uses braces to denote explicit type applications (e.g., length`{42}), so Python users will have to remember to type length`{{42}} (with double braces) to avoid confusion with the quasiquoting syntax.

    I'm not sure how difficult this would be to implement. If it proves impractical to do so, then we should at least document this as another potential pitfall.

@m-yac
Copy link
Contributor Author

m-yac commented Nov 22, 2021

Thanks @RyanGlScott! These is super helpful questions and pieces of feedback.

  • If I understand correctly, the main advantage of using cry_f/eval_f over Python's usual string formatting is that the former will automatically parenthesize quasiquoted arguments. If so, where is that bit implemented? Are there other advantages to this approach that I've overlooked?

Each of the quoted arguments in a cry_f are passed to to_cryptol_str to convert them to strings, then after all the strings are put together, the result is wrapped in a CryptolLiteral.

So, for your first question, the parentheses are added in the CryptolLiteral case of to_cryptol_str.

For your second question, the other big advantage is that to_cryptol_str will also ensure other python values are represented as valid Cryptol syntax. Specifically dicts, BVs, and strings do not give valid Cryptol syntax when passed to str(), so they have special cases in to_cryptol_str. For example:

>>> f'{ {"x": BV(size=8, value=1)} }' == '{"x": BV(8, 0x01)}' # not valid Cryptol syntax
True
>>> cry_f('{ {"x": BV(size=8, value=1)} }') == cry('{x = 0x01}') # valid Cryptol syntax
True
  • Speaking of which, it seems like combining Python's default string formatting with Cryptol should be discouraged, given the perils. It would be worth calling out this point a bit more loudly in the documentation, I think, as that is a footgun that is easy to overlook. I'm not sure where exactly would be the best place to call this out, but somewhere.

Ah, by this do you mean specifically saying to always using cry_f-strings and never f-strings when formatting Cryptol? In which case I agree. As for where to put it, definitely in the docstring for cry_f, but perhaps also it should also go in the README? If this feature is going to end up getting used a lot it might be worth highlighting it and discussing pitfalls.

  • Since we are in the business of creating our own quasiquoter, what do you think about picking a different syntax from {...}? The reason I bring this up is because Cryptol already uses braces to denote explicit type applications (e.g., length`{42}), so Python users will have to remember to type length`{{42}} (with double braces) to avoid confusion with the quasiquoting syntax.

    I'm not sure how difficult this would be to implement. If it proves impractical to do so, then we should at least document this as another potential pitfall.

Dang it, good point! Unfortunately, since I'm using reflection to make this work, I can't think of an easy way to change what characters delimit f-string arguments. Perhaps we could do a find-and-replace swapping braces and some other pair of characters before the reflection, then swap again after? But what character pair would we use to replace braces? – square brackets and parentheses have even more conflicts in cryptol syntax.

Regardless, I agree that this should be mentioned as a pitfall. If we're assuming our users are already fairly familiar with Python, then they should be used to adding double-braces in the context of f-strings, but I have a feeling that's not a valid assumption.

@RyanGlScott
Copy link
Contributor

So, for your first question, the parentheses are added in the CryptolLiteral case of to_cryptol_str.

Got it, that clarifies things.

Ah, by this do you mean specifically saying to always using cry_f-strings and never f-strings when formatting Cryptol?

Correct.

As for where to put it, definitely in the docstring for cry_f, but perhaps also it should also go in the README? If this feature is going to end up getting used a lot it might be worth highlighting it and discussing pitfalls.

If we had some kind of high-level tutorial for the Cryptol Python bindings, that would be a good place to mention it. From a quick search over the various READMEs in the repo, I'm not sure if such a tutorial exists. In any case, I'd be content with mentioning this in the docstring for cry_f.

Unfortunately, since I'm using reflection to make this work, I can't think of an easy way to change what characters delimit f-string arguments. [...] Regardless, I agree that this should be mentioned as a pitfall.

Alright, I'd be fine with simply mentioning this point in cry_f's docstring.

@m-yac
Copy link
Contributor Author

m-yac commented Nov 24, 2021

Ok @RyanGlScott I incorporated your suggestions into the docstring for cry_f, let me know what you think!

Note that I also improved the way cry_f/cry_eval_f handles conversions (e.g. {...!r}) and format specs (e.g. {...:+.2f}) to ensure that the output is always valid cryptol.

Copy link
Contributor

@RyanGlScott RyanGlScott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantastic stuff. I have a fewer minor comments about documentation, but other than that, this LGTM.

cryptol-remote-api/python/CHANGELOG.md Outdated Show resolved Hide resolved
cryptol-remote-api/python/cryptol/quoting.py Outdated Show resolved Hide resolved
@m-yac
Copy link
Contributor Author

m-yac commented Nov 29, 2021

I realized I should probably not use __str__ directly for all this, so all this latest commit adds is a function __to_cryptol_str__ to CryptolJSON which mirrors __to_cryptol__ (as well as a better type to __to_cryptol__ to help make the difference between the two functions clearer).

@pnwamk
Copy link
Contributor

pnwamk commented Nov 29, 2021

This is awesome -- nice job!

The nasty example I was thinking of being nice to clean up some day is the salsa20 example from saw-script, i.e. lines like these:

res = cryptol("quarterround")([y0, y1, y2, y3])
self.points_to(y0_p, cryptol("(@)")(res, cryptol("0")))

Am I correct in thinking these can be rewritten as follows with these changes?

res = cry_f('quarterround {[y0, y1, y2, y3]}')
self.points_to(y0_p, cry_f('{res}@0'))

@m-yac
Copy link
Contributor Author

m-yac commented Nov 29, 2021

Thanks all! Yes, @pnwamk that's exactly right. Though the SAW bindings will need to define its own versions of cry and cry_f which wrap their respective results in CryptolTerm (which is just CryptolLiteral + SetupVal) to get them to play nice with the rest of the SAW client. I'm working on adding those and updating the SAW tests right now.

@m-yac m-yac merged commit abb41f5 into master Nov 29, 2021
@m-yac m-yac deleted the rpc/quoted-cryptol branch November 29, 2021 22:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants