Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support expanding some functions #125

Merged
merged 13 commits into from
Nov 21, 2022
Merged

Support expanding some functions #125

merged 13 commits into from
Nov 21, 2022

Conversation

ZibingZhang
Copy link
Contributor

@ZibingZhang ZibingZhang commented Nov 19, 2022

Overview

Adds a NodeTransformer that expands functions.

Details

  • Adds a expand_functions keyword argument onto get_latex which allows the user to specify the names of the methods they want to have expanded.
  • Introduces a class FunctionExpander(ast.NodeExpander) which replaces ast.Call with the equivalent expanded ast.AST.
  • Only expanded method to be supported is hypot(x, y) -> \sqrt{x^2 + y^2}.
  • Supports nested hypot calls, e.g. hypot(hypot(a, b), x, y).

Thoughts

As discussed in the issue, there may be multiple ways to expand a function. We could expose various expansion methods, and accept them in the form expanded_functions={"method1": Method1Expander1, "method2": None} where None defaults to some predetermined expansion. This is a bit gross, but we could probably make this a bit nicer with the config refactor. This would also allow users to define their own expansion methods.

There isn't a way to prevent a stack overflow right now due to f(x) -> g(x) and g(x) -> f(x). My initial thought would be to add a function_expansion_depth=1 as a keyword argument, and it would be implemented like so:

  • Initialize current_depth to 0 for each instance of FunctionExpander.
  • In visit_Call:
    • If current_depth > function_expansion_depth, then return early.
    • If any expansions are applied, increment current_depth by 1 (I'd err towards forcing users to specify if they want to try for deeper level of replacements).
      • After mapping over child nodes decrement current_depth by 1.

References

#65

Blocked by

Potentially blocked by #103.

We could merge this in before, and then figure out a config format in the other PR, or vice versa.

@ZibingZhang ZibingZhang requested a review from odashi as a code owner November 19, 2022 23:30
@google-cla
Copy link

google-cla bot commented Nov 19, 2022

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

Copy link
Collaborator

@odashi odashi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very good, and sorry for putting many comments (most of them are trivial I think)

Comment on lines +300 to +301
r"\sqrt{a^{{2}} + b^{{2}}}^{{2}} + "
r"x^{{2}} + y^{{2}}}"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Maybe it'd be good to apply some symbolic optimizations since nested hypot can be removed: x 2 + y 2 2 + z 2 = x 2 + y 2 + z 2 .

This is beyond the scope of this pull request. We may need to make another issue to discuss about it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would imagine that this would be a separate transformer that takes place before function expansion

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, and it would become a more complex optimizer. We could borrow some knowledge from other symbolic math tools as well.

@odashi
Copy link
Collaborator

odashi commented Nov 20, 2022

There isn't a way to prevent a stack overflow right now due to f(x) -> g(x) and g(x) -> f(x)

Nice insight. For now we can introduce functions that don't employ recursive definitions, but we need to keep in mind that this could become a problem in the future. It is better to add some TODOs to function_expander.py.

@ZibingZhang ZibingZhang requested a review from odashi November 20, 2022 10:46

args_reduced = functools.reduce(lambda a, b: ast.BinOp(a, ast.Add(), b), args)
return ast.Call(
func=ast.Name(id="sqrt", ctx=Load()),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe in constants.py, BUILTIN_FUNCS should be dict[enum, tuple[str, str]]?

Basically what's happening here is that we're replacing hypot with sqrt, while also relying on the fact sqrt gets further preprocessed by getting replaced with (r"\sqrt{", "}").

Having an enum rather than a bunch of strings might make the code a bit more readable, as well as helping to prevent errors with typos. In the future with more function expansions and different transformers, there might be a lot of cases where we care about specific, commonly-used functions, such as sqrt.

If you agree with changing it to an enum I'll tackle that in another issue / PR.

Copy link
Collaborator

@odashi odashi Nov 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, do you mean defining an Enum class something like:

class BuiltinFnName(str, enum.Enum):
    ...
    SQRT = "sqrt"

and use it in:

BUILTIN_FUNCS: dict[BuiltinFnName, tuple[str, str]] = {
    ...
    BuiltinFnName.SQRT: (r"\sqrt{", "}"),
    ...
}

? If I'm wrong please fix me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

something like that yeah

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks nicer than the current impl. I guess we finally need to adopt more sophisticated way to represent functions, but for now the idea above looks a good choice.

Copy link
Collaborator

@odashi odashi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for bothering again! Overall I like this change, but there are some comments before merging.

A general suggestion: if we instantiate AST nodes, it is preferred to explicitly specify parameter names for every argument. AST structure is sometimes updated, and without parameter names it may lack version compatibility (not only currently supported versions, but also versions in the future, in which we couldn't expect what kind of changes will be introduced)

Comment on lines 154 to 162
(ast.Call(ast.Name("hypot"), ast.Constant("foo")), "hypot"),
(
ast.Call(
ast.Attribute(ast.Name("math"), "hypot", ast.Load()),
ast.Constant("foo"),
),
"hypot",
),
(ast.Call(ast.Constant(123), ast.Constant("foo")), None),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • These Calls are illegal because the args parameter should take a list of expr.
  • The func parameter could take any expr, but in this case the constant 123 is never populated here in the actual AST. I guess some other example is preferred, e.g., ast.Call(func=ast.Name("foo"), args=[]), which represents foo()(...)

@ZibingZhang
Copy link
Contributor Author

Sorry for bothering again! Overall I like this change, but there are some comments before merging.

No worries at all! Thanks for taking the time to review :) I think this is a really cool project.

I'll take a look at this sometime later today / tomorrow. Do you have any recs for another issue for me to look at after this one? It seems like quite a few of them are blocked my issues that have existing PRs, but I'd love to work on something else if I've got free time this week

@odashi
Copy link
Collaborator

odashi commented Nov 21, 2022

@ZibingZhang I think this PR is almost independent from others except #103 which is an essential change. If you could take your time for that PR it's welcome.

@odashi
Copy link
Collaborator

odashi commented Nov 21, 2022

I also think adding supports for other functions in math and numpy can be processed independently (but blocked by this issue #125).

@ZibingZhang
Copy link
Contributor Author

A general suggestion: if we instantiate AST nodes, it is preferred to explicitly specify parameter names for every argument.

Good point, will keep this in mind moving forward.

I also think adding supports for other functions in math and numpy can be processed independently (but blocked by this issue #125).

That's a good point. I'll start working through that backlog.

@ZibingZhang ZibingZhang requested a review from odashi November 21, 2022 09:01
Copy link
Collaborator

@odashi odashi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really good change, thanks!

@odashi odashi added the feature label Nov 21, 2022
@odashi odashi added this to the v0.3 milestone Nov 21, 2022
@ZibingZhang ZibingZhang deleted the main branch November 22, 2022 14:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants