Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rethink order of operations (multiplication/division) #76

Open
printfn opened this issue May 20, 2021 · 8 comments
Open

Rethink order of operations (multiplication/division) #76

printfn opened this issue May 20, 2021 · 8 comments
Labels
bug Something isn't working

Comments

@printfn
Copy link
Owner

printfn commented May 20, 2021

Consider the three operators:

  • multiplication *
  • division /
  • function application/implicit multiplication/implicit addition (used in sin pi, 3 kg, (2)(4) or 1 1/2).

Current these are all parsed mostly left-to-right, with mostly the same precedence (and a bunch of special cases).

These calculations are correct and should continue working as they do now:

  • 1 3/8 inches
  • 1/2 kg
  • 3pi/2 radians to degrees
  • -sin (pi/2)
  • 5 feet 2 inches

These examples don't do what the user might expect:

  • 1 meter / 5 seconds (parsed as (1 meter / 5) * seconds)
  • sin pi/2 (parsed as (sin pi)/2)

This calculation can't be parsed at all:

  • 3sin 5 (should be equivalent to 3 * (sin 5))

It would be nice to come up with a consistent solution for all these expressions, preferably not involving whitespace.

@printfn
Copy link
Owner Author

printfn commented May 20, 2021

Also note #62

@probablykasper
Copy link
Contributor

Allowing sin pi/2 and sin pi / 2 feels too ambiguous imo

@yurivict
Copy link

Number followed by the unit name should have the highest precedence, followed by *, /, and then +, - operations.

@mattfbacon
Copy link

@yurivict That would not work with 3/8 inches; it would cause it to be parsed as 3/(8 inches). I think, though, that if we see 3 mm / 4 seconds, this should cause the behavior you described, i.e., check for a unit immediately before the binary operator, and if it's present, then group units more tightly than the binary operator for whatever comes after the binary operator.

@mattfbacon
Copy link

The previous approach mostly works well, except it breaks for 3i/2 m, giving m^-1 instead of m because it is grouped as 3i/(2 m) by the rule checking for an Apply before the binop.

I don't think there's any way to resolve this elegantly within the parser, because if we replace i with an actual unit like inches (i.e., 3inches/2 m), then the parser grouping as (3 inches) / (2 m) is actually correct.

I don't know if this means we need to special case certain identifiers like i. Regardless we will need to modify the result formatter to parenthesize the amount in these cases.

I'm glad that the unit tests check the output of the evaluation again, otherwise I wouldn't have caught this!

@mlcui-corp
Copy link
Contributor

Looking into this, I think there are (at least) two reasonable approaches to address this:

  • Make parsing/lexing whitespace sensitive. With this approach, the user can express intent using the query itself:
    • 3i/2 m would be interpreted as (3 i / 2) m.
    • 3inches/2 m would be interpreted as (3 inches / 2) m (length^2)
    • 3/2 m would be interpreted as (3 / 2) m.
    • 3 / 2m would be interpreted as 3 / (2 m).
    • 3 / 2 m, 3 inches / 2 m and 3 i / 2 m are still ambiguous. Most people would assume that they parse to (3 / 2) m, (3 inches) / (2 m) and (3 i / 2) m respectively, but there's no way of differentiating this during parsing.
  • Make parsing ident sensitive. That is, differentiate whether idents are functions, unitless constants, units, etc. before parsing tokens into an AST. This is similar to @yurivict's proposal.
    • 3 i / 2 m would be interpreted as (3 i / 2) m.
    • 3 inches / 2 m would be interpreted as (3 inches) / (2 m) (unitless).
    • 3 / 2 m would be interpreted as (3 / 2) m.
    • 1 m 3 sin 5 cm could be parsed as (1 m) + (3 sin 5) cm.

I think the latter would make the most sense as it addresses the ambiguous cases, but it would slightly complicate parsing with an extra step.

@mattfbacon
Copy link

Yeah I think that's a good idea and shouldn't be that hard since the environment should be available at that stage.

@probablykasper
Copy link
Contributor

probablykasper commented Oct 24, 2024

I think the most common/expected way to go about this is to prioritize implicit multiplication. Then you have some specific special cases when necessary

  • 1/2 kg = 1/(2 kg)
  • 3pi/2 radians to degrees = (3pi)/(2 radians to degrees)
  • -sin (pi/2) = -(sin (pi/2))
  • 1 meter / 5 seconds = (1 meter) / (5 seconds)
  • sin pi/2 = (sin pi)/2. You mentioned users might not expect this, but it would be consistent and it's what Google Search does. The alternative would be whitespace sensitivity, but that can be dealt with later
  • 1 3/8 inches = special case (1 3/8) inches.
  • 5 feet 2 inches = special case 5 feet + 2 inches
  • 3sin 5 = special case 3 (sin(5))

To me those three special cases intuitevely feel like special cases too. 1 3/8 is a special way of writing a number, so I would not expect 1*2 3/8*2. Restricting these special cases to only allowing the basic simple syntax prevents complex/ambiguous situations where a user gets an answer that doesn't line up with their expectations

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

5 participants