Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat/fix: revised canonical forms; tests; preliminary & associated fixes #238

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

samueltlg
Copy link

Still more work & tests incoming - which is done but needs verification & review - for CanonicalForms Multiply, Divide & Number (tests). Power form most notably up for now: this being the trickiest on

If you review this set of changes, will make any requested changes along with the next/remaining batch of work: this should be next Wed./Thu. evening.

The individual commit messages are good sources of info. WRT changes.

Have a batch of incoming code-comments to make (will do tomorrow) which will likely clarify, and sidestep confusion, and highlight key-points/requests.

Some outstanding queries:

  • WRT symbols being involved in canonicalization (notably, 'Power'):
    • Only recently came to mind that the holdUntil symbol attribute is not accounted for during the check on operand values (which could potentially be symbols): should this be the case?
      • Yet furthermore, if this is to be checked, i.e. that the 'holdUntil' value of a symbol-definition is never, then would it not be the case that symbols are substituted with values, before canonicalization, either partial or full, takes place? That being said, it would be the case that accounting for symbol values during application of canonical-forms, is unnecessary, since they would be substituted beforehand anyway (and any existing symbols would therefore have a 'holdUntil' value of 'evaluate', 'N', etc...)
    • With the above in mind, in the case of Power-form, for example, should operations such as 1^x, where x is declared as a constant of value 1, ever take place... ?

samueltlg added 14 commits April 7, 2025 01:11
- Fixes the return value of 'sgn' for BoxedNumbers with a value of
  'NaN': as a corollary fixes & now correctly computes return-value of
  getter `BoxedSymbol.isNaN`, too.

- Fix: for consistency, 'isNaN' & 'isInfinity' now trigger definition
  binding (canonicalization), in similar manner to 'isOdd', 'isEven',
  'isInteger'...  (This is appropriate because these may be considered
  direct 'value inquiry' getters, similarly to these sibling properties)

- Adds missing getter 'isFinite': computing the result using 'sgn' like
  its sign-checking siblings 'isPositive', 'isNegative'...

- Fix: several properties/getters now return a boolean in cases where
  'undefined' should be returned (because, the symbol is unbound and
  therefore not enough information is known). ^Notably, this is
  restricted to the aforementioned - 'sgn' referencing - properties
  'is(NaN/Infinity)' et cetera.

- Fix: revise getter 'isInfinity' to now account for 'complex-infinity'

Also includes various extra doc., including some corrections
Before:
 - ce.parse('\\sqrt{x}').root(3) -> '\\sqrt[5]{x}'
Now
 - '\\sqrt[6]{x}
(Also, re-declares getter 'value' for BoxedNumbers, for purposes of
refining its return-type)
'numberForm' now longer simply requests the 'canonical' version/property
of boxed-numbers: these notably now being obsolete on account of
BoxedNumber instances always now being being required to be constructed
canonically.

Instead, the 'structural' number-cast operations present in
'boxFunction' - such as casting 'Rational' as a number where appropriate
- are co-present in this function: albeit applicable more
narrowly to 'BoxedExpression' (only).

Prior to this change, for a while now, 'numberForm' has therefore been
redundant.

Notably also, 'boxFunction' now applies these operations during full
canonicalization only, and not when just the 'Number' canonical-form is
requested.
- Fixes sign comp. functions 'is(Non)Positive/Negative' to account for
  wider range of 'Sign' (type) values: and to also *consistently return
  'undefined'* when comparing against complex-number signs.
  Consequently, corresponding BoxedExpr methods now correctly compute
  the sign for a wider range of values: particularly complex numbers
  (including ComplexInfinity), and in some cases zero.

  ^This fix may also be consider a *breaking* change, in that
  'isPositive/Negative' et cet. will return 'undefined' for boxed
  complex-numbers (i.e. indiciating inapplicability): whereas before
  these would return 'false'.

- Fixes, via broadening range of returned values, 'sgn' for
  BoxedNumbers: this now accounting for all Infinity sign values.
  Naturally, this also affects computation of 'sgn' for symbols with
  values.
  An '...-infinity' sign value is now correctly returned for +/-/~
  Infinity for both BoxedNumbers, and symbols with these values.
… FN's now -> undefined; rectify BoxedFunction.isPure

- WRT repeat evaluation of impure FN's (such as 'Random'):
 -Before: `expr.evaluate/N() -> Result1, Result1, Result1...` (i.e.
 essentially behaving as idempotent).
 -After:  `  ...             -> Result1, Result2, Result3...` (new
 results unrelated to previous)

-`get value` (i.e. 'N().valueOf()') for boxed impure functions now
returns 'undefined'.
 This contrasts from before in both senses of now returning 'undefined'
 instead of computing a value, and also by returning the same,
 first-call cached value on reach request.

- 'isPure' (BoxedFunction) now returns a non-undefined/boolean result
  for _non-canonical_/unbound functions.
Fundamental fixes:
- 'powerForm' now calls 'canonicalPower', instead of 'a.pow(b)'
- 'canonicalPower'
 - No longer casts passed args./exprs. as fully canonical
 - Generally, no longer performs operations wherein an operand, esp. the
   exponent, is a function expression: e.g. '1^(2+3)' will no longer
   simplify (but '1^5' still will do so).
    -^There are reasonable exceptions, e.g. 'a^1' will always simplify
    -^As will 'a^{-1}'
 - Values, including those temporarily assigned from assumptions, are no
   longer looked to from symbols unless these are *non-constant*.
   (Before, inadvertently, this could be the case)
 - More consistently, checks for _constant_ symbol values for
   operations, such as whether 'b' is '1' in 'a^b'
    -**warning**: this may be *incorrect* behaviour: since the
    'holdUntil' attribute is not looked to (and this may be desirable).
    Furthermore, if this is an oversight, then it may be the case that
    symbols with a 'never' holdUntil value will be substituted with
    their values before canonicalization/canonical-forms: therefore
    accounting for symbol values here, or within any other
    canonical-form, may be unnecessary & inappropriate.
 - One or two operation fixes, here & there (forgotten now)
 - More careful type-checking throughout: i.e. particularly
   discriminating between 'real' & 'non-finite' numbers; in some cases
   'complex', too...

Changes:
- Should be, significantly more optimized
- !Currently, does not include the collapsing of multi-level exponents, e.g.
  '{a^b}^{c} -> a^{b*c}'
- Heavier with inversion
- Several instances of accounting for wider ranges of values (mainly for
  exponents): particularly for 'a^Infinity'.
-

Notes:
- Now, canonicalPower only returns a new boxed-function with 'canonical:
  true' when given args. are canonical.
- !Function 'pow' - the first half more or less - is a replication of
  what now takes place within canonicalPower: but is less thorough, and
  may contain a few inaccuracies. Likely, would benefit from replacing
  its first half with a call to 'canonicalPower'; which it anyway
  replicates.
This reverts commit a8f85fa.

This property/getter is largely unnecessary: because 'value' largely is  'constantValue' (the exception is for symbols: in which for non-constants its value may vary arbitrarily, or with assumptions).

Note:
- This *could* be introduced, perhaps exclusively for symbols; or its overall definition could be adjusted slightly such that it is useful for boxed-functions too.
I.e., for a boxed-function, only return a value if 'isConstant' && 'isPure'. ('value' differs in that returns non-undefined if 'isPure', *only*...)
@@ -97,15 +99,41 @@ export function simplify(latex: string): string {
return exprToString(engine.parse(latex)?.simplify());
}

export function checkJson(inExpr: SemiBoxedExpression | null): string {
type ExprVariant =
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up not using this (as implemented in 'checkJson') in the end, due to using a custom solution in canonical-form.test.ts which prints a 'canonical-forms' variant: but conceivably, could see myself as having a use for this.
If not your cup of tea, do do away!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thumbs-up 👍; happy that this to remain.

@@ -437,7 +440,8 @@ export class BoxedSymbol extends _BoxedExpression {
// Widen the type, if it was previously inferred
if (
def instanceof _BoxedSymbolDefinition &&
(def.inferredType || def.type.isUnknown)
(def.inferredType || def.type.isUnknown) &&
!def.isConstant //'constant' symbols may still be inferred (i.e. value given but no type)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another possibility may be raising an exception, if attempt to infer for a constant-symbol ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would avoid throwing an exception, although I would have expected that a constant would have had its type set when declared (the type being "inferred" based on the value at the time if necessary).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I.e., that this added condition should be unnecessary, because if declaring/assigning a symbol as a constant (with a value), inferredType should jointly be set to false ?

@@ -1048,12 +1048,24 @@ export interface BoxedExpression {
*
* Infer the type of this expression.
*
* If the type of this expression is already known, return `false`.
* Effective only for overall BoxedExpression types which are *non-constant* and therefore for
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do please verify, or amend.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for functions, narrows the *(return) type*, may be incorrect

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that's correct

@@ -414,10 +414,13 @@ export class BoxedSymbol extends _BoxedExpression {
}

/**
*
* Subsequent inferences will narrow the domain of the symbol.
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... will narrow: this is outdated now ?

Copy link
Member

@arnog arnog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great PR. Looks good overall. See comments, but the thing that bothers me the most is that some of the sign properties are for complex values are undefined, while they were false before. I'm not sure if that was an intentional change or a side effect.

The other thing, which is more of a heads up, is that the definition of Sign is a bit messy right now, as it includes overlapping concepts, such as the type, whether something is finite or not, etc... In general, I think it would be clearner and simpler if there were fewer "kinds" of sign. There is a pending PR that has those changes, I believe, and I'll have to check on its status and if it could be merged.

* @inheritdoc
*/
get value(): number | boolean | string | object | undefined {
if (!this.isPure) return undefined;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of expr.value is that it was a shortcut for expr.N().valueOf().

Could you elaborate why non-pure functions should return undefined?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, this is a good one.

After having developed the idea of property constantValue, and jointly having experimented around a lot with boxed-expr's. in the debug console, it became apparent that the current behaviour of 'value' is/was confusing: particularly with the former value caching behaviour.

In my mind, the result of value would be less confusing, or more reliable if it remained static (unless the state of the pertinent engine were to change in the meantime, particularly symbol values).

Making immediate repeated requests to an impure function expression's value & receiving different results I experienced to be non-ideal, at minimum. e.g., for a contiguous block:

result = BoxedRandomExpr.value; // 0.1265891823872
result = BoxedRandomExpr.value; // 0.8857823916674

Experienced it to be sensibly the case if expr.N().valueOf() were called, but for requesting a static-like property such as value, felt more natural to leave this to simply return 'undefined' if it were to potentially vary each time (i.e. is an impure function-expression).
This behaviour is also more aligned with 'value' acting like constantValue (now removed).
If this change were to remain, then this should make it the case that all boxed-expr. types reliably return a 'fixed' value, provided that the state of the engine doesn't change.

Also of note, the original doc. of BoxedExpression.value states:
return a JavaScript primitive representing the value of this expression

I guess that result of evaluating an impure expression could be considered as 'representing the value' of an expression, but is at least a bit confusing: with it not being a fixed one.
Personally thought that this change made sense, particularly in a pragmatic sense, when either programmatically or otherwise handling impure function (expressions).

}

// root(root(a, b), c) -> root(a, b*c)
if (this.operator === 'Root') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to handle the case at line 693 as well...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block should stand alone after correcting above, same as before

@@ -839,6 +848,9 @@ export class BoxedFunction extends _BoxedExpression {
}

evaluate(options?: Partial<EvaluateOptions>): BoxedExpression {
// If this function is not pure, then bypass caching (i.e. saved as this._value, this._valueN):
// since the result potentially could differ for each computation)
// if (!this.isPure) return this._computeValue(options)();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that sounds like it would make sense...

@@ -364,12 +380,21 @@ export class BoxedNumber extends _BoxedExpression {

let s: number | undefined;
if (typeof this._value === 'number') {
if (Number.isNaN(this._value)) return 'unsigned';
if (Number.isNaN(this._value)) return 'nan';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an aside, the possible values of Sign should be reduced, in particular removing positive-infinity, nan, real-not-zero, etc... There is a pending PR to clean that up, but it has not been merged yet. It's fine to leave this as is for now, but FYI.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, good to know.

negative (<0): false
nonPositive (<=0): false
nonNegative (>=0): false
positive (>0): undefined
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, I would expect false

negative (<0): false
nonPositive (<=0): false
nonNegative (>=0): false
positive (>0): undefined
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect false as well.

negative (<0): false
nonPositive (<=0): false
nonNegative (>=0): false
positive (>0): undefined
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect false

@@ -97,15 +99,41 @@ export function simplify(latex: string): string {
return exprToString(engine.parse(latex)?.simplify());
}

export function checkJson(inExpr: SemiBoxedExpression | null): string {
type ExprVariant =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@@ -335,6 +335,7 @@ N-mach = [3.00416602394643,7.3890560989306495,54.59815003314423]

exports[`EXP Exp -1 1`] = `
box = ["Exp", -1]
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This revised form consistently performs x^{-1} -> 1/x as per the spec. Reasonable here?

@@ -120,9 +120,9 @@ describe('POWERS/ROOTS', () => {
expect(check('x+(-1)^2')).toMatchInlineSnapshot(`x + (-1)^2`);
});
it('should serialize other powers', () => {
expect(check('x^{0}')).toMatchInlineSnapshot(`1`);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is a serialization case, but prior, since check fully canonicalizes, matched against the canonicalized result x^0 -> 1.
x^0 no longer simplifies because x is unknown.

Pending review of the 'holdUntil' (SymbolDefinition) role during canonicalization, it may eventuate that x^0 -> 1 should maybe never apply during canonicalization, regardless of 'constant' status of 'x'.

expect(check('x^{1}')).toMatchInlineSnapshot(`x`);
expect(check('x^{-1}')).toMatchInlineSnapshot(`x^(-1)`);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar, now consistent, conversion case

@@ -258,7 +258,7 @@ describe('PRECEDENCE', () => {
expect(check('1+2^3')).toMatchInlineSnapshot(`1 + 2^3`);
expect(check('(1+2)^3')).toMatchInlineSnapshot(`(1 + 2)^3`);
expect(check('1^2+3')).toMatchInlineSnapshot(`1 + 3`);
expect(check('1^{2+3}')).toMatchInlineSnapshot(`1`);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think this is reasonable: but up to you if you think otherwise

* Sign `s` is > 0.
*
* :::info[Note]
* Returns `undefined` for cases where the given sign is either non-applicable to real numbers
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This consistently the case now: believe that this is the desired behaviour?
As a consequence, isPositive/Negative & variants (for BoxedNumbers) now return undefined for complex-numbers.

@@ -437,7 +440,8 @@ export class BoxedSymbol extends _BoxedExpression {
// Widen the type, if it was previously inferred
if (
def instanceof _BoxedSymbolDefinition &&
(def.inferredType || def.type.isUnknown)
(def.inferredType || def.type.isUnknown) &&
!def.isConstant //'constant' symbols may still be inferred (i.e. value given but no type)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I.e., that this added condition should be unnecessary, because if declaring/assigning a symbol as a constant (with a value), inferredType should jointly be set to false ?

* Effective only for overall BoxedExpression types which are *non-constant* and therefore for
* which its value, and thereby type, can potentially vary.
*
* For symbols, inference may take place only for undeclared, or previously inferred symbols. For
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, will ensure to amend that.

* definition if this is an _undeclared_ boxed-symbol), and for functions, narrows the *(return)
* type*. The return result will for this case be `true`.
*
* (Note that subsequent inferences can be made and will override previous ones if valid)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes; will update

@@ -364,12 +380,21 @@ export class BoxedNumber extends _BoxedExpression {

let s: number | undefined;
if (typeof this._value === 'number') {
if (Number.isNaN(this._value)) return 'unsigned';
if (Number.isNaN(this._value)) return 'nan';
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, good to know.

negative (<0): false
nonPositive (<=0): false
nonNegative (>=0): false
positive (>0): undefined
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just realised that I forgot to publish the review yesterday; leaving only several isolated comments.

This change was intentional (if you see message for #238 8502b34) , on the basis that:

  • Generally,'undefined' (or null) is returned for boxed-expr. properties in which the property is inapplicable (e.g. sgn & other numeric properties for BoxedString; symbol for non-symbols; various properties, if the containing expr. is non-canonical...)
  • Therefore, since sign-ness does not apply to complex-numbers, at least in the traditional sense, then these properties are falling under category of 'inapplicability' and return undefined.

Indeed, a boolean could be returned in-place of 'undefined', but it appeared intuitive to me that this be returned here. Potentially, could save on some errors: if a complex number returned isPositive: false, and also jointly, isNonPositive: false, could see this being a bit confusing - particularly if the type of number was not checked prior. At least, 'undefined' circumvents cases such as that, and more-so follows prior rule.
Should be an easy reversion to 'undefined' though if you prefer the previous behaviour: will revert back!

@samueltlg
Copy link
Author

This is a great PR. Looks good overall. See comments, but the thing that bothers me the most is that some of the sign properties are for complex values are undefined, while they were false before. I'm not sure if that was an intentional change or a side effect.

The other thing, which is more of a heads up, is that the definition of Sign is a bit messy right now, as it includes overlapping concepts, such as the type, whether something is finite or not, etc... In general, I think it would be clearner and simpler if there were fewer "kinds" of sign. There is a pending PR that has those changes, I believe, and I'll have to check on its status and if it could be merged.

Thanks . Have responded to initial comments; noticed that I didn't publish a review yesterday, meaning that they are bundled with a series of unpublished comments from then, too. Will also now add a small quantity of comments left to be added from yesterday.
Regardless of any upcoming amendments to the set of sgn changes, in general keep these changes as an interim fix? (fixes several, if you see #238 8502b34)

Think that, after resolving any current changes, & also pushing up the test-cases for Number, will add the changes for remaining forms such as 'Multiply','Divide', in a separate request; stop this one becoming bloated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants